The Polish Open Science Metadata Corpus

The Polish Open Science Metadata Corpus (POSMAC) is a collection of 216,214 abstracts of scientific publications compiled in the CURLICAT project. The Polish subset of CURLICAT contains data acquired from the Library of Science, a platform that provides open access to the full texts of articles published in over 900 Polish scientific journals and of selected scientific books, together with extensive bibliographic metadata.



License: CC-BY 4.0


Pęzik P., Mikołajczyk A., Wawrzyński A., Nitoń B., Ogrodniczuk M. Keyword Extraction from Short Texts with a Text-to-text Transfer Transformer (T5) (in preparation).