PolEval is a SemEval-inspired evaluation campaign for natural language processing tools for Polish. Submitted tools compete against one another in tasks selected by the organisers, using the available data, and are evaluated according to pre-established procedures.
Below you can find language resources associated with PolEval, which may be useful for other projects.
PolEval 2017 POS Tagging Shared Task
This is a corrected version of the corpus created for the evaluation of Task 1 during the PolEval 2017 competition. It contains 54 906 segments, manually annotated by two qualified linguists. The annotation was conducted in two phases. In the first phase, the raw source text, taken from the Polish Coreference Corpus, was annotated in parallel: a) manually, by two qualified linguists; b) automatically, using the most recent version of the Concraft tagger, trained on the hand-annotated portion of the NCP. In the second phase, the differences between the human annotators and the tagger were identified and cross-corrected by the annotator who had not previously worked on the given part of the text.
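The comparison step of the second phase can be illustrated with a minimal Python sketch. It is not the tooling used by the organisers: the function name `find_disagreements`, the `(form, tag)` pair representation and the sample tags are all assumptions made for illustration, and the sketch further assumes that both annotations share the same segmentation.

```python
from typing import List, Tuple

# Hypothetical representation: one (orthographic form, morphosyntactic tag) pair per segment.
Segment = Tuple[str, str]


def find_disagreements(
    human: List[Segment], tagger: List[Segment]
) -> List[Tuple[int, str, str, str]]:
    """Return (index, form, human_tag, tagger_tag) for every segment whose tags differ.

    Assumes both annotations use identical segmentation; raises ValueError otherwise.
    """
    if len(human) != len(tagger):
        raise ValueError("annotations differ in number of segments")

    disagreements = []
    for index, ((h_form, h_tag), (t_form, t_tag)) in enumerate(zip(human, tagger)):
        if h_form != t_form:
            raise ValueError(f"segmentation mismatch at position {index}")
        if h_tag != t_tag:
            # These are the segments a second annotator would be asked to review.
            disagreements.append((index, h_form, h_tag, t_tag))
    return disagreements


# Illustrative data only; the tags are invented examples, not taken from the corpus.
human_tags = [("Ala", "subst:sg:nom:f"), ("ma", "fin:sg:ter:imperf"), ("kota", "subst:sg:acc:m2")]
tagger_tags = [("Ala", "subst:sg:nom:f"), ("ma", "fin:sg:ter:imperf"), ("kota", "subst:sg:gen:m2")]

for index, form, h_tag, t_tag in find_disagreements(human_tags, tagger_tags):
    print(f"{index}\t{form}\thuman={h_tag}\ttagger={t_tag}")
```

In a workflow like the one described, the resulting list would be handed to the annotator who had not worked on that part of the text, so that each disagreement is resolved by a fresh pair of eyes.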