Attachment 'README.txt'
This is the README file of the Polish XML subcorpus of the PARSEME corpus

Agata Savary, 17 November 2017

The official release in the [[http://typo.uni-konstanz.de/parseme/index.php/2-general/184-parseme-shared-task-format-of-the-final-annotation|parseme-tsv]] format, aligned with morphological and syntactic annotations in [[http://universaldependencies.org/format.html|CoNLL-U]] format, is available via [[http://hdl.handle.net/11372/LRT-2282|LINDAT/CLARIN]] (see README.md of the Polish subcorpus).

This README describes the '''XML version''' of the Polish corpus.

The Polish data stem from
 * the [[http://clip.ipipan.waw.pl/NationalCorpusOfPolish|National Corpus of Polish]] - all texts from daily newspapers are included, i.e. those whose identifiers start with 130-2, 130-3 or 130-5; the texts with identifiers starting with 130-5 were merged into bigger files for easier file management:
   * from 130-5-000000001 to 130-5-000000099 - merged into 130-5-0000000
   * from 130-5-000000100 to 130-5-000000199 - merged into 130-5-0000001
   * from 130-5-000000200 to 130-5-000000299 - merged into 130-5-0000002
   * etc.
   * from 130-5-000001900 to 130-5-000001999 - merged into PL-NKJP-130-5-0000019
   * from 130-5-000001999 to 130-5-000002000 - merged into PL-NKJP-130-5-0000020
 * the [[http://zil.ipipan.waw.pl/PolishCoreferenceCorpus|Polish Coreference Corpus]] - the 21 "long" texts from this corpus are included (36,000 tokens, Rzeczpospolita newspaper)
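
The file selection and merging rules above can be sketched in Python. This is only an illustration, not the script used to build the corpus: the function names are hypothetical, and it assumes that a 130-5 identifier ends in a 9-digit number and that the merged file name is obtained by dropping the last two digits. That assumption matches the listed ranges, except that 130-5-000001999 appears in two of them; the sketch assigns it to 130-5-0000019. Some merged names in the list carry a PL-NKJP- prefix, which the sketch leaves out.

```python
import re

# Identifier prefixes of the daily-newspaper texts, as listed in this README.
NEWSPAPER_PREFIXES = ("130-2", "130-3", "130-5")

# Assumption: a 130-5 identifier is "130-5-" followed by exactly 9 digits.
_ID_130_5 = re.compile(r"^130-5-(\d{9})$")


def is_daily_newspaper(text_id: str) -> bool:
    """Return True if the identifier starts with one of the listed prefixes."""
    return text_id.startswith(NEWSPAPER_PREFIXES)


def merged_file_stem(text_id: str) -> str:
    """Map a 130-5 text identifier to the stem of its merged file.

    Assumption: the stem is the identifier with its last two digits dropped.
    Identifiers that are not 130-5 texts are returned unchanged.
    """
    match = _ID_130_5.match(text_id)
    if match is None:
        return text_id
    return "130-5-" + match.group(1)[:-2]
```

For example, `merged_file_stem("130-5-000000150")` gives `130-5-0000001`, in line with the second range in the list.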

VMWEs have been annotated by a single annotator per file. The following [[http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.0/?page=030_Categories_of_VMWEs|categories]] are used: ID, IReflV, LVC, OTH.

All VMWE annotations were performed by Agata Savary.

The VMWE annotations are distributed under the terms of the [[https://creativecommons.org/licenses/by/4.0/|CC-BY v4]] license.

Contact: agata.savary@univ-tours.fr