The Polish Gazetteer

The Polish Gazetteer is the textual source used within the SProUT platform (http://sprout.dfki.de/) for the automatic pre-annotation of the National Corpus of Polish (NKJP, http://nkjp.pl/index.php?page=0&lang=1) at the level of named entities. Its construction, contents and use have been described in:

SAVARY, A., PISKORSKI, J. (2011). Language Resources for Named Entity Annotation in the National Corpus of Polish, to appear in Control and Cybernetics.
SAVARY, A., PISKORSKI, J. (2010).

Available resources

The file contains 153,477 inflected entries of Polish (and some foreign) proper names and named entity components.

The file DOES NOT contain inhabitant names and relational adjectives derived from the names of Polish settlements. These data, owned by the PWN publisher, were used within the NKJP project under a separate licence and are subject to copyright.

Copyright information

The data is available under the 2-clause BSD licence.