Line 18: | Line 18: |
* [[http://langtech.jrc.it/JRC-Acquis.html|JRC-Acquis Multilingual Parallel Corpus]], | * [[http://langtech.jrc.it/JRC-Acquis.html|JRC-Acquis Multilingual Parallel Corpus]]. |
Line 32: | Line 32: |
* [[http://code.google.com/p/pantera-tagger/|PANTERA]] – a morphosyntactic tagger for Polish. | * [[http://code.google.com/p/pantera-tagger/|PANTERA]] – a morphosyntactic tagger for Polish, * a prototype [[http://nlp.ipipan.waw.pl/~adamp/msc/mastalerz.radomir/CD.tgz|Maximum Entropy tagger]] created within Radomir Mastalerz's [[http://nlp.ipipan.waw.pl/~adamp/msc/mastalerz.radomir/1000-MGR-INF-97543.pdf.gz|MSc]]. |
Language Tools and Resources for Polish
Written corpora and corpus-related tools
National Corpus of Polish (under development),
PICLE corpus – the Polish sub-corpus of the International Corpus of Learner English (ICLE),
Poliqarp – a corpus indexing and search engine,
Anotatornia – a system for multi-level manual annotation of corpora.
Parallel corpora
OPUS – an open-source parallel corpus (European Parliament, EMEA, KDE, movie subtitles),
JRC-Acquis Multilingual Parallel Corpus.
Morphological tools and resources
Morfeusz SGJP – morphological analyser (Z. Saloni, W. Gruszczyński, M. Woliński, R. Wołosz),
Morfologik – morphological analyser (M. Miłkowski),
UAM Text Tools (P. Obrębski, Z. Vetulani; see also http://utt.wmi.amu.edu.pl/trac/wiki/),
Lexical analyser and a Polish proof-reader (S. Galus),
Neurosoft Gram (demo of a morphological analyser),
Finite state utilities (J. Daciuk),
Stemming engine for Polish (D. Weiss),
Stempel, another stemmer (A. Białecki).
Taggers
TaKIPI – a morphosyntactic tagger for Polish,
PANTERA – a morphosyntactic tagger for Polish,
a prototype Maximum Entropy tagger created as part of Radomir Mastalerz's MSc thesis.
Parsers, grammars, treebanks
Świgra – a DCG parser,
Spejd – a shallow parsing and disambiguation system,
Dendrarium – a treebank development system (under development).
Machine-readable dictionaries
plWordNet – the Polish WordNet (M. Piasecki),
Polish OpenThesaurus – a crowdsourced Polish thesaurus (M. Miłkowski),
Słownik języka polskiego (formerly "alternatywny") – Polish ispell dictionaries, along with some definitions and an online display of word forms.
Human-readable dictionaries
Machine translation demonstrations
Translatica (EN-PL-EN),
Bing Translator (multilingual),
Google Translate (multilingual),
InterTran (multilingual),
LingvoBit (EN-PL-EN),
Systran (EN-PL, PL-FR and some other language pairs).