Diff for "LRT"

Differences between revisions 320 and 326 (spanning 6 versions)
Revision 320 as of 2016-12-30 18:36:17
Size: 24903
Comment: add NFJP
Revision 326 as of 2017-01-30 16:21:52
Size: 25837
Comment:
Line 9: Line 9:
== Written corpora and corpus-related tools == == Written corpora of contemporary Polish ==
Line 14: Line 14:
 * [[PL196x|Polish language of the 1960s]],
Line 19: Line 18:
  * Now available also as corpora in the Poliqarp for !DjVu [[http://poliqarp.wbl.klf.uw.edu.pl|search engine]],
 * [[http://poliqarp.sourceforge.net/|Poliqarp]], a corpus indexing and search engine (please see also [[http://nlp.ipipan.waw.pl/Poliqarp/|the beta version of Poliqarp 1.1 with statistical extensions]]),
 * [[http://zil.ipipan.waw.pl/Anotatornia|Anotatornia]], a system for multi-level manual annotation of corpora,
 * [[http://nlp.pwr.wroc.pl/en/tools-and-resources/inforex|Inforex]], a web-based system designed for managing and annotating text corpora on the semantic level,
 * [[http://smyrna.danieljanus.pl/|Smyrna]], a simple, light-weight Polish concordancer,
  * Now available also as corpora in the Poliqarp for !DjVu [[http://poliqarp.wbl.klf.uw.edu.pl|search engine]],
Line 32: Line 27:

== Written corpora of historical Polish ==
 * [[PL196x|Polish language of the 1960s]] (I. Kurcz, A. Lewicki, J. Sambor, J. Woronczak, K. Szafran, J. S. Bień, M. Woliński),
 * [[http://chronopress.clarin-pl.eu/|ChronoPress]], corpus of press texts from 1945–1954 (A. Pawłowski),
 * [[http://www.f19.uw.edu.pl/|Microcorpus of Polish: 1830-1918]] (M. Derwojedowa),
 * [[http://korba.edu.pl|KORBA]], electronic corpus of 17th and 18th century Polish texts (W. Gruszczyński),
 * [[http://www.spxvi.edu.pl/korpus/|Corpus of 16th-century Polish]] (IBL PAN),
 * [[https://www.ijp-pan.krakow.pl/publikacje-elektroniczne/korpus-tekstow-staropolskich|Corpus of old Polish (up to 1500)]] (IJP PAN).


== Corpus-related tools and resources ==
 * [[http://poliqarp.sourceforge.net/|Poliqarp]], a corpus indexing and search engine (please see also [[http://nlp.ipipan.waw.pl/Poliqarp/|the beta version of Poliqarp 1.1 with statistical extensions]]),
 * [[http://zil.ipipan.waw.pl/Anotatornia|Anotatornia]], a system for multi-level manual annotation of corpora,
 * [[http://nlp.pwr.wroc.pl/en/tools-and-resources/inforex|Inforex]], a web-based system designed for managing and annotating text corpora on the semantic level,
 * [[http://smyrna.danieljanus.pl/|Smyrna]], a simple, light-weight Polish concordancer,
Line 63: Line 73:
 * [[http://plwordnet.pwr.wroc.pl/wordnet|plWordNet, Polish WordNet]] (M. Piasecki),  * [[http://plwordnet.pwr.wroc.pl/wordnet|plWordNet, Polish WordNet, Słowosieć]] (M. Piasecki),
Line 83: Line 93:
 * [[http://publications.it.p.lodz.pl/2016/word_embeddings/|Word embeddings for Polish]] (M. Rogalski, P. Szczepaniak).
Line 214: Line 225:
 * [[http://ws.clarin-pl.eu/|Online demos of tools for processing Polish texts]] (CLARIN)  * [[http://ws.clarin-pl.eu/|Online demos of tools for processing Polish texts]] (CLARIN-PL),
 * [[http://psi-toolkit.wmi.amu.edu.pl/index.html|PSI-Toolkit]], a chain of publicly available tools for automatic processing of Polish.
Line 227: Line 239:
 * [[http://psi-toolkit.wmi.amu.edu.pl/index.html|PSI-Toolkit]], a chain of publicly available tools for automatic processing of Polish,
Line 234: Line 245:
 * [[http://ws.clarin-pl.eu/demo/stylo2.html|Stylo 2, stylometry demo]].  * [[http://ws.clarin-pl.eu/demo/stylo2.html|Stylo 2, stylometry demo]],
 * [[http://http://zil.ipipan.waw.pl/TermoPL|TermoPL, multiword expression extraction tool]].

Language Tools and Resources for Polish

This page contains a list of publicly available language tools and resources.

Written corpora of contemporary Polish

Written corpora of historical Polish

Spoken corpora

Parallel corpora and translation memories

Machine-readable dictionaries

Human-readable dictionaries

Morphological tools and resources

Taggers

Parsers, grammars, treebanks

Sentiment analysis

Coreference

Speech analysis and synthesis tools

Machine translation demonstrations

Summarizers

Diacritization

Named Entity Recognition

  • Nerf, a tool for named entity recognition, available under the GNU GPL v3 (J. Waszczuk),

  • Liner2, a named entity recognizer released under the GNU GPL, with models that recognize 5 and 56 categories of proper names (M. Marcińczuk and M. Janicki),

  • TIMEX, a model for Liner2 that recognizes and normalizes temporal expressions (J. Kocoń and M. Marcińczuk).

Aggregating services

Other