
Diff for "LRT"

Differences between revisions 307 and 308
Revision 307 as of 2016-02-15 19:59:50
Size: 24386
Comment:
Revision 308 as of 2016-04-21 12:14:59
Size: 24412
Comment:
Line 24, revision 307 (old):
 * [[http://nlp.pwr.wroc.pl/kpwr|KPWr]], Polish Corpus of Wrocław University of Technology, collection of documents available on Creative Common license annotated with syntactic chunks, proper names, semantic relations, anaphora and word senses,

Line 24, revision 308 (new):
 * [[http://nlp.pwr.wroc.pl/narzedzia-i-zasoby/zasoby/kpwr|KPWr]], Polish Corpus of Wrocław University of Technology, collection of documents available on Creative Common license annotated with syntactic chunks, proper names, semantic relations, anaphora and word senses,

Language Tools and Resources for Polish

This page contains a list of publicly available language tools and resources.

Spoken corpora

Parallel corpora and translation memories

Machine-readable dictionaries

Human-readable dictionaries

Morphological tools and resources

Taggers

Parsers, grammars, treebanks

Sentiment analysis

Coreference

Speech analysis and synthesis tools

Machine translation demonstrations

Summarizers

Diacritization

Named Entity Recognition

  • Nerf, a tool for named entity recognition, available under GNU GPL v3,

  • Liner2, a named entity recognizer released under the GNU GPL, with models that recognize 5 and 56 categories of proper names (M. Marcińczuk and M. Janicki).

Aggregating services

Other

  • Mobile plWordNet, a free mobile application for browsing plWordNet (J. Kocoń),

  • Kolokacje, a Web crawler and collocation finder (A. Buczyński),

  • WSDDE, a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki et al.),

  • Frazeo, a search engine and clusterer of news in Polish (P. Pęzik),

  • Segment, a rule-based sentence tokenizer supporting the SRX standard (J. Lipski; the Polish rules are available in the LanguageTool project, see here for short instructions on how to use the tool),

  • Toki, a tokenizer supporting the SRX standard, provided as a C++ library and toolkit (T. Śniatowski and A. Radziszewski),

  • Translatica SRX sentence segmentation rules for Polish (LGPL),

  • SyMGIZA++, an extension of Giza++ that computes symmetric word alignment models,

  • Hipisek, an experimental question answering system (M. Walas),

  • Narzędzia dygitalizacji tekstów (text digitization tools), Poliqarp for DjVu and other programs,

  • PSI-Toolkit, a chain of publicly available tools for automatic processing of Polish,

  • Fextor, a feature extraction framework,

  • LexCSD, a system for semi-automatic sense disambiguation,

  • SuperMatrix, a general tool for lexical semantic knowledge acquisition,

  • WordnetLoom, a wordnet editor application,

  • Toposław, a tool for creating electronic inflectional dictionaries of multi-word units,

  • CorpCor, a web-based tool for correcting morphosyntactic annotation in TEI XML encoded corpora (e.g. NKJP),

  • Stylo 2, a stylometry demo.