Diff for "LRT"

Differences between revisions 138 and 139
Revision 138 as of 2012-02-27 10:26:33
Size: 13092
Comment:
Revision 139 as of 2012-02-27 10:31:58
Size: 12741
Comment:
Line 6: Line 6:
 * [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish]] (NKJP)
  * [[http://nkjp.pl/poliqarp/|Poliqarp search engine for NKJP data]], a search engine for the National Corpus of Polish,
  * [[http://nkjp.uni.lodz.pl/|PELCRA search engine for NKJP data]], a search engine for the National Corpus of Polish,
 * [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish]] (NKJP),
Line 15: Line 13:
 * [[http://ifa.amu.edu.pl/~ifaconc/blog/?page_id=60|PICLE corpus]], the Polish sub-corpus of the [[http://www.fltr.ucl.ac.be/fltr/germ/etan/cecl/Cecl-Projects/Icle/icle.htm|International Corpus of Learner English]] (ICLE),
 * [[http://ifa.amu.edu.pl/~ifaconc/blog/?page_id=60|PICLE corpus]], the Polish sub-corpus of the [[http://www.fltr.ucl.ac.be/fltr/germ/etan/cecl/Cecl-Projects/Icle/icle.htm|International Corpus of Learner English]] (ICLE),
  * [[http://dl.psnc.pl/activities/projekty/impact/results/| IMPACT ground-truth data]] for selected Polish historical documents from PIONIER Digital Libraries Federation,
Line 18: Line 17:
 * [[http://smyrna.danieljanus.pl/|Smyrna]], a simple, light-weight Polish concordancer,
 * [[http://dl.psnc.pl/activities/projekty/impact/results/| IMPACT ground-truth data]] for selected Polish historical documents from PIONIER Digital Libraries Federation.
 * [[http://smyrna.danieljanus.pl/|Smyrna]], a simple, light-weight Polish concordancer.
Line 135: Line 133:
 * [[http://www.nkjp.uni.lodz.pl/collocations.jsp|Kolokator]], a collocation extraction tool for NKJP data,

Language Tools and Resources for Polish

This page contains a list of publicly available language tools and resources.

Parallel corpora

Spoken corpora

Translation memories

  • MyMemory, a freely available multilingual TM,

  • TAUS Data, a multilingual TM from the members of TAUS Data Association.

Morphological tools and resources

Taggers

  • TaKIPI, a morphosyntactic tagger for Polish,

  • PANTERA, a morphosyntactic tagger for Polish,

  • WMBT, a morphosyntactic tagger for Polish.

Parsers, grammars, treebanks

Machine-readable dictionaries

Human-readable dictionaries

Speech analysis and synthesis tools

  • Skrybot, a commercial speech recognition system (L. Pawlaczyk, P. Bosky),

  • Ivona, a commercial text-to-speech system (Expressivo),

  • Acapela, a text-to-speech demo,

  • Synteza mowy polskiej, automatic speech recognition and speech synthesis demos, with background information (K. Szklanny),

  • System syntezy mowy ciągłej, a continuous speech synthesis system (G. Demenko, S. Grocholewski),

  • Polish MBROLA database (K. Szklanny, K. Marasek),

  • SynTalk, a commercial speech synthesis system (NeuroSoft),

  • PrimeSpeech, commercial speech recognition systems,

  • OrtFon, a phonetic transcriber (AGH DSP),

  • ASR, an automatic speech recognition system for Polish (AGH DSP),

  • Anotator, a speech corpus annotator dedicated to Polish and focused on connecting existing resources (AGH DSP).

Machine translation demonstrations

Other

  • Kolokacje, a Web crawler and collocation finder (A. Buczyński),

  • WSDDE, a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki et al.),

  • Frazeo, a search engine and clusterer of news in Polish (P. Pęzik),

  • Segment, a rule-based sentence tokenizer supporting the SRX standard (J. Lipski; the Polish rules are available in the LanguageTool project),

  • Toki, a tokenizer supporting the SRX standard, available as a C++ library and toolkit (T. Śniatowski and A. Radziszewski),

  • Translatica SRX sentence segmentation rules for Polish (LGPL),

  • Lakon, a system for news summarization (master's thesis by A. Dudczak),

  • SyMGIZA++, an extension of Giza++ that computes symmetric word alignment models,

  • Multiservice, a sample interface for running NLP Web services for Polish,

  • Hipisek, an experimental question answering system (M. Walas),

  • Narzędzia dygitalizacji tekstów (text digitization tools), Poliqarp for DjVu and other programs,

  • Nerf, a tool for named entity recognition, available under the GNU GPL v3.