Diff for "LRT"

Differences between revisions 136 and 138 (spanning 2 versions)
Revision 136 as of 2012-02-26 11:01:35
Size: 13742
Comment:
Revision 138 as of 2012-02-27 10:26:33
Size: 13092
Comment:
Line 9: Line 9:
  * [[http://www.nkjp.uni.lodz.pl/collocations.jsp|Kolokator]], a collocation extraction tool for NKJP data,
  * [[http://nlp.ipipan.waw.pl/TEI4NKJP/|TEI4NKJP]], a collection of XML schemata used in NKJP,
Line 12: Line 10:
  * [[attachment:gramatyka_Spejd_NKJP_RC1.0.zip]], a release candidate of a shallow [[Spejd]] grammar for NKJP, available on GNU GPL v.3,
  * [[Nerf]], a tool for named entity recognition, available on GNU GPL v.3,
  * report errors in:
   * [[https://docs.google.com/spreadsheet/viewform?hl=pl&formkey=dERoLWhzYWNveXlvS09ZMDlRNmcydVE6MQ#gid=0|the 1-million word subcorpus]],
   * [[https://docs.google.com/spreadsheet/viewform?hl=pl&formkey=dDgtbVpRTGFYWEROcGVxSVd6VGdZMGc6MA#gid=0|the full NKJP]],
Line 27: Line 20:
 * [[http://dl.psnc.pl/activities/projekty/impact/results/|Wersje cyfrowe wybranych polskich dokumentów historycznych]]
Line 142: Line 133:
 * [[http://hipisek.pl|Hipisek]], an experimental question answering system (M. Walas).
 * [[https://bitbucket.org/jsbien/ndt|Narzędzia dygitalizacji tekstów]], Poliqarp for !DjVu i inne programy.
 * [[http://hipisek.pl|Hipisek]], an experimental question answering system (M. Walas),
 * [[https://bitbucket.org/jsbien/ndt|Narzędzia dygitalizacji tekstów]], Poliqarp for !DjVu i inne programy,
 * [[http://www.nkjp.uni.lodz.pl/collocations.jsp|Kolokator]], a collocation extraction tool for NKJP data,
 * [[Nerf]], a tool for named entity recognition, available on GNU GPL v.3.

Language Tools and Resources for Polish

This page contains a list of publicly available language tools and resources.

Parallel corpora

Spoken corpora

Translation memories

  • MyMemory, a freely available multilingual TM,

  • TAUS Data, a multilingual TM from the members of TAUS Data Association.

Morphological tools and resources

Taggers

  • TaKIPI, a morphosyntactic tagger for Polish,

  • PANTERA, a morphosyntactic tagger for Polish,

  • WMBT, a morphosyntactic tagger for Polish.

Parsers, grammars, treebanks

Machine-readable dictionaries

Human-readable dictionaries

Speech analysis and synthesis tools

  • Skrybot, a commercial speech recognition system (L. Pawlaczyk, P. Bosky),

  • Ivona, a commercial text-to-speech system (Expressivo),

  • Acapela, a text-to-speech demo,

  • Synteza mowy polskiej, automatic speech recognition and speech synthesis demos, with background information (K. Szklanny),

  • System syntezy mowy ciągłej, a continuous speech synthesis system (G. Demenko, S. Grocholewski),

  • Polish MBROLA database (K. Szklanny, K. Marasek),

  • SynTalk, a commercial speech synthesis system (NeuroSoft),

  • PrimeSpeech, commercial speech recognition systems,

  • OrtFon, a phonetic transcriber (AGH DSP),

  • ASR, an automatic speech recognition system for Polish (AGH DSP),

  • Anotator, a speech corpus annotator dedicated to Polish and focused on connecting existing resources (AGH DSP).

Machine translation demonstrations

Other

  • Kolokacje, a Web crawler and collocation finder (A. Buczyński),

  • WSDDE, a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki et al.),

  • Frazeo, a search engine and clusterer of news in Polish (P. Pęzik),

  • Segment, a rule-based sentence tokenizer supporting the SRX standard (J. Lipski; the Polish rules are available in the LanguageTool project),

  • Toki, a tokenizer supporting the SRX standard, available as a C++ library and toolkit (T. Śniatowski and A. Radziszewski),

  • Translatica SRX sentence segmentation rules for Polish (LGPL),

  • Lakon, a system for news summarization (master's thesis by A. Dudczak),

  • SyMGIZA++, an extension of Giza++ that computes symmetric word alignment models,

  • Multiservice, a sample interface for running NLP Web services for Polish,

  • Hipisek, an experimental question answering system (M. Walas),

  • Narzędzia dygitalizacji tekstów (text digitization tools), Poliqarp for DjVu and other programs,

  • Kolokator, a collocation extraction tool for NKJP data,

  • Nerf, a tool for named entity recognition, available under the GNU GPL v.3.