
Diff for "LRT"

Differences between revisions 38 and 42 (spanning 4 versions)
Revision 38 as of 2011-04-13 20:02:59
Size: 6032
Revision 42 as of 2011-04-18 13:31:27
Size: 7194
Line 18: Line 18:
 * [[http://parasol.unibe.ch|ParaSol]] – a parallel corpus of Slavic and other languages,
 * [[http://www.domeczek.pl/~polukr/index.php?option=search|PolUKR]] – a Polish-Ukrainian parallel corpus,
Line 19: Line 21:
 * [[http://nl.ijs.si/ME/V4|"1984"]] - an annotated parallel corpus of George Orwell's "1984" in 15 languages, MULTEXT-East, v.4 (licensed download),
Line 28: Line 31:
 * [[http://nl.ijs.si/ME/V4/msd/html|MULTEXT-East, v.4 ]] - morphosyntactic specifications and documentation for 16 languages,
 * [[http://www.domeczek.pl/~polukr/mte-conv|KIPI->MTE]] - a converter from the IPI PAN Corpus to the MULTEXT-East morphosyntactic format (A. Radziszewski, N. Kotsyba),
Line 76: Line 81:
 * [[http://nlp.ipipan.waw.pl/WSDDE/|WSDDE]] – a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki ''et al.'').
 * [[http://nlp.ipipan.waw.pl/WSDDE/|WSDDE]] – a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki ''et al.''),
 * [[http://frazeo.pl/|Frazeo]], a search engine and clusterer of news in Polish (P. Pęzik),
 * [[http://segment.sourceforge.net/|Segment]], a rule-based sentence tokenizer supporting SRX standard (J. Lipski; the Polish rules are available in [[http://languagetool.svn.sourceforge.net/viewvc/languagetool/trunk/JLanguageTool/src/resource/segment.srx|LanguageTool project]]),
 * [[http://www.cs.put.poznan.pl/dweiss/research/lakon/|Lakon]], a system for news summarization (master's thesis by A. Dudczak).

Language Tools and Resources for Polish

This page contains a list of publicly available language tools and resources.

Parallel corpora

Morphological tools and resources

Taggers

  • TaKIPI – a morphosyntactic tagger for Polish,

  • PANTERA – a morphosyntactic tagger for Polish,

  • a prototype implementation of Maximum Entropy tagging, created as part of Radomir Mastalerz's MSc work.

Parsers, grammars, treebanks

Machine-readable dictionaries

Human-readable dictionaries

Speech analysis and synthesis tools

  • Skrybot – a commercial speech recognition system (L. Pawlaczyk, P. Bosky),

  • Ivona – a commercial text-to-speech system (Expressivo).

Machine translation demonstrations

Other

  • Kolokacje, a Web crawler and collocation finder (A. Buczyński),

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments (R. Młodzki et al.),

  • Frazeo, a search engine and clusterer of news in Polish (P. Pęzik),

  • Segment, a rule-based sentence tokenizer supporting the SRX standard (J. Lipski; the Polish rules are available in the LanguageTool project),

  • Lakon, a system for news summarization (master's thesis by A. Dudczak).