plWordNet 2.0
Project factsheet
English title:
Polish title: Półautomatyczna konstrukcja zasobów leksykalnych przez rozpoznawanie relacji semantycznych na podstawie danych morfo-syntaktycznych i semantycznych w korpusach tekstu
Project type: A national Ministry of Science and Higher Education research grant (number N N516 068637)
Duration: 1 October 2009 ‒ 30 September 2012
Project Web page:
Principal investigator: Maciej Piasecki
Institution: Institute of Applied Informatics, Wrocław University of Technology
Project description
This project is a continuation of the earlier project (3 T11C 018 29) on the construction of plWordNet 1.0, the first publicly available wordnet for Polish. The main goal is to extend and improve the Distributional Semantics methods and pattern-based methods developed in that project, and to build a complex, semi-automatic system that supports linguists working on plWordNet construction.
The second goal is to extend plWordNet 1.0 to 70000-80000 lexical units (pairs of lemma and sense number) and 45000-55000 synsets.
The main objective of both projects was to construct a Polish WordNet as economically as possible.
Polish WordNet is a network of lexical-semantic relations, an electronic thesaurus with a structure modelled on that of the Princeton WordNet and those constructed in the EuroWordNet project. Polish WordNet describes the meaning of a lexical unit, which may consist of one or more words, by placing it in a network of links that represent relations such as synonymy, hypernymy and meronymy.
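The structure described above can be pictured with a minimal Python sketch. It is only an illustration and not the plWordNet data model or software: the class names `LexicalUnit` and `MiniWordnet`, the synset identifiers, and the sample Polish entries are all assumptions made for this example. A lexical unit is modelled as a (lemma, sense number) pair, synonymy as membership in the same synset, and hypernymy and meronymy as labelled links between synsets.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalUnit:
    """A lexical unit: a lemma paired with a sense number."""
    lemma: str
    sense: int


class MiniWordnet:
    """Hypothetical toy wordnet: synsets plus labelled links between them."""

    def __init__(self):
        self.synset_of = {}                  # LexicalUnit -> synset id
        self.members = defaultdict(set)      # synset id -> set of LexicalUnit
        self.relations = defaultdict(set)    # (synset id, relation) -> target synset ids

    def add_synset(self, synset_id, units):
        for unit in units:
            self.synset_of[unit] = synset_id
            self.members[synset_id].add(unit)

    def relate(self, source, relation, target):
        # For "hypernymy", target is the more general synset of source.
        self.relations[(source, relation)].add(target)

    def synonyms(self, unit):
        # Synonyms are the other units sharing the same synset.
        return self.members[self.synset_of[unit]] - {unit}

    def hypernym_chain(self, synset_id):
        # Follow the first hypernym link upward until none is left.
        chain, seen, current = [], {synset_id}, synset_id
        while True:
            parents = sorted(self.relations.get((current, "hypernymy"), ()))
            if not parents or parents[0] in seen:
                return chain
            current = parents[0]
            seen.add(current)
            chain.append(current)


# Illustrative entries only; they are not taken from plWordNet.
wn = MiniWordnet()
wn.add_synset("car", [LexicalUnit("samochód", 1), LexicalUnit("auto", 1)])
wn.add_synset("vehicle", [LexicalUnit("pojazd", 1)])
wn.add_synset("wheel", [LexicalUnit("koło", 1)])
wn.relate("car", "hypernymy", "vehicle")
wn.relate("wheel", "meronymy", "car")

print(wn.synonyms(LexicalUnit("samochód", 1)))
print(wn.hypernym_chain("car"))
```

Keeping relations between synsets rather than between individual units is one possible design choice; a real wordnet may also record relations at the level of lexical units.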
To reduce the cost of the project, Polish WordNet was built semi-automatically. Lexical relations were automatically recognized in large corpora of Polish (e.g., the IPI PAN Corpus) and suggested to linguists/lexicographers via a graphical interface.
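The general idea of recognizing relation candidates in text and passing them to a human for review can be sketched as follows. This is not the project's method: it assumes a single lexico-syntactic pattern ("X such as Y") applied to English toy sentences, skips lemmatization and morpho-syntactic analysis entirely, and uses a hypothetical function name `suggest_hypernym_pairs` and threshold `min_count`. Candidate pairs are counted and only the more frequent ones are kept as suggestions for a lexicographer to accept or reject.

```python
import re
from collections import Counter

# Assumed pattern for illustration: "<hypernym> such as <hyponym>".
PATTERN = re.compile(r"(\w+) such as (\w+)")


def suggest_hypernym_pairs(sentences, min_count=2):
    """Return (hyponym, hypernym) candidates seen at least min_count times."""
    counts = Counter()
    for sentence in sentences:
        for hypernym, hyponym in PATTERN.findall(sentence.lower()):
            counts[(hyponym, hypernym)] += 1
    # Most frequent candidates first; rare ones are dropped as too noisy.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Toy corpus, invented for this example.
toy_corpus = [
    "Vehicles such as cars need fuel.",
    "They sell vehicles such as cars.",
    "Fruits such as apples are sweet.",
]

suggestions = suggest_hypernym_pairs(toy_corpus)
for (hyponym, hypernym), n in suggestions:
    print(f"suggest: {hyponym} -> {hypernym} (seen {n} times)")
```

Because the sketch does no lemmatization, inflected forms such as "cars" are kept as they appear; for a highly inflected language like Polish, matching on lemmas would be needed before such counts become meaningful.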