Project factsheet

English name:


Polish name:


Project type:

A national KBN grant (8S50301007)


1993 ‒ 1996

Project Web page:


Principal investigator:

Zygmunt Vetulani


Adam Mickiewicz University, Poznań

Project description

The main objective of POLEX was to fill a gap in the language engineering resources available for Polish, particularly the morphological electronic dictionaries needed to design applications that involve language processing. POLEX aimed to create a morphological dictionary of the core Polish vocabulary of general interest, based on a precise, machine-interpretable formalism (coding system). The authors also considered it important that the formalism be transparent and readable for humans. This aspect is sometimes treated as secondary from the point of view of automatic processing, but it matters for the maintenance and further development of electronic resources (openness). Compatibility with the LADL-style dictionaries currently being developed for various languages (e.g. French, English, German, Italian, Hungarian, Serbo-Croatian, Modern Greek) was also a desired outcome. The project benefited from the experience of, and collaboration with, the Laboratoire d'Automatique Documentaire et Linguistique (Université Paris 7, M. Gross, E. Laporte), which contributed mainly methodological support.

POLEX takes into account the results of traditional descriptive research on Polish morphology. Polish is in fact one of the well-described languages of the Indo-European family, but most existing works were written for human readers, so the existing (paper) dictionaries cannot be interpreted directly by the available software. The same applies to the existing descriptions and classifications, especially those for nouns, which are essentially based on the rule/exception methodology. Improving this situation was the main challenge of the project.

The main achievements of the project included: elaboration of a precise and computationally appropriate data format; adoption of a precise morphological classification system for lexemes (for nouns, cf. publication below); development of a methodology for constructing very large electronic dictionaries; and data encoding for the core Polish vocabulary (ca. 80,000 items at the end of the project). Some of these results were successfully applied in the CEGLEX and GRAMLEX projects. The dictionary data obtained within POLEX were later extended (up to 120,000 entries) and continue to be maintained by LEX S.C.