Project factsheet
English name: CEGLEX
Polish name: CEGLEX
Project type: 4th Framework Programme, COPERNICUS Project 1032
Duration: April 1995 ‒ April 1996
Principal investigator: Antoine Ogonowski
Polish partners involved
Adam Mickiewicz University, Poznań (PI: Zygmunt Vetulani)
Project description
The main goal of the CEGLEX consortium was to test the GENELEX proposal for a generic model of reusable lexicons, first implemented for several Western European languages (French, English, German, Italian, ...), on three further languages spoken in Central Europe: Czech, Hungarian and Polish. The generic model takes the form of an SGML DTD in which linguistic information is associated with dictionary units through SGML tags and attributes. The model presupposes a universe composed of "elements". Elements are complex: an element may, for example, be a part of other elements. An element definition can be thought of as characterizing a class of individuals of the same type, labeled by an element name. Each element has a characteristic set of attributes and an associated "is-a-part" relation pattern. The notion of element corresponds to the classical notion of "category". The CEGLEX/GENELEX model claims to be:
- theory-welcoming, i.e. open to various theoretical approaches,
- complete, i.e. covering all relevant phenomena on the three classical layers of linguistic description of lexemes that may be represented in electronic dictionaries: the morphological, syntactic and semantic layers,
- easily transportable.
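The notion of "element" described above (a named class of individuals with a characteristic attribute set and an "is-a-part" pattern) can be illustrated with a minimal Python sketch. This is not the GENELEX DTD and does not reproduce any of its tags: the element names `Entry` and `Form`, their attributes, and the `validate` helper are hypothetical, chosen only to show how an attribute set and a part-of pattern constrain individuals.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ElementType:
    """A class of individuals of the same type, labeled by an element name."""
    name: str
    attributes: frozenset              # attribute names characteristic of the element
    parts: frozenset = frozenset()     # names of element types allowed as its parts


@dataclass
class Element:
    """One individual belonging to an element type."""
    etype: ElementType
    values: dict = field(default_factory=dict)     # attribute name -> value
    children: list = field(default_factory=list)   # elements that are parts of this one


def validate(element):
    """Return a list of violations of the attribute set and is-a-part pattern."""
    problems = []
    # Attributes not declared for this element type are rejected.
    for attr in sorted(set(element.values) - element.etype.attributes):
        problems.append(f"{element.etype.name}: unexpected attribute {attr!r}")
    # Each child must be an allowed part, and must itself be valid.
    for child in element.children:
        if child.etype.name not in element.etype.parts:
            problems.append(
                f"{element.etype.name}: {child.etype.name} is not an allowed part"
            )
        problems.extend(validate(child))
    return problems


# Hypothetical element types (assumptions for illustration only).
FORM = ElementType("Form", frozenset({"spelling", "case", "number"}))
ENTRY = ElementType("Entry", frozenset({"id", "pos"}), frozenset({"Form"}))

# A well-formed individual: an entry with one inflected form as its part.
entry = Element(
    ENTRY,
    {"id": "e1", "pos": "noun"},
    [Element(FORM, {"spelling": "dom", "case": "nominative", "number": "singular"})],
)

# An ill-formed individual: a form may not contain an entry.
bad = Element(FORM, {"spelling": "dom"}, [Element(ENTRY, {"id": "e2"})])
```

Calling `validate(entry)` yields no violations, while `validate(bad)` reports that `Entry` is not an allowed part of `Form`. In an SGML DTD the same two constraints would be carried by attribute-list and content-model declarations rather than by a runtime check.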
The three layers of the CEGLEX/GENELEX model were tested against data from the languages under consideration, with generally positive results, especially for Czech and Polish. For Polish, this testing consisted of adapting the model to the Polish data. In the process, some modifications were proposed, in particular concerning the representation of inflection as described in the POLEX project.