The Linguistic Engineering Group
The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS).
People
Anna Andrzejczuk, MSc
Leonard Bolc, PhD (Professor Emeritus)
Łukasz Degórski, MSc
Elżbieta Hajnicz, PhD
Łukasz Kobyliński, MSc
Anna Kupść, PhD (on leave)
Małgorzata Marciniak, PhD
Marcin Miłkowski, PhD (part time)
Agnieszka Mykowiecka, PhD
Maciej Ogrodniczuk, PhD
Jakub Piskorski, PhD (Associate)
Adam Przepiórkowski, PhD, Head of the Group
Piotr Rychlik, PhD
Tomek Strzałkowski, PhD, Foreign Associate
Łukasz Szałkiewicz, MSc
Stan Szpakowicz, PhD, Foreign Associate
Aleksander Wawer, MSc
Marcin Woliński, PhD
Alina Wróblewska, MSc
Research
The main research areas of the Group:
- (Polish) corpus linguistics; cf. the IPI PAN Corpus of Polish and the National Corpus of Polish,
- syntactic and semantic parsing of Polish; cf. Spejd and Świgra,
- extraction of linguistic knowledge from corpora,
- information extraction,
- sentiment analysis,
- morphosyntactic system of Polish,
- generative linguistic formalisms, especially HPSG and LFG.
The Group is a member of CLARIN, FLaReNet and META-NET.
Current externally funded projects
CESAR (CEntral and South-east europeAn Resources),
SYNAT (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society),
NEKST (An adaptive system to support problem-solving on the basis of document collections in the Internet),
ATLAS (Applied Technology for Language-Aided CMS),
Construction of a treebank for Polish using automatic syntactic analysis,
CLARIN (Common Language Resources and Technology Infrastructure),
NKJP (National Corpus of Polish).
Some of our past projects
Automatic detection of semantic dependencies within verb argument structures in large treebanks ‒ a national Ministry of Science and Higher Education habilitation grant (number N N516 0165 33), 2 November 2007 ‒ 1 November 2009. Polish title: Automatyczne wykrywanie zależności semantycznych w strukturze argumentowej czasowników w dużych korpusach tekstów anotowanych syntaktycznie. PI: Elżbieta Hajnicz.
LUNA (spoken Language UNderstanding in multilinguAl communication systems) ‒ a European (IST) Specific Targeted Research Project (contract number 033549), 4 September 2006 ‒ 3 September 2009. Polish PI: Agnieszka Mykowiecka.
Spoken language understanding in multilingual communication systems ‒ Ministry of Science and Higher Education support for Polish participation in the LUNA project, 1 March 2008 ‒ 1 September 2009. Polish title: Rozumienie mowy w wielojęzycznych systemach komunikacji. PI: Małgorzata Marciniak.
LT4eL (Language Technology for eLearning) ‒ a European (IST) Specific Targeted Research Project (contract number 027391), 1 December 2005 ‒ 31 May 2008. Polish PI: Adam Przepiórkowski.
Automatic extraction of linguistic knowledge from a large corpus of Polish ‒ a national Ministry of Science and Higher Education research grant (number 3T11C00328), 9 March 2005 ‒ 8 March 2008. Polish title: Automatyczna ekstrakcja wiedzy lingwistycznej z dużego korpusu języka polskiego. PI: Adam Przepiórkowski. TaKIPI, the first publicly available tagger of Polish, was originally developed within this project.
Information Extraction from Polish free text ‒ a national Ministry of Science and Higher Education research grant (number 3T11C00727), 20 October 2004 ‒ 19 October 2007. Polish title: Opracowanie narzędzi do ekstrakcji informacji z tekstów w języku polskim. PI: Agnieszka Mykowiecka.
The IPI PAN Corpus of Polish ‒ a national KBN grant (7T11C04320), 1 April 2001 ‒ 31 March 2004. Polish title: Anotowany korpus pisanego języka polskiego z dostępem przez internet (z uwzględnieniem zastosowań w inżynierii lingwistycznej). PI: Adam Przepiórkowski.
A Treebank / Test-Suite of Polish Utterances ‒ an EU CRIT-2 subproject (ICS-MM), 15 October 1997 ‒ 14 October 2000. Coordinator: Leonard Bolc.
An HPSG Grammar of Polish (theory and implementation) ‒ a national KBN grant (8T11C01110), 1 January 1996 ‒ 31 December 1998. Polish title: Zastosowanie metod inżynierii lingwistycznej do automatycznej analizy i syntezy tekstów języka polskiego. PI: Leonard Bolc.
Publicly available tools and resources
Here are some of the tools and resources created within our projects.
Tools (all open source, under GPL):
Świgra – a DCG parser,
Spejd – a shallow parsing and disambiguation system,
TaKIPI – a morphosyntactic tagger for Polish,
PANTERA – a morphosyntactic tagger for Polish,
Poliqarp – a corpus indexing and search engine,
Dendrarium – a treebank development system (under development),
Anotatornia – a system for multi-level manual annotation of corpora (forthcoming),
WSDDE – a system for designing and performing Word Sense Disambiguation experiments (forthcoming).
Resources:
National Corpus of Polish (under development).
Other activities
Links to some other activities of the Group:
Intelligent Information Systems series of conferences.