Differences between revisions 4 and 5

The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS).

People

Anna Andrzejczuk, MSc	anna.andrzejczuk@ipipan.waw.pl
Leonard Bolc, PhD (Professor Emeritus)	leonard.bolc@ipipan.waw.pl
Łukasz Degórski, MSc	ldegorski@bach.ipipan.waw.pl
Elżbieta Hajnicz, PhD	elzbieta.hajnicz@ipipan.waw.pl
Łukasz Kobyliński, MSc	lkobylinski@ipipan.waw.pl
Anna Kupść, PhD (on leave)	anna.kupsc@ipipan.waw.pl
Małgorzata Marciniak, PhD	malgorzata.marciniak@ipipan.waw.pl
Marcin Miłkowski, PhD (part time)	marcin.milkowski@ifispan.waw.pl
Agnieszka Mykowiecka, PhD	agnieszka.mykowiecka@ipipan.waw.pl
Maciej Ogrodniczuk, PhD	maciej.ogrodniczuk@ipipan.waw.pl
Jakub Piskorski, PhD (Associate)	jakub.piskorski@ipipan.waw.pl
Adam Przepiórkowski, PhD, Head of the Group	adam.przepiorkowski@ipipan.waw.pl
Piotr Rychlik, PhD	rychlik@ipipan.waw.pl
Tomek Strzałkowski, PhD, Foreign Associate	tomek@cs.albany.edu
Łukasz Szałkiewicz, MSc	lukasz.szalkiewicz@ipipan.waw.pl
Stan Szpakowicz, PhD, Foreign Associate	szpak@site.uottawa.ca
Aleksander Wawer, MSc	aleksander.wawer@ipipan.waw.pl
Marcin Woliński, PhD	marcin.wolinski@ipipan.waw.pl
Alina Wróblewska, MSc	alina.wroblewska@ipipan.waw.pl

Research

The main research areas of the Group

(Polish) corpus linguistics; cf. the IPI PAN Corpus of Polish and the National Corpus of Polish,
syntactic and semantic parsing of Polish; cf. Spejd and Świgra,
extraction of linguistic knowledge from corpora,
information extraction,
sentiment analysis,
the morphosyntactic system of Polish,
generative linguistic formalisms, esp. HPSG and LFG.

The Group is a member of CLARIN, FLaReNet and META-NET.

Current externally funded projects

CESAR (CEntral and South-east europeAn Resources),
SYNAT (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society),
NEKST (An adaptive system to support problem-solving on the basis of document collections in the Internet),
ATLAS (Applied Technology for Language-Aided CMS),
Construction of a treebank for Polish using automatic syntactic analysis,
CLARIN (Common Language Resources and Technology Infrastructure),
NKJP (National Corpus of Polish).

Some of our past projects

Automatic detection of semantic dependencies within verb argument structures in large treebanks ‒ a national Ministry of Science and Higher Education habilitation grant (number N N516 0165 33), 2 November 2007 ‒ 1 November 2009. Polish title: Automatyczne wykrywanie zależności semantycznych w strukturze argumentowej czasowników w dużych korpusach tekstów anotowanych syntaktycznie. PI: Elżbieta Hajnicz.
LUNA (spoken Language UNderstanding in multilinguAl communication systems) ‒ a European (IST) Specific Targeted Research Project (contract number 033549), 4 September 2006 ‒ 3 September 2009. Polish PI: Agnieszka Mykowiecka.
Spoken language understanding in multilingual communication systems ‒ Ministry of Science and Higher Education support for Polish participation in the LUNA project, 1 March 2008 ‒ 1 September 2009. Polish title: Rozumienie mowy w wielojęzycznych systemach komunikacji. PI: Małgorzata Marciniak.
LT4eL (Language Technology for eLearning) ‒ a European (IST) Specific Targeted Research Project (contract number 027391), 1 December 2005 ‒ 31 May 2008. Polish PI: Adam Przepiórkowski.
Automatic extraction of linguistic knowledge from a large corpus of Polish ‒ a national Ministry of Science and Higher Education research grant (number 3T11C00328), 9 March 2005 ‒ 8 March 2008. Polish title: Automatyczna ekstrakcja wiedzy lingwistycznej z dużego korpusu języka polskiego. PI: Adam Przepiórkowski. TaKIPI, the first publicly available tagger of Polish, was originally developed within this project.
Information Extraction from Polish free text ‒ a national Ministry of Science and Higher Education research grant (number 3T11C00727), 20 October 2004 ‒ 19 October 2007. Polish title: Opracowanie narzędzi do ekstrakcji informacji z tekstów w języku polskim. PI: Agnieszka Mykowiecka.
The IPI PAN Corpus of Polish ‒ a national KBN grant (7T11C04320), 1 April 2001 ‒ 31 March 2004. Polish title: Anotowany korpus pisanego języka polskiego z dostępem przez internet (z uwzględnieniem zastosowań w inżynierii lingwistycznej). PI: Adam Przepiórkowski.
A Treebank / Test-Suite of Polish Utterances ‒ an EU CRIT-2 subproject (ICS-MM), 15 October 1997 ‒ 14 October 2000. Coordinator: Leonard Bolc.
An HPSG Grammar of Polish (theory and implementation) ‒ a national KBN grant (8T11C01110), 1 January 1996 ‒ 31 December 1998. Polish title: Zastosowanie metod inżynierii lingwistycznej do automatycznej analizy i syntezy tekstów języka polskiego. PI: Leonard Bolc.

Publicly available tools and resources

Here are some of the tools and resources created within our projects.

Tools (all open source, under GPL):

Świgra – a DCG parser,
Spejd – a shallow parsing and disambiguation system,
TaKIPI – a morphosyntactic tagger for Polish,
PANTERA – a morphosyntactic tagger for Polish,
Poliqarp – a corpus indexing and search engine,
Dendrarium – a treebank development system (under development),
Anotatornia – a system for multi-level manual annotation of corpora (forthcoming),
WSDDE – a system for designing and performing Word Sense Disambiguation experiments (forthcoming),
etc.

Resources:

IPI PAN Corpus of Polish,
National Corpus of Polish (under development).

Other activities

Links to some other activities of the Group:

NLP Seminar at IPI PAN;
Intelligent Information Systems series of conferences.

- Revision 4 as of 2011-03-18 11:15:19
  Size: 9600
  Editor: MaciejOgrodniczuk
  Comment:
+ Revision 5 as of 2011-03-18 19:32:09
  Size: 9645
  Editor: AdamPrzepiorkowski
  Comment:
 Line 18:
-|| [[../../../../~adamp/|Adam Przepiórkowski]], PhD, Head of the Group             || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
+|| [[http://nlp.ipipan.waw.pl/~adamp/|Adam Przepiórkowski]], PhD, Head of the Group             || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
 Line 60:
- * ''A [[../../../CRIT2/|Treebank / Test-Suite of Polish Utterances]]'' ‒ a EU [[http://www.ipipan.waw.pl/en/research/grants-completed.html#euro|CRIT-2]] subproject (ICS-MM), 15 October 1997 ‒ 14 October 2000. Coordinator: Leonard Bolc.
- * ''An [[../../../HPSG/hpsg.html|HPSG Grammar of Polish]] (theory and  [[../../../HPSG/PolishInHPSG.pl|implementation]])'' ‒ a national [[http://www.kbn.gov.pl/|KBN]] grant (8T11C01110), 1 January 1996 ‒ 31 December 1998.  Polish title: ''Zastosowanie metod inżynierii lingwistycznej do automatycznej analizy i syntezy tekstów języka polskiego''. PI: Leonard Bolc.
+ * ''A [[http://nlp.ipipan.waw.pl/CRIT2/|Treebank / Test-Suite of Polish Utterances]]'' ‒ a EU [[http://www.ipipan.waw.pl/en/research/grants-completed.html#euro|CRIT-2]] subproject (ICS-MM), 15 October 1997 ‒ 14 October 2000. Coordinator: Leonard Bolc.
+ * ''An [[http://nlp.ipipan.waw.pl/HPSG/hpsg.html|HPSG Grammar of Polish]] (theory and  [[../../../HPSG/PolishInHPSG.pl|implementation]])'' ‒ a national [[http://www.kbn.gov.pl/|KBN]] grant (8T11C01110), 1 January 1996 ‒ 31 December 1998.  Polish title: ''Zastosowanie metod inżynierii lingwistycznej do automatycznej analizy i syntezy tekstów języka polskiego''. PI: Leonard Bolc.
