Diff for "Linguistic Engineering Group"

Differences between revisions 1 and 12 (spanning 11 versions)
Revision 1 as of 2011-03-07 11:49:32
Size: 11970
Comment:
Revision 12 as of 2011-06-20 17:23:27
Size: 7033
Comment:
Line 1: Line 1:
#acl CLIPWarszawaGroup:read,write All:read
Line 3: Line 4:
The Linguistic Engineering (LE) Group is part of the [[http://www.ipipan.waw.pl/en/dept/dept-ai.html|Department of Artificial Intelligence]] at the [[http://www.ipipan.waw.pl/en/|Institute of
Computer Science]], [[http://www.english.pan.pl/|Polish
Academy of Sciences]] (ICS PAS).
The Linguistic Engineering (LE) Group is part of the [[http://www.ipipan.waw.pl/en/dept/dept-ai.html|Department of Artificial Intelligence]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.english.pan.pl/|Polish Academy of Sciences]] (ICS PAS).
Line 10: Line 9:
|| Leonard Bolc, PhD (Professor Emeritus) || [[mailto:leonard.bolc@ipipan.waw.pl|leonard.bolc@ipipan.waw.pl]] || || Leonard Bolc, PhD, Professor Emeritus || [[mailto:leonard.bolc@ipipan.waw.pl|leonard.bolc@ipipan.waw.pl]] ||
Line 18: Line 17:
|| Maciej Ogrodniczuk, PhD || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
|| Jakub Piskorski, PhD (Associate) || [[mailto:jakub.piskorski@ipipan.waw.pl|jakub.piskorski@ipipan.waw.pl]] ||
|| [[../../../../~adamp/|Adam Przepiórkowski]]
, PhD, Head of the Group || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
||[[Maciej-Ogrodniczuk|Maciej Ogrodniczuk]], PhD || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
|| Jakub Piskorski, PhD, Associate || [[mailto:jakub.piskorski@ipipan.waw.pl|jakub.piskorski@ipipan.waw.pl]] ||
|| [[http://nlp.ipipan.waw.pl/~adamp/|Adam Przepiórkowski]], PhD,
Head of the Group || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
Line 45: Line 44:
 * ''[[http://ec.europa.eu/information_society/apps/projects/factsheet/index.cfm?project_ref=271022|CESAR]]'' (''CEntral and South-east europeAn Resources''; part of [[http://www.meta-net.eu/|META-NET]]) ‒ a European ([[http://ec.europa.eu/information_society/activities/ict_psp/index_en.htm|CIP ICT-PSP]]) project (grant agreement 271022), 1 February 2011 ‒ 31 January 2013. Polish PI: Adam Przepiórkowski.
 * ''[[http://www.elka.pw.edu.pl/pol/Media2/Serwis-Informacyjny/Aktualnosci/Projekt-SYNAT-System-Nauki-i-Techniki-realizowany-w-ramach-INFINITI-PASSIM|SYNAT]] (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society)'' – a [[http://en.ncbir.pl/|National Centre for Research and Development]] grant, 16 August 2010 – 16 August 2013. Polish title: ''Utworzenie uniwersalnej, otwartej, repozytoryjnej platformy hostingowej i komunikacyjnej dla sieciowych zasobów wiedzy dla nauki, edukacji i otwartego społeczeństwa wiedzy''. PI: Beata Konikowska.
 * ''[[http://www.ipipan.waw.pl/nekst/|Nekst]](An adaptive system to support problem-solving on the basis of document collections in the Internet)'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] Innovative Economy Operational Programme (PO IG) grant, 1 January 2010 ‒ 31 December 2013. Polish title: ''Adaptacyjny system wspomagający rozwiązywanie problemów w oparciu o analizę treści dostępnych źródeł elektronicznych''. PI: Jacek Koronacki.
 * ''[[http://www.atlasproject.eu/|ATLAS]]'' (''Applied Technology for Language-Aided CMS'') ‒ a European ([[http://ec.europa.eu/information_society/activities/ict_psp/index_en.htm|CIP ICT-PSP]]) project (grant agreement 250467), 1 March 2010 ‒ 28 February 2013. Polish PI: Adam Przepiórkowski.
 * ''Construction of a treebank for Polish using automatic syntactic analysis'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] research grant (number N N104 224735), 14 October 2008 ‒ 13 April 2011. Polish title: ''Budowa banku drzew składniowych dla języka polskiego z wykorzystaniem automatycznej analizy składniowej''. PI: Marcin Woliński.
 * ''[[http://www.clarin.eu/|CLARIN]] (Common Language Resources and Technology Infrastructure)'' ‒ a European ([[http://cordis.europa.eu/esfri/|ESFRI]]) infrastructure project, FP7 (contract number 212230), 1 January 2008 ‒ 31 December 2010 (plus 6 months extension). PI at ICS PAS: Adam Przepiórkowski.
 * ''[[http://nkjp.pl/|NKJP]] (National Corpus of Polish)'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] research/development grant (number R17 003 03), 13 December 2007 ‒ 12 December 2010 (plus 6 months extension). Polish title: ''Narodowy Korpus Języka Polskiego''. PI: Adam Przepiórkowski.
 * [[CORE]] (Computer-based methods for coreference resolution in Polish texts),
 * [[NEKST]] (An adaptive system to support problem-solving on the basis of document collections in the Internet),
 * [[SYNAT]] (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society),
 * [[ATLAS]] (Applied Technology for Language-Aided CMS),
 * [[CESAR]] (CEntral and South-east europeAn Resources),
 * [[Construction of a treebank for Polish using automatic syntactic analysis]],
 * [[CLARIN]] (Common Language Resources and Technology Infrastructure).
Line 56: Line 54:
 * ''Automatic detection of semantic dependencies within verb argument structures in large treebanks'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] habilitation grant (number N N516 0165 33), 2 November 2007 ‒ 1 November 2009. Polish title: ''Automatyczne wykrywanie zależności semantycznych w strukturze argumentowej czasowników w dużych korpusach tekstów anotowanych syntaktycznie''. PI: Elżbieta Hajnicz.
 * ''[[http://www.ist-luna.eu/|LUNA]] (spoken Language UNderstanding in multilinguAl communication systems)'' ‒ a European ( [[http://www.cordis.lu/ist/|IST]]) Specific Targeted Research Project (contract number 033549), 4 September 2006 ‒ 3 September 2009. Polish PI: Agnieszka Mykowiecka.
 * ''Spoken language understanding in multilingual communication systems'' ‒ a [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] support for the Polish participation in the [[http://www.ist-luna.eu/|LUNA]] project, 1 March 2008 ‒ 1 September 2009. Polish title: ''Rozumienie mowy w wielojęzycznych systemach komunikacji''. PI: Małgorzata Marciniak.
 * ''[[http://www.lt4el.eu/|LT4eL]] (Language Technology for eLearning)'' ‒ a European ( [[http://www.cordis.lu/ist/|IST]]) Specific Targeted Research Project (contract number 027391), 1 December 2005 ‒ 31 May 2008. Polish PI: Adam Przepiórkowski.
 * ''[[http://nlp.ipipan.waw.pl/PPJP/|Automatic extraction of linguistic knowledge from a large corpus of Polish]]'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] research grant (number 3T11C00328), 9 March 2005 ‒ 8 March 2008. Polish title: ''Automatyczna ekstrakcja wiedzy lingwistycznej z dużego korpusu języka polskiego''. PI: Adam Przepiórkowski. The first publicly available tagger of Polish, [[http://nlp.pwr.wroc.pl/takipi/|TaKIPI]] has originally been developed within this project.
 * ''Information Extraction from Polish free text'' ‒ a national [[http://www.eng.nauka.gov.pl/meinen/|Ministry of Science and Higher Education]] research grant (number 3T11C00727), 20 October 2004 ‒ 19 October 2007. Polish title: ''Opracowanie narzędzi do ekstrakcji informacji z tekstów w języku polskim''. PI: Agnieszka Mykowiecka.
 * ''[[http://korpus.pl/|The IPI PAN Corpus]] of Polish'' ‒ a national [[http://www.kbn.gov.pl/|KBN]] grant (7T11C04320), 1 April 2001 ‒ 31 March 2004. Polish title: ''Anotowany korpus pisanego języka polskiego z dostępem przez internet (z uwzględnieniem zastosowań w inżynierii lingwistycznej)''. PI: Adam Przepiórkowski.
 * ''A [[../../../CRIT2/|Treebank / Test-Suite of Polish Utterances]]'' ‒ a EU [[http://www.ipipan.waw.pl/en/research/grants-completed.html#euro|CRIT-2]] subproject (ICS-MM), 15 October 1997 ‒ 14 October 2000. Coordinator: Leonard Bolc.
 * ''An [[../../../HPSG/hpsg.html|HPSG Grammar of Polish]] (theory and [[../../../HPSG/PolishInHPSG.pl|implementation]])'' ‒ a national [[http://www.kbn.gov.pl/|KBN]] grant (8T11C01110), 1 January 1996 ‒ 31 December 1998. Polish title: ''Zastosowanie metod inżynierii lingwistycznej do automatycznej analizy i syntezy tekstów języka polskiego''. PI: Leonard Bolc.
 * [[NKJP]] (National Corpus of Polish),
 * [[Automatic detection of semantic dependencies within verb argument structures in large treebanks]],
 * [[LUNA]] (spoken Language UNderstanding in multilinguAl communication systems) with the Polish support,
 * [[LT4eL]] (Language Technology for eLearning),
 * [[Automatic extraction of linguistic knowledge from a large corpus of Polish]],
 * [[Information Extraction from Polish free text]],
 * [[IPI PAN Corpus|The IPI PAN Corpus of Polish]],
 * [[Test Suite of Polish Utterances|Treebank / Test Suite of Polish Utterances]],
 * [[HPSG Grammar of Polish]].
Line 95: Line 92:
 * [[../../../../../../seminar-e.html|NLP Seminar at IPI PAN]];  * [[http://nlp.ipipan.waw.pl/seminar-e.html|NLP Seminar at IPI PAN]];

The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS).

People

Anna Andrzejczuk, MSc

anna.andrzejczuk@ipipan.waw.pl

Leonard Bolc, PhD, Professor Emeritus

leonard.bolc@ipipan.waw.pl

Łukasz Degórski, MSc

ldegorski@bach.ipipan.waw.pl

Elżbieta Hajnicz, PhD

elzbieta.hajnicz@ipipan.waw.pl

Łukasz Kobyliński, MSc

lkobylinski@ipipan.waw.pl

Anna Kupść, PhD (on leave)

anna.kupsc@ipipan.waw.pl

Małgorzata Marciniak, PhD

malgorzata.marciniak@ipipan.waw.pl

Marcin Miłkowski, PhD (part time)

marcin.milkowski@ifispan.waw.pl

Agnieszka Mykowiecka, PhD

agnieszka.mykowiecka@ipipan.waw.pl

Maciej Ogrodniczuk, PhD

maciej.ogrodniczuk@ipipan.waw.pl

Jakub Piskorski, PhD, Associate

jakub.piskorski@ipipan.waw.pl

Adam Przepiórkowski, PhD, Head of the Group

adam.przepiorkowski@ipipan.waw.pl

Piotr Rychlik, PhD

rychlik@ipipan.waw.pl

Tomek Strzałkowski, PhD, Foreign Associate

tomek@cs.albany.edu

Łukasz Szałkiewicz, MSc

lukasz.szalkiewicz@ipipan.waw.pl

Stan Szpakowicz, PhD, Foreign Associate

szpak@site.uottawa.ca

Aleksander Wawer, MSc

aleksander.wawer@ipipan.waw.pl

Marcin Woliński, PhD

marcin.wolinski@ipipan.waw.pl

Alina Wróblewska, MSc

alina.wroblewska@ipipan.waw.pl

Research

The main research areas of the Group are:

  • (Polish) corpus linguistics; cf. the IPI PAN Corpus of Polish and the National Corpus of Polish,

  • syntactic and semantic parsing of Polish; cf. Spejd and Świgra,

  • extraction of linguistic knowledge from corpora,
  • information extraction,
  • sentiment analysis,
  • the morphosyntactic system of Polish,
  • generative linguistic formalisms, especially HPSG and LFG.

The Group is a member of CLARIN, FLaReNet and META-NET.

Current externally funded projects

  • CORE (Computer-based methods for coreference resolution in Polish texts),

  • NEKST (An adaptive system to support problem-solving on the basis of document collections in the Internet),

  • SYNAT (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society),

  • ATLAS (Applied Technology for Language-Aided CMS),

  • CESAR (CEntral and South-east europeAn Resources),

  • Construction of a treebank for Polish using automatic syntactic analysis,

  • CLARIN (Common Language Resources and Technology Infrastructure).

Some of our past projects

  • NKJP (National Corpus of Polish),

  • Automatic detection of semantic dependencies within verb argument structures in large treebanks,

  • LUNA (spoken Language UNderstanding in multilinguAl communication systems), with Polish support,

  • LT4eL (Language Technology for eLearning),

  • Automatic extraction of linguistic knowledge from a large corpus of Polish,

  • Information Extraction from Polish free text,

  • The IPI PAN Corpus of Polish,

  • Treebank / Test Suite of Polish Utterances,

  • HPSG Grammar of Polish.

Publicly available tools and resources

Here are some of the tools and resources created within our projects.

Tools (all open source, under GPL):

  • Świgra – a DCG parser,

  • Spejd – a shallow parsing and disambiguation system,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • PANTERA – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia – a system for multi-level manual annotation of corpora (forthcoming),

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments (forthcoming),

  • etc.

Resources:

Other activities

Links to some other activities of the Group: