Linguistic Engineering Projects
This page contains a list of externally financed projects relevant to language resources and technologies carried out by Polish research institutions.
Ending in…
2026
UniDive (Universality, diversity and idiosyncrasy in language technology)
2023
CLARIN-PL-Biz (Common Language Resources and Technology Infrastructure)
DARIAH.Lab (Digital Research Infrastructure for the Arts and Humanities DARIAH-PL)
2022
HOMADOS (Hampering Misinformation by Assessing Credibility of Online Sources)
CURLICAT (Curated Multilingual Language Resources for CEF AT)
2021
CLARIN-PL (Polish chapter of Common Language Resources and Technology Infrastructure)
CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)
Kwantyfikatory w języku: użycie i znaczenie (Quantifiers in Language: Use and Meaning)
2020
MARCELL (Multilingual Resources for CEF.AT in the legal domain)
2019
Chronofleks (A diachronic formal model of Polish inflection and its implementation)
CoDeS (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)
ELRC (European Language Resource Coordination)
Parthenos (Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies)
Scwad (Compositional distributional modelling of Polish language semantics)
SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)
2018
COTHEC (Computable theory of coreference)
CLARIN-PL (Common Language Resources and Technology Infrastructure)
KORBA (Electronic corpus of 17th and 18th century Polish texts)
TextLink (Structuring Discourse in Multilingual Europe)
2017
PARSEME (PARSing and Multi-word Expressions. Towards linguistic precision and computational efficiency in natural language processing)
2016
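CLARIN-PL (Common Language Resources and Technology Infrastructure)
Opracowanie systemu informatycznego umożliwiającego identyfikację głosową osób dzwoniących pod numer alarmowy (Development of an IT system for voice identification of callers to the emergency number)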
OPTA (Identifying OPinion TArgets and opinion expressions in Polish texts)
2015
Lingwistyczny Warsztat (Linguistic workshop for speech analysis and recognition)
SpeechSamples (Cross-language comparative analysis of phonemes)
TELEDS (Human-computer dialog system based on telephone connection)
Readability (Measuring the degree of readability of nonliterary Polish texts)
2014
NEKST (An adaptive system to support problem-solving on the basis of document collections in the Internet)
CORE (Computer-based methods for coreference resolution in Polish texts)
Biometric voice verification and identification
TrendMiner (Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams)
2013
ATLAS (Applied Technology for Language-Aided CMS)
CESAR (CEntral and South-east europeAn Resources), part of META-NET
Automatic detection and correction of annotation errors in Polish language corpora
Polish-Russian Parallel Corpus
psi-toolkit (Publicly available tools for automatic processing of Polish language)
SYNAT (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society)
2012
IMPACT (Improving Access to Text)
Text digitalization tools for philological research
plWordNet 2.0
2011
CLARIN (Common Language Resources and Technology Infrastructure)
Construction of a treebank for Polish using automatic syntactic analysis
FLaReNet (Fostering Language Resources Network)
NKJP (National Corpus of Polish)
Speech Technology Systems for Public Security Systems
2010
MONDILEX (Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources)
2009
Automatic detection of semantic dependencies within verb argument structures in large treebanks
LUNA (spoken Language UNderstanding in multilinguAl communication systems)
RAMKI (Polish Framenet)
2008
Automatic extraction of linguistic knowledge from a large corpus of Polish
LT4eL (Language Technology for eLearning)
plWordNet (Automatic methods of constructing a semantic network of Polish lexemes for natural language processing)
2007
Information Extraction from Polish free text
2004
The IPI PAN Corpus of Polish
2000
Test-Suite of Polish Utterances
1999
Test Suites for Validation and Evaluation of Polish Language Parsers
1998
GRAMLEX
HPSG Grammar of Polish (theory and implementation)
1997