The Polish Round Table Corpus / Korpus Okrągłego Stołu
The Polish Round Table Corpus (PRTC) is a dataset documenting the negotiations between the authorities of the communist People's Republic of Poland and a section of the opposition (the Solidarity movement, led by Lech Wałęsa), held between 6 February and 5 April 1989. The scanned transcripts were acquired from the Library of Polish Sejm (https://biblioteka.sejm.gov.pl/okragly_stol/), OCR-ed, manually corrected and indexed in a concordancer.
Searching the corpus
The corpus is currently available for online search at https://kos.nlp.ipipan.waw.pl/.
Acknowledgments
Preparation of the transcripts was financed by the European Regional Development Fund as part of the 2014-2020 Smart Growth Operational Programme, CLARIN - Common Language Resources and Technology Infrastructure, project no. POIR.04.02.00-00C002/19.