
Diff for "Converters4NKJP"

Differences between revisions 1 and 6 (spanning 5 versions)
Revision 1 as of 2011-09-26 12:10:07
Size: 335
Comment:
Revision 6 as of 2011-09-28 11:36:19
Size: 989
Comment:
Line 3: Line 3:
== Converters from the output of [[http://nlp.ipipan.waw.pl/Anotatornia/|Anotatornia]] ==

This page (under construction as of the end of September 2011) collects converters from and to the [[http://nlp.ipipan.waw.pl/TEI4NKJP/|TEI4NKJP]] XML format, as used in the [[http://nkjp.pl/|National Corpus of Polish]].

== Converters from the output of Anotatornia to TEI NKJP ==

The format evolved during the project, and the final NKJP TEI differs slightly from the output of Anotatornia (see [[http://nlp.ipipan.waw.pl/Anotatornia/]]). To 'upgrade', use the following scripts:
Line 8: Line 12:

The scripts were meant to be simple. modify-tei-morphosyntax.pl reports a fatal error by printing a line to the output file, which leaves the XML not well-formed. In all cases, validate the resulting files.

Converters for NKJP formats

This page (under construction as of the end of September 2011) collects converters from and to the TEI4NKJP XML format, as used in the National Corpus of Polish.

Converters from the output of Anotatornia to TEI NKJP

The format evolved during the project, and the final NKJP TEI differs slightly from the output of Anotatornia (see http://nlp.ipipan.waw.pl/Anotatornia/). To 'upgrade', use the following scripts:

The scripts were meant to be simple. modify-tei-morphosyntax.pl reports a fatal error by printing a line to the output file, which leaves the XML not well-formed. In all cases, validate the resulting files.
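
Because a fatal error shows up only as a stray line that breaks the XML, a quick well-formedness check over the converted files is a reasonable first step before full validation. The following is a minimal sketch, not part of the converters themselves: it uses Python's standard xml.etree.ElementTree parser, and the function names check_well_formed and failed_files are hypothetical. It tests well-formedness only; validation against the TEI4NKJP schema would still need a separate tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return (True, None) if the file parses as XML, else (False, parser message)."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # A stray error line written into the output ends up here.
        return False, str(err)
    return True, None


def failed_files(paths):
    """Return the subset of paths that are not well-formed XML, in input order."""
    failed = []
    for path in paths:
        ok, message = check_well_formed(path)
        if not ok:
            print(f"{path}: not well-formed ({message})")
            failed.append(path)
    return failed
```

Calling failed_files on the list of converted files returns the ones that need a closer look; an empty list means every file at least parsed, which is a precondition for schema validation rather than a substitute for it.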