= Converters for NKJP formats =

This page (under construction as of the end of September 2011) collects converters from and to the [[http://nlp.ipipan.waw.pl/TEI4NKJP/|TEI4NKJP]] XML format, as used in the [[http://nkjp.pl/|National Corpus of Polish]].

== Converters from the output of Anotatornia to TEI NKJP ==

The format evolved during the project, and the final TEI4NKJP differs slightly from the output of [[http://nlp.ipipan.waw.pl/Anotatornia/|Anotatornia]]. To upgrade Anotatornia output to the final format, use the following scripts:

 * [[attachment:modify-tei-morphosyntax.pl|modify-tei-morphosyntax.pl]]
 * [[attachment:modify-tei-segmentation.pl|modify-tei-segmentation.pl]]
 * [[attachment:modify-tei-senses.pl|modify-tei-senses.pl]]

The scripts were meant to be simple. Fatal error reporting in modify-tei-morphosyntax.pl is straightforward: on a fatal error, a line is printed to the output file, which makes the XML not well-formed. In all cases, the resulting files should be [[http://nlp.ipipan.waw.pl/TEI4NKJP/|validated]].
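
Since a fatal error in modify-tei-morphosyntax.pl shows up as output that is no longer well-formed, a quick well-formedness check can flag failed conversions before full validation. The following is only a minimal sketch in Python using the standard library. The function names `is_well_formed` and `check_file` are hypothetical and not part of the scripts above, and the check does not replace validation against the TEI4NKJP schemas.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_data):
    """Return (True, None) if xml_data parses as XML, else (False, message).

    Hypothetical helper: it only tests well-formedness, not schema validity.
    """
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        # The parser message includes the line and column of the problem.
        return False, str(err)
    return True, None


def check_file(path):
    """Read a converted file as bytes and test it for well-formedness."""
    with open(path, "rb") as handle:
        return is_well_formed(handle.read())
```

A call such as `check_file("output.xml")` (the path is a placeholder) returns a pair: a boolean and, on failure, the parser's message, which can help locate the offending line in the converted file.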