Spejd 0.8.4

Copyright (C) IPI PAN, 2007-2010. All rights reserved.
Available under the terms of the GNU General Public License;
see the file doc/gpl.txt for details.

ABOUT

Spejd is a shallow parser that performs syntactic parsing and
morphological disambiguation simultaneously. It was developed at the
Institute of Computer Science, Polish Academy of Sciences, Warsaw.

Spejd homepage:
http://nlp.ipipan.waw.pl/Spejd/

Last releases:
0.8.4: bugfix release
0.8.3: bugfix release
0.8.2: bugfix release
0.8.1:

Compared to the previous release, major changes in this version include:
- An integrated plain text processing module based on the morphological
  analyzer Morfologik (http://morfologik.blogspot.com/). This module requires
  appropriately encoded input, as defined by the inputEncoding config parameter.
  The plain text module is enabled by the inputType parameter (auto or txt).
- Parallel processing (the benefits are immediate on multicore CPUs).
  The number of processing threads is defined by the maxThreads parameter.
- A simple spelling correction module that addresses missing Polish
  diacritics. Possible transformations are listed in ogonkifier.ini.
- Changes listed in doc/changes0_5.txt.
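As an illustration of the new parameters, the sketch below assembles a
command line that enables plain text mode, names the input encoding and
sets the thread count, using the var=value overrides described under
SYNOPSIS. The values UTF-8 and 4, the directory name corpus and the helper
name spejd_txt_cmd are assumptions made for this example, not defaults
taken from config.ini. The helper only prints the command, so it can be
inspected before being run.

```shell
# Print a Spejd command line for plain text input (nothing is executed).
#   $1  file or directory to parse
#   $2  input encoding (assumed example value: UTF-8)
#   $3  number of processing threads (assumed example value: 4)
spejd_txt_cmd() {
    printf 'java -jar spejd.jar %s inputType=txt inputEncoding=%s maxThreads=%s\n' \
        "$1" "${2:-UTF-8}" "${3:-4}"
}

spejd_txt_cmd corpus
```

According to the parameter description above, inputType=auto is the other
value that enables the plain text module.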

REQUIREMENTS

Sun Java Runtime Environment version 1.5 or higher.

Notice: it may be possible to run the program on an alternative Java
implementation, but because of differences in regular expression
implementations, we cannot guarantee its behaviour.
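A quick way to check the version requirement from a shell is sketched
below. It assumes that java -version prints a line containing
version "X.Y..." in double quotes, which common JREs do; the function name
java_version_ok is made up for this example. The check looks only at the
version number and says nothing about which Java implementation is
installed.

```shell
# Succeed if a Java version string such as "1.5.0_22" or "17.0.2"
# denotes version 1.5 or higher.
java_version_ok() {
    major=${1%%.*}              # text before the first dot
    major=${major%%[!0-9]*}     # keep leading digits only
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "${major:-0}" -gt 1 ] && return 0
    [ "${major:-0}" -eq 1 ] && [ "${minor:-0}" -ge 5 ]
}

# java -version writes to stderr, hence the redirection.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if java_version_ok "$version"; then
        echo "Java $version: OK"
    else
        echo "Java $version: 1.5 or higher is required"
    fi
else
    echo "java not found on PATH"
fi
```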

INSTALLATION

Unzip the file spade.zip.  Installation finished!

SYNOPSIS

java -jar spejd.jar path [options]

where:

- path - a single file or a folder with XML CES (see doc/xcesIPIAna.dtd)
    or plain text files (.txt, encoding defined by the inputEncoding parameter)
    to parse; the parser looks for files matching a pattern defined in
    config.ini (inputFiles parameter) and recursively checks subdirectories.

- options - an optional list of assignments var=value; var has to be one
    of the variables from config.ini; values passed as invocation
    arguments override the default values from the file.

Examples:

java -jar spejd.jar corpus nullAgreement=1
java -jar spejd.jar corpus rules=rules2.sr logDir=log2
java -jar spejd.jar corpus discardDeleted=true outputSuffix=.sh2.xml

RESULTS

In the case of XML input, for each directory in which filename.xml(.gz)
has been found, a new filenameSh.xml is created.  It is a copy of the
corresponding .xml, but with additional annotation: token
identifiers, disambiguation attributes, syntactic words and groups.
In the case of plain text input filename.txt, a new XML file
(file name ends with Sh.xml) is created for each corresponding .txt file.
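The naming scheme above can be sketched as a small shell function that
predicts the output file for a given input. It assumes that the suffix
replaces the .xml, .xml.gz or .txt extension and that the outputSuffix
variable seen in the examples substitutes for the default Sh.xml; neither
detail is spelled out in this README, and the function name
spejd_output_name is made up for illustration.

```shell
# Print the output file name expected for an input file.
#   $1  input file (.xml, .xml.gz or .txt)
#   $2  output suffix (assumed default: Sh.xml)
spejd_output_name() {
    name=${1%.gz}       # drop a trailing .gz, if present
    name=${name%.xml}   # drop the .xml extension, if present
    name=${name%.txt}   # drop the .txt extension, if present
    printf '%s%s\n' "$name" "${2:-Sh.xml}"
}

spejd_output_name doc/morph.xml
```

For doc/morph.xml this prints doc/morphSh.xml, which matches the example
files listed under EXAMPLE.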

A few additional files are generated in the logs subdirectory of the spade
directory:

rules.compiled - a compiled set of rules

rules.matched.csv - rule statistics: for each rule, gives the number
    of completed (evaluated to true) matches, the number of matches,
    matching time, evaluation time, total time

tagdict.ini - tag dictionary, translating the tagset defined in
    the configuration file to the internal positional tagset

DOCUMENTATION

doc/spade.pdf      - a paper about Spejd
doc/xcesAnaIPI.dtd - DTD of the input format
api/               - technical documentation

EXAMPLE

./sample-morfeusz.cfg      - example Morfeusz tagset file
./sample-morfologik.cfg    - example Morfologik tagset file (for plain text input)
./rules.sr                 - example set of rules
doc/morph.xml              - example XML input to the parser
doc/morphSh.xml            - example output
doc/display.*              - stylesheets and example output

WHAT'S NEW IN THIS VERSION

FOR DEVELOPERS

Please feel free to play around with the sources, modify them and post
patches on Spejd's bugtracker at SourceForge (linked from the homepage)!
See api/ for a brief introduction to the code structure.
