morfologik.stemmers
Class Stempel

java.lang.Object
  extended by morfologik.stemmers.Stempel
All Implemented Interfaces:
IStemmer

public final class Stempel
extends java.lang.Object
implements IStemmer

A wrapper around Stempel, a heuristic stemmer by Andrzej Bialecki.


Field Summary
static java.lang.String PROPERTY_NAME_STEMPEL_TABLE
          Name of a system property pointing to a stempel dictionary (stemmer table).
 
Constructor Summary
Stempel()
          Instantiate Stempel with default dictionaries.
 
Method Summary
 java.lang.String[] stem(java.lang.String word)
          Returns an array of potential base forms (stems) of the word, or null if the word is not found in the dictionary.
 java.lang.String[] stemAndForm(java.lang.String word)
          Returns an array of pairs of the form: String stem1, String form1, String stem2, String form2, ...
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PROPERTY_NAME_STEMPEL_TABLE

public static final java.lang.String PROPERTY_NAME_STEMPEL_TABLE
Name of a system property pointing to a stempel dictionary (stemmer table). The property value can be a URL (in string form) or a resource path.

See Also:
Constant Field Values
Constructor Detail

Stempel

public Stempel()
        throws java.io.IOException

Instantiate Stempel with default dictionaries. The default dictionary path can be overridden using the system property named by PROPERTY_NAME_STEMPEL_TABLE.

Throws:
java.io.IOException
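
The override described above could be applied along the following lines. This is a minimal sketch, not part of morfologik: the PropertyOverride class is a hypothetical helper, the table location in the usage comment is made up, and it assumes the property is read when the constructor runs.

```java
/** Hypothetical helper (not part of morfologik) for a temporary system property override. */
final class PropertyOverride {
    private PropertyOverride() {}

    /** Sets the property and returns its previous value, or null if it was unset. */
    static String set(String name, String value) {
        return System.setProperty(name, value);
    }

    /** Restores the property to the value returned earlier by set(). */
    static void restore(String name, String previous) {
        if (previous == null) {
            System.clearProperty(name);
        } else {
            System.setProperty(name, previous);
        }
    }
}

// Intended use with the library on the classpath (the path is an assumption):
//
//   String previous = PropertyOverride.set(
//       Stempel.PROPERTY_NAME_STEMPEL_TABLE, "/hypothetical/stemmer.table");
//   try {
//       IStemmer stemmer = new Stempel(); // may throw java.io.IOException
//   } finally {
//       PropertyOverride.restore(Stempel.PROPERTY_NAME_STEMPEL_TABLE, previous);
//   }
```

Restoring the previous value keeps the override from leaking into other code that constructs a Stempel with the default dictionaries.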
Method Detail

stem

public java.lang.String[] stem(java.lang.String word)
Description copied from interface: IStemmer
Returns an array of potential base forms (stems) of the word, or null if the word is not found in the dictionary.

Specified by:
stem in interface IStemmer
See Also:
IStemmer.stem(String)
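
Because stem returns null rather than an empty array for unknown words, callers may want to normalize the result before iterating. The following is a small sketch of one way to do that; the Stems class is a hypothetical helper and not part of the library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of morfologik) for null-safe handling of stem() results. */
final class Stems {
    private Stems() {}

    /** Returns the stems as a list, or an empty list if the stemmer returned null. */
    static List<String> asList(String[] stems) {
        if (stems == null) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(stems);
    }
}

// Intended use with the library on the classpath:
//
//   for (String stem : Stems.asList(stemmer.stem(word))) { ... }
```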

stemAndForm

public java.lang.String[] stemAndForm(java.lang.String word)
Description copied from interface: IStemmer

Returns an array of pairs of the form:

 String stem1, String form1, String stem2, String form2, ...
 
or null if the word is not found in the dictionary.

The form tag is a simple string and depends on what was saved in the automaton (it may be nonsensical or even null).

Specified by:
stemAndForm in interface IStemmer
See Also:
IStemmer.stemAndForm(String)
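
The flat array returned by stemAndForm can be unpacked into explicit pairs. The sketch below assumes only the layout described above (stems at even indices, form tags at odd indices); the StemFormPairs class and its Pair type are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of morfologik) that unpacks stemAndForm() results. */
final class StemFormPairs {
    private StemFormPairs() {}

    /** One stem with its form tag; the tag may be null, as the documentation warns. */
    static final class Pair {
        final String stem;
        final String form;

        Pair(String stem, String form) {
            this.stem = stem;
            this.form = form;
        }
    }

    /** Converts [stem1, form1, stem2, form2, ...] into pairs; null input yields an empty list. */
    static List<Pair> parse(String[] flat) {
        List<Pair> pairs = new ArrayList<Pair>();
        if (flat == null) {
            return pairs; // word not found in the dictionary
        }
        if (flat.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of elements");
        }
        for (int i = 0; i < flat.length; i += 2) {
            pairs.add(new Pair(flat[i], flat[i + 1]));
        }
        return pairs;
    }
}

// Intended use with the library on the classpath:
//
//   for (StemFormPairs.Pair p : StemFormPairs.parse(stemmer.stemAndForm(word))) { ... }
```

Since the form tag may be null, code consuming a Pair should check form before calling methods on it.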