|
||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectmorfologik.fsa.core.FSA
morfologik.fsa.core.FSAVer5Impl
public final class FSAVer5Impl
FSA (Finite State Automaton) dictionary traversal implementation for version 5 of the FSA automaton.
Version 5 indicates the dictionary was built with these flags:
FSA.FSA_FLEXIBLE
, FSA.FSA_STOPBIT
and FSA.FSA_NEXTBIT
.
The internal representation
of the FSA must therefore follow this description (please note this
format describes only a single transition (arc), not the entire dictionary file).
Byte +-+-+-+-+-+-+-+-+\ 0 | | | | | | | | | +------ label +-+-+-+-+-+-+-+-+/ +------------- node pointed to is next | +----------- the last arc of the node | | +--------- the arc is final | | | +-----------+ | | | | | ___+___ | | | | / \ | | | | MSB LSB | 7 6 5 4 3 2 1 0 | +-+-+-+-+-+-+-+-+ | 1 | | | | | | | | | \ \ +-+-+-+-+-+-+-+-+ \ \ LSB +-+-+-+-+-+-+-+-+ + 2 | | | | | | | | | | +-+-+-+-+-+-+-+-+ | 3 | | | | | | | | | +----- target node address (in bytes) +-+-+-+-+-+-+-+-+ | (not present except for the byte : : : : : : : : : | with flags if the node pointed to +-+-+-+-+-+-+-+-+ + is next) gtl | | | | | | | | | / MSB +-+-+-+-+-+-+-+-+ / gtl+1 (gtl = gotoLength)
The traversal of such FSA could be made extremely fast using pointers only (or integer indices over a byte array in case of Java). However, for the sake of clarity, this class uses explicit Node/ Arc objects.
Field Summary | |
---|---|
protected byte[] |
arcs
An array of bytes with the internal representation of the automaton. |
protected int |
arcSize
Size of a single arc (in bytes). |
protected static int |
gotoOffset
An offset in the arc structure, where the address field begins. |
Fields inherited from class morfologik.fsa.core.FSA |
---|
filler, FSA_FLEXIBLE, FSA_LARGE_DICTIONARIES, FSA_NEXTBIT, FSA_STOPBIT, FSA_TAILS, FSA_WEIGHTED, gotoLength, version, VERSION_5 |
Constructor Summary | |
---|---|
FSAVer5Impl(java.io.InputStream fsaStream,
java.lang.String dictionaryEncoding)
Creates a new automaton reading it from a file in FSA format, version 5. |
Method Summary | |
---|---|
int |
getNumberOfArcs()
Returns the number of arcs in this automaton. |
int |
getNumberOfNodes()
Returns the number of nodes in this automaton. |
FSA.Node |
getStartNode()
Returns the start node of this automaton. |
protected int |
gotoFieldToOffset(int start,
int n)
Returns an integer offset from bitpacked representation. |
protected void |
readHeader(java.io.DataInput in,
long fileSize)
Reads a FSA header from a stream. |
Methods inherited from class morfologik.fsa.core.FSA |
---|
getAnnotationSeparator, getFillerCharacter, getFlags, getInstance, getInstance, getTraversalHelper, getVersion, readFully |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected int arcSize
protected static final int gotoOffset
protected byte[] arcs
Constructor Detail |
---|
public FSAVer5Impl(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding) throws java.io.IOException
java.io.IOException
Method Detail |
---|
public int getNumberOfArcs()
getNumberOfArcs
in class FSA
public int getNumberOfNodes()
getNumberOfNodes
in class FSA
public FSA.Node getStartNode()
null
if the start node
is also an end node.
getStartNode
in class FSA
protected void readHeader(java.io.DataInput in, long fileSize) throws java.io.IOException
readHeader
in class FSA
java.io.IOException
- If the stream is not a dictionary,
or if the version is not supported.protected final int gotoFieldToOffset(int start, int n)
|
||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |