opennlp.tools.dictionary
Class Dictionary

java.lang.Object
  extended by opennlp.tools.dictionary.Dictionary
All Implemented Interfaces:
Iterable<StringList>

public class Dictionary
extends Object
implements Iterable<StringList>

A dictionary of entries, where each entry is a StringList of one or more tokens.


Constructor Summary
Dictionary()
          Initializes an empty Dictionary.
Dictionary(boolean caseSensitive)
          Initializes an empty Dictionary with the given case sensitivity.
Dictionary(InputStream in)
          Initializes the Dictionary from an existing dictionary resource.
Dictionary(InputStream in, boolean caseSensitive)
          Deprecated. Passing the case sensitivity flag has no effect. Use Dictionary(InputStream) instead and set the case sensitivity when the dictionary is created.
 
Method Summary
 Set<String> asStringSet()
          Gets this dictionary as a Set<String>.
 boolean contains(StringList tokens)
          Checks if this dictionary has the given entry.
 boolean equals(Object obj)
           
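 int getMaxTokenCount()
          Returns the maximum token count in the dictionary.
 int getMinTokenCount()
          Returns the minimum token count in the dictionary.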
 int hashCode()
           
 Iterator<StringList> iterator()
          Retrieves an Iterator over all entries.
static Dictionary parseOneEntryPerLine(Reader in)
          Reads a dictionary which has one entry per line.
 void put(StringList tokens)
          Adds the tokens to the dictionary as one new entry.
 void remove(StringList tokens)
          Removes the given tokens from the current instance.
 void serialize(OutputStream out)
          Writes the current instance to the given OutputStream.
 int size()
          Retrieves the number of entries in the current instance.
 String toString()
           
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Dictionary

public Dictionary()
Initializes an empty Dictionary.


Dictionary

public Dictionary(boolean caseSensitive)
Initializes an empty Dictionary with the given case sensitivity.

Dictionary

public Dictionary(InputStream in)
           throws IOException,
                  InvalidFormatException
Initializes the Dictionary from an existing dictionary resource.

Parameters:
in - the InputStream of the dictionary resource
Throws:
IOException
InvalidFormatException

Dictionary

public Dictionary(InputStream in,
                  boolean caseSensitive)
           throws IOException,
                  InvalidFormatException
Deprecated. Passing the case sensitivity flag has no effect. Use Dictionary(InputStream) instead and set the case sensitivity when the dictionary is created.

Loads a Dictionary from an XML file.

Parameters:
in - the dictionary in its XML format
caseSensitive - has no effect
Throws:
IOException
InvalidFormatException
Method Detail

put

public void put(StringList tokens)
Adds the tokens to the dictionary as one new entry.

Parameters:
tokens - the new entry

getMinTokenCount

public int getMinTokenCount()
Returns:
minimum token count in the dictionary

getMaxTokenCount

public int getMaxTokenCount()
Returns:
maximum token count in the dictionary
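
One way to read the two token-count accessors is as the smallest and largest number of tokens found in any single entry. The following is a minimal stand-alone sketch of that reading in plain Java; it is not the library's implementation. The class TokenCountSketch and the method tokenCounts are hypothetical names, and entries are modelled as List<String> instead of StringList. What the real methods return for an empty dictionary is not stated here, so the sketch makes no claim about that case.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

// Hypothetical stand-alone model; it does not use OpenNLP's Dictionary class.
class TokenCountSketch {

    // Summarizes the number of tokens per entry (each entry is a list of tokens).
    static IntSummaryStatistics tokenCounts(List<List<String>> entries) {
        return entries.stream().mapToInt(List::size).summaryStatistics();
    }

    public static void main(String[] args) {
        List<List<String>> entries = Arrays.asList(
                Arrays.asList("Berlin"),
                Arrays.asList("New", "York"),
                Arrays.asList("Rio", "de", "Janeiro"));
        IntSummaryStatistics stats = tokenCounts(entries);
        // Shortest entry has 1 token, longest has 3.
        System.out.println(stats.getMin() + " " + stats.getMax()); // prints "1 3"
    }
}
```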

contains

public boolean contains(StringList tokens)
Checks if this dictionary has the given entry.

Parameters:
tokens - the entry to look for
Returns:
true if the dictionary contains the entry, otherwise false

remove

public void remove(StringList tokens)
Removes the given tokens from the current instance.

Parameters:
tokens - the entry to remove

iterator

public Iterator<StringList> iterator()
Retrieves an Iterator over all entries.

Specified by:
iterator in interface Iterable<StringList>
Returns:
an Iterator over the entries

size

public int size()
Retrieves the number of entries in the current instance.

Returns:
number of entries

serialize

public void serialize(OutputStream out)
               throws IOException
Writes the current instance to the given OutputStream.

Parameters:
out - the OutputStream to write to
Throws:
IOException

equals

public boolean equals(Object obj)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object

toString

public String toString()
Overrides:
toString in class Object

parseOneEntryPerLine

public static Dictionary parseOneEntryPerLine(Reader in)
                                       throws IOException
Reads a dictionary which has one entry per line. The tokens inside an entry are whitespace delimited.

Parameters:
in - the Reader to read the entries from
Returns:
the parsed dictionary
Throws:
IOException
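
The one-entry-per-line layout can be illustrated without the library. The following is a minimal stand-alone sketch in plain Java, not the implementation of parseOneEntryPerLine: the class OneEntryPerLineSketch and its parse method are hypothetical names, entries are modelled as List<String> instead of StringList, and skipping blank lines is an assumption of the sketch. Unlike the documented method, the sketch reports read failures as UncheckedIOException because it uses BufferedReader.lines().

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-alone model of the one-entry-per-line format.
// It does not use or reproduce OpenNLP's Dictionary class.
class OneEntryPerLineSketch {

    // Each non-blank line becomes one entry; tokens are split on whitespace.
    static List<List<String>> parse(Reader in) {
        return new BufferedReader(in).lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty()) // assumption: blank lines carry no entry
                .map(line -> Arrays.asList(line.split("\\s+")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> entries = parse(new StringReader("New York\nBerlin\n"));
        System.out.println(entries); // prints [[New, York], [Berlin]]
    }
}
```

With the real class, the same input would be passed to Dictionary.parseOneEntryPerLine(Reader), which returns a Dictionary rather than a list.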

asStringSet

public Set<String> asStringSet()
Gets this dictionary as a Set<String>. Only the iterator(), size() and contains(Object) methods are implemented. If an entry of this dictionary has multiple tokens, only the first token of that entry will be part of the Set.

Returns:
a Set containing the entries of this dictionary
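
The first-token rule for multi-token entries can be made concrete with a small stand-alone sketch in plain Java. It is only an illustration of the described behavior, not the library's code: FirstTokenSetSketch and firstTokens are hypothetical names, entries are modelled as List<String>, and the sketch builds an ordinary copied Set, whereas the documented Set supports only iterator(), size() and contains(Object).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone model; it does not use OpenNLP's Dictionary class.
class FirstTokenSetSketch {

    // Collects only the first token of every entry.
    static Set<String> firstTokens(List<List<String>> entries) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> entry : entries) {
            if (!entry.isEmpty()) {
                result.add(entry.get(0)); // later tokens of the entry are not represented
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> entries = Arrays.asList(
                Arrays.asList("New", "York"),
                Arrays.asList("Berlin"));
        Set<String> view = firstTokens(entries);
        System.out.println(view.contains("New"));  // prints true
        System.out.println(view.contains("York")); // prints false
    }
}
```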


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.