Class Dictionary

java.lang.Object
opennlp.tools.dictionary.Dictionary
All Implemented Interfaces:
Iterable<StringList>, SerializableArtifact

public class Dictionary extends Object implements Iterable<StringList>, SerializableArtifact
An iterable and serializable dictionary implementation.
  • Constructor Details

    • Dictionary

      public Dictionary()
      Initializes an empty Dictionary. By default, the resulting instance will not be case-sensitive.
    • Dictionary

      public Dictionary(boolean caseSensitive)
      Initializes an empty Dictionary.
      Parameters:
      caseSensitive - Whether the new instance is case-sensitive.
    • Dictionary

      public Dictionary(InputStream in) throws IOException
      Initializes the Dictionary from an existing dictionary resource.
      Parameters:
      in - The InputStream that references the dictionary content.
      Throws:
      IOException - Thrown if IO errors occurred.
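      The effect of the caseSensitive flag on lookups can be pictured with a small plain-Java model. This sketch does not call OpenNLP: the class CaseSensitivitySketch and its put/contains methods are hypothetical stand-ins, and lower-casing tokens with Locale.ROOT is an assumption chosen for illustration, not a statement about how Dictionary compares entries internally.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, not part of OpenNLP: models case-(in)sensitive entry lookup.
public class CaseSensitivitySketch {

  private final boolean caseSensitive;
  private final Set<List<String>> entries = new HashSet<>();

  public CaseSensitivitySketch(boolean caseSensitive) {
    this.caseSensitive = caseSensitive;
  }

  // Assumption: a case-insensitive instance compares lower-cased tokens.
  private List<String> normalize(String... tokens) {
    List<String> key = new ArrayList<>();
    for (String token : tokens) {
      key.add(caseSensitive ? token : token.toLowerCase(Locale.ROOT));
    }
    return key;
  }

  // Stores the tokens as one entry.
  public void put(String... tokens) {
    entries.add(normalize(tokens));
  }

  // Checks whether an equal entry (under the chosen case rule) was stored.
  public boolean contains(String... tokens) {
    return entries.contains(normalize(tokens));
  }

  public static void main(String[] args) {
    CaseSensitivitySketch insensitive = new CaseSensitivitySketch(false);
    insensitive.put("New", "York");
    System.out.println(insensitive.contains("new", "york")); // prints "true"

    CaseSensitivitySketch sensitive = new CaseSensitivitySketch(true);
    sensitive.put("New", "York");
    System.out.println(sensitive.contains("new", "york")); // prints "false"
  }
}
```

      In this model, the instance created with false behaves like the default described for Dictionary(), while true requires an exact-case match.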
  • Method Details

    • put

      public void put(StringList tokens)
      Adds the tokens to the dictionary as one new entry.
      Parameters:
      tokens - the new entry
    • getMinTokenCount

      public int getMinTokenCount()
    • getMaxTokenCount

      public int getMaxTokenCount()
    • contains

      public boolean contains(StringList tokens)
      Checks if this dictionary has the given entry.
      Parameters:
      tokens - The tokens to check for.
      Returns:
      true if it contains the entry, false otherwise.
    • remove

      public void remove(StringList tokens)
      Removes the given tokens from the current instance.
      Parameters:
      tokens - The tokens to be removed.
    • iterator

      public Iterator<StringList> iterator()
      Specified by:
      iterator in interface Iterable<StringList>
      Returns:
      A token Iterator over all elements.
    • size

      public int size()
      Returns:
      The number of entries in the current instance.
    • serialize

      public void serialize(OutputStream out) throws IOException
      Writes the current instance to the given OutputStream.
      Parameters:
      out - A valid OutputStream, ready for serialization.
      Throws:
      IOException - Thrown if IO errors occurred.
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • parseOneEntryPerLine

      public static Dictionary parseOneEntryPerLine(Reader in) throws IOException
      Reads a Dictionary which has one entry per line. The tokens inside an entry are whitespace delimited.
      Parameters:
      in - A Reader instance used to parse the dictionary from.
      Returns:
      The parsed Dictionary instance; guaranteed to be non-null.
      Throws:
      IOException - Thrown if IO errors occurred during read and parse operations.
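      The one-entry-per-line format described above can be pictured with a short plain-Java sketch. It does not call OpenNLP: the class OneEntryPerLineSketch and its parse method are hypothetical stand-ins, and both splitting on any run of whitespace and skipping blank lines are assumptions about how such input might reasonably be read.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of OpenNLP: models the documented input format only.
public class OneEntryPerLineSketch {

  // Each non-blank line becomes one entry; its tokens are split on whitespace (assumption).
  public static List<List<String>> parse(Reader in) {
    List<List<String>> entries = new ArrayList<>();
    new BufferedReader(in).lines().forEach(line -> {
      String trimmed = line.trim();
      if (!trimmed.isEmpty()) { // assumption: blank lines carry no entry
        entries.add(Arrays.asList(trimmed.split("\\s+")));
      }
    });
    return entries;
  }

  public static void main(String[] args) {
    // Three lines, so three entries with two, one and three tokens respectively.
    Reader input = new StringReader("New York\nBerlin\nRio de Janeiro\n");
    for (List<String> entry : parse(input)) {
      System.out.println(entry.size() + " token(s): " + entry);
    }
  }
}
```

      With the real API, the same text would be passed to Dictionary.parseOneEntryPerLine through a Reader, and each line would end up as one StringList entry.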
    • asStringSet

      public Set<String> asStringSet()
      Converts this Dictionary to a Set<String>.

      Note: Only the AbstractCollection.iterator(), AbstractCollection.size() and AbstractCollection.contains(Object) methods are implemented.

      If the entries of this dictionary consist of multiple tokens, only the first token of each entry will be part of the Set.

      Returns:
      A Set containing all entries of this Dictionary.
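      The first-token rule is easy to overlook, so here is a plain-Java model of it. The sketch does not use OpenNLP: FirstTokenSetSketch and firstTokens are hypothetical names, and entries are represented as lists of strings purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, not part of OpenNLP: models the "first token only" projection.
public class FirstTokenSetSketch {

  // Keeps only the first token of every entry, in insertion order.
  public static Set<String> firstTokens(List<List<String>> entries) {
    Set<String> result = new LinkedHashSet<>();
    for (List<String> entry : entries) {
      if (!entry.isEmpty()) {
        result.add(entry.get(0));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> entries = Arrays.asList(
        Arrays.asList("New", "York"),
        Arrays.asList("Berlin"),
        Arrays.asList("Rio", "de", "Janeiro"));
    System.out.println(firstTokens(entries)); // prints "[New, Berlin, Rio]"
  }
}
```

      The model shows why a Set<String> view suits dictionaries of single-token entries best: for multi-token entries, the trailing tokens ("York", "de", "Janeiro") are not visible through the set.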
    • getArtifactSerializerClass

      public Class<?> getArtifactSerializerClass()
      Description copied from interface: SerializableArtifact
      Retrieves the class which can serialize and recreate this artifact.

      Note: The serializer class must have a public zero argument constructor or an exception is thrown during model serialization/loading.

      Specified by:
      getArtifactSerializerClass in interface SerializableArtifact
      Returns:
      The serializer class for Dictionary.