Package opennlp.tools.dictionary
Class Dictionary
- java.lang.Object
  - opennlp.tools.dictionary.Dictionary
- All Implemented Interfaces:
Iterable<StringList>, SerializableArtifact
public class Dictionary extends Object implements Iterable<StringList>, SerializableArtifact
An iterable and serializable dictionary implementation.
- See Also:
SerializableArtifact, Iterable
Constructor Summary
Constructors
- Dictionary(): Initializes an empty Dictionary.
- Dictionary(boolean caseSensitive): Initializes an empty Dictionary.
- Dictionary(InputStream in): Initializes the Dictionary from an existing dictionary resource.
Method Summary
Methods (modifier and type, method: description)
- Set<String> asStringSet(): Converts this Dictionary to a Set.
- boolean contains(StringList tokens): Checks if this dictionary has the given entry.
- boolean equals(Object obj)
- Class<?> getArtifactSerializerClass(): Retrieves the class which can serialize and recreate this artifact.
- int getMaxTokenCount()
- int getMinTokenCount()
- int hashCode()
- Iterator<StringList> iterator()
- static Dictionary parseOneEntryPerLine(Reader in): Reads a Dictionary which has one entry per line.
- void put(StringList tokens): Adds the tokens to the dictionary as one new entry.
- void remove(StringList tokens): Removes the given tokens from the current instance.
- void serialize(OutputStream out): Writes the current instance to the given OutputStream.
- int size()
- String toString()
Methods inherited from interface java.lang.Iterable
forEach, spliterator
Constructor Detail
-
Dictionary
public Dictionary()
Initializes an empty Dictionary. By default, the resulting instance will not be case-sensitive.
-
Dictionary
public Dictionary(boolean caseSensitive)
Initializes an empty Dictionary.
- Parameters:
caseSensitive - Whether the new instance operates case-sensitively.
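The difference between the two constructors above can be illustrated with a small lookup. This is a minimal sketch, not taken from the OpenNLP sources; it assumes that `opennlp.tools.util.StringList` offers a varargs constructor and that the class name `CaseSensitivityExample` and its helper methods are purely illustrative.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class CaseSensitivityExample {

  // Adds a mixed-case entry and queries it with an all-lower-case token sequence.
  public static boolean lowerCaseQueryMatches(Dictionary dictionary) {
    dictionary.put(new StringList("New", "York"));
    return dictionary.contains(new StringList("new", "york"));
  }

  // No-argument constructor: documented as not case-sensitive by default.
  public static boolean defaultInstanceMatches() {
    return lowerCaseQueryMatches(new Dictionary());
  }

  // Explicitly case-sensitive instance.
  public static boolean caseSensitiveInstanceMatches() {
    return lowerCaseQueryMatches(new Dictionary(true));
  }

  public static void main(String[] args) {
    System.out.println("default: " + defaultInstanceMatches());
    System.out.println("case-sensitive: " + caseSensitiveInstanceMatches());
  }
}
```

With a case-insensitive instance the lower-case query should match the stored entry, whereas the case-sensitive instance should reject it.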
-
Dictionary
public Dictionary(InputStream in) throws IOException
Initializes the Dictionary from an existing dictionary resource.
- Parameters:
in - The InputStream that references the dictionary content.
- Throws:
IOException - Thrown if IO errors occurred.
-
-
Method Detail
-
put
public void put(StringList tokens)
Adds the tokens to the dictionary as one new entry.
- Parameters:
tokens - The new entry.
-
getMinTokenCount
public int getMinTokenCount()
-
getMaxTokenCount
public int getMaxTokenCount()
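This page gives no description for the two token-count accessors. The sketch below assumes that they report the smallest and the largest number of tokens among the entries added so far; the class name `TokenCountExample` and the sample entries are illustrative only.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class TokenCountExample {

  // Builds a dictionary whose entries have one, two and three tokens
  // and returns { minTokenCount, maxTokenCount }.
  public static int[] minAndMaxTokenCount() {
    Dictionary dictionary = new Dictionary();
    dictionary.put(new StringList("Berlin"));
    dictionary.put(new StringList("New", "York"));
    dictionary.put(new StringList("Rio", "de", "Janeiro"));
    return new int[] {dictionary.getMinTokenCount(), dictionary.getMaxTokenCount()};
  }

  public static void main(String[] args) {
    int[] counts = minAndMaxTokenCount();
    System.out.println("min=" + counts[0] + ", max=" + counts[1]);
  }
}
```

The values are only meaningful once at least one entry has been added, so the sketch does not query an empty instance.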
-
contains
public boolean contains(StringList tokens)
Checks if this dictionary has the given entry.
- Parameters:
tokens - The query of tokens to be checked for.
- Returns:
true if it contains the entry, false otherwise.
-
remove
public void remove(StringList tokens)
Removes the given tokens from the current instance.
- Parameters:
tokens - The tokens to be filtered out (= removed).
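The interplay of put, contains and remove can be sketched as follows. The example assumes that an entry is matched as a whole token sequence, so a query holding only part of an entry is not expected to match; `PutContainsRemoveExample` is a hypothetical class name.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class PutContainsRemoveExample {

  // Returns { found after put, partial query found, found after remove }.
  public static boolean[] run() {
    Dictionary dictionary = new Dictionary();
    StringList entry = new StringList("San", "Francisco");

    dictionary.put(entry);
    boolean foundAfterPut = dictionary.contains(entry);

    // A single token of a two-token entry is a different token sequence.
    boolean partialFound = dictionary.contains(new StringList("San"));

    dictionary.remove(entry);
    boolean foundAfterRemove = dictionary.contains(entry);

    return new boolean[] {foundAfterPut, partialFound, foundAfterRemove};
  }

  public static void main(String[] args) {
    boolean[] results = run();
    System.out.println("after put: " + results[0]);
    System.out.println("partial query: " + results[1]);
    System.out.println("after remove: " + results[2]);
  }
}
```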
-
iterator
public Iterator<StringList> iterator()
- Specified by:
iterator in interface Iterable<StringList>
- Returns:
A token Iterator over all elements.
-
size
public int size()
- Returns:
The number of entries in the current instance.
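Because Dictionary implements Iterable<StringList>, it can be used directly in a for-each loop. The following sketch assumes that size() counts stored entries (one StringList each) and that StringList.size() gives the number of tokens in a single entry; the class and method names are illustrative.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class IterationExample {

  private static Dictionary sample() {
    Dictionary dictionary = new Dictionary();
    dictionary.put(new StringList("Berlin"));
    dictionary.put(new StringList("New", "York"));
    return dictionary;
  }

  // Number of entries as reported by the dictionary itself.
  public static int entryCount() {
    return sample().size();
  }

  // Sums the tokens of every entry by iterating over the dictionary.
  public static int tokenCount() {
    int tokens = 0;
    for (StringList entry : sample()) {
      tokens += entry.size();
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println("entries=" + entryCount() + ", tokens=" + tokenCount());
  }
}
```

The iteration order is not specified on this page, so the sketch only aggregates over the entries and does not rely on their order.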
-
serialize
public void serialize(OutputStream out) throws IOException
Writes the current instance to the given OutputStream.
- Parameters:
out - A valid OutputStream, ready for serialization.
- Throws:
IOException - Thrown if IO errors occurred.
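A round trip through serialize(OutputStream) and the Dictionary(InputStream) constructor might look like the sketch below. It uses in-memory streams in place of files and wraps the checked IOException for brevity; the class name `SerializationRoundTripExample` and the helper `roundTrip` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class SerializationRoundTripExample {

  // Serializes the dictionary into a byte array and reads it back.
  public static Dictionary roundTrip(Dictionary original) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      original.serialize(out);
      return new Dictionary(new ByteArrayInputStream(out.toByteArray()));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Checks that the restored instance holds the same two entries.
  public static boolean roundTripPreservesEntries() {
    Dictionary original = new Dictionary();
    original.put(new StringList("Berlin"));
    original.put(new StringList("New", "York"));

    Dictionary restored = roundTrip(original);
    return restored.size() == 2
        && restored.contains(new StringList("Berlin"))
        && restored.contains(new StringList("New", "York"));
  }

  public static void main(String[] args) {
    System.out.println("round trip ok: " + roundTripPreservesEntries());
  }
}
```

In real code the streams would typically wrap files and be opened in a try-with-resources statement.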
-
parseOneEntryPerLine
public static Dictionary parseOneEntryPerLine(Reader in) throws IOException
Reads a Dictionary which has one entry per line. The tokens inside an entry are whitespace delimited.
- Parameters:
in - A Reader instance used to parse the dictionary from.
- Returns:
The parsed Dictionary instance; guaranteed to be non-null.
- Throws:
IOException - Thrown if IO errors occurred during read and parse operations.
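The one-entry-per-line format can be tried out with an in-memory Reader. In this sketch the sample text, the class name `ParseOneEntryPerLineExample` and its helpers are assumptions made for illustration; tokens are separated by single spaces.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class ParseOneEntryPerLineExample {

  // Three lines, hence three entries with one, two and three tokens.
  private static final String CONTENT = "Berlin\nNew York\nRio de Janeiro\n";

  public static Dictionary parse() {
    try (StringReader reader = new StringReader(CONTENT)) {
      return Dictionary.parseOneEntryPerLine(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static int parsedEntryCount() {
    return parse().size();
  }

  // The second line should be available as a single two-token entry.
  public static boolean containsNewYork() {
    return parse().contains(new StringList("New", "York"));
  }

  public static void main(String[] args) {
    System.out.println("entries=" + parsedEntryCount());
    System.out.println("contains 'New York': " + containsNewYork());
  }
}
```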
-
asStringSet
public Set<String> asStringSet()
Converts this Dictionary to a Set.
Note: Only the AbstractCollection.iterator(), AbstractCollection.size() and AbstractCollection.contains(Object) methods are implemented.
If the entries of this dictionary consist of multiple tokens, only the first token of each entry will be part of the Set.
- Returns:
A Set containing all entries of this Dictionary.
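The first-token behaviour of the Set view described above can be made visible by iterating over it. This is a minimal sketch; the class name `StringSetExample` is hypothetical, and the tokens are copied into a sorted list only because the iteration order of the view is not specified here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

public class StringSetExample {

  // Collects the strings exposed by the Set view in sorted order.
  public static List<String> firstTokens() {
    Dictionary dictionary = new Dictionary();
    dictionary.put(new StringList("Berlin"));
    dictionary.put(new StringList("New", "York"));

    Set<String> view = dictionary.asStringSet();

    List<String> tokens = new ArrayList<>();
    for (String token : view) {
      tokens.add(token);
    }
    Collections.sort(tokens);
    return tokens;
  }

  public static void main(String[] args) {
    // The two-token entry contributes only its first token.
    System.out.println(firstTokens());
  }
}
```

Since only iterator(), size() and contains(Object) are implemented, the view is best treated as read-only.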
-
getArtifactSerializerClass
public Class<?> getArtifactSerializerClass()
Description copied from interface: SerializableArtifact
Retrieves the class which can serialize and recreate this artifact.
Note: The serializer class must have a public zero argument constructor or an exception is thrown during model serialization/loading.
- Specified by:
getArtifactSerializerClass in interface SerializableArtifact
- Returns:
The serializer class for Dictionary.
- See Also:
DictionarySerializer