Package opennlp.tools.tokenize
Class ThreadSafeTokenizerME
java.lang.Object
opennlp.tools.tokenize.ThreadSafeTokenizerME
- All Implemented Interfaces:
AutoCloseable, Tokenizer
A thread-safe version of TokenizerME. Using it is completely transparent. You can also use it in
a single-threaded context, where it incurs only minimal overhead.
Note, however, that this implementation uses a ThreadLocal. Although the implementation is
lightweight because the model is not duplicated, you may run into memory problems if you have
many long-running threads.
Be careful when using this in a Jakarta EE application, for example.
The user is responsible for clearing the ThreadLocal.
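The per-thread behaviour described above can be illustrated with a minimal sketch. It does not use OpenNLP itself: SimpleTokenizer and ThreadLocalTokenizer are hypothetical stand-ins, assumed here only to show how a ThreadLocal can give each thread its own delegate and why each thread should clear its entry when it is done.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a stateful, non-thread-safe tokenizer.
class SimpleTokenizer {
    String[] tokenize(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}

// Hypothetical wrapper: one lazily created delegate per calling thread.
class ThreadLocalTokenizer implements AutoCloseable {
    private final AtomicInteger created = new AtomicInteger();
    private final ThreadLocal<SimpleTokenizer> delegate =
        ThreadLocal.withInitial(() -> {
            created.incrementAndGet();
            return new SimpleTokenizer();
        });

    String[] tokenize(String s) {
        return delegate.get().tokenize(s);
    }

    int delegatesCreated() {
        return created.get();
    }

    // Removes only the calling thread's delegate from the ThreadLocal.
    @Override
    public void close() {
        delegate.remove();
    }
}

class ThreadLocalTokenizerDemo {
    // Two threads share one wrapper; each gets its own delegate.
    static int delegatesCreatedByTwoThreads() {
        ThreadLocalTokenizer shared = new ThreadLocalTokenizer();
        Runnable task = () -> {
            try {
                shared.tokenize("a b c");
                shared.tokenize("d e"); // reuses this thread's delegate
            } finally {
                shared.close(); // clear this thread's entry
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.delegatesCreated();
    }

    public static void main(String[] args) {
        System.out.println("delegates created: " + delegatesCreatedByTwoThreads());
    }
}
```

In a thread pool, such as the ones an application server uses, threads typically outlive individual requests, so a value left in a ThreadLocal stays reachable for as long as the thread lives. Clearing it in a finally block, as in the sketch, is one way to avoid that.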
Constructor Details
ThreadSafeTokenizerME
Method Details
tokenize
Description copied from interface: Tokenizer
Splits a string into its atomic parts.
tokenizePos
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.
- Specified by:
tokenizePos in interface Tokenizer
- Parameters:
s - The string to be tokenized.
- Returns:
The spans (offsets into s) for each token as the individual array elements.
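To make "offsets into s" concrete, here is a minimal sketch that does not use OpenNLP. TokenSpan and whitespaceSpans are hypothetical names, and a plain whitespace split stands in for the model-based tokenizer. The sketch assumes half-open offsets (start inclusive, end exclusive), so each token can be recovered with substring.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical offset pair: start is inclusive, end is exclusive.
class TokenSpan {
    final int start;
    final int end;

    TokenSpan(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Recovers the token text from the original string.
    String coveredText(String s) {
        return s.substring(start, end);
    }
}

class SpanOffsetsDemo {
    // Returns one span per run of non-whitespace characters in s.
    static List<TokenSpan> whitespaceSpans(String s) {
        List<TokenSpan> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < s.length(); i++) {
            if (Character.isWhitespace(s.charAt(i))) {
                if (start >= 0) {
                    spans.add(new TokenSpan(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        if (start >= 0) {
            spans.add(new TokenSpan(start, s.length()));
        }
        return spans;
    }

    public static void main(String[] args) {
        String s = "Hello  world !";
        for (TokenSpan span : whitespaceSpans(s)) {
            System.out.println("[" + span.start + ", " + span.end + ") "
                + span.coveredText(s));
        }
    }
}
```

Returning offsets instead of strings keeps the link to the original text, which is useful when tokens must be mapped back to their positions, for example for highlighting or annotation.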
getProbabilities
public double[] getProbabilities()
close
public void close()
- Specified by:
close in interface AutoCloseable