Class TokenSampleStream
java.lang.Object
opennlp.tools.util.FilterObjectStream<String, opennlp.tools.tokenize.TokenSample>
opennlp.tools.tokenize.TokenSampleStream
- All Implemented Interfaces:
AutoCloseable, opennlp.tools.util.ObjectStream<opennlp.tools.tokenize.TokenSample>
public class TokenSampleStream
extends FilterObjectStream<String, opennlp.tools.tokenize.TokenSample>
This class is a stream filter which reads in string encoded
samples and creates TokenSample objects out of them.
The input string sample is split into tokens wherever whitespace or the special separator character sequence occurs.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace;
token3 and token4 are separated by the special character sequence.
In this case, the default split sequence (<SPLIT>) applies.
Note: The sequence is not escaped, so it must not occur anywhere else in the input string.
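The splitting rule described above can be illustrated with a small standalone sketch. It does not use OpenNLP and is not the library's implementation: the class name SplitSketch, the method tokenize, and the constant DEFAULT_SEPARATOR are hypothetical names introduced here, and the value "<SPLIT>" is taken from the sample above. The sketch only returns the token strings, whereas the real class produces TokenSample objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the splitting rule; not part of OpenNLP.
public class SplitSketch {

    // Assumed default split sequence, as shown in the sample above.
    static final String DEFAULT_SEPARATOR = "<SPLIT>";

    // Splits a sample on whitespace first, then on the separator sequence
    // inside each whitespace-delimited chunk. The separator itself is dropped.
    static List<String> tokenize(String sample, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> tokens = new ArrayList<>();
        for (String chunk : sample.trim().split("\\s+")) {
            int start = 0;
            int idx;
            while ((idx = chunk.indexOf(separator, start)) >= 0) {
                if (idx > start) {
                    tokens.add(chunk.substring(start, idx));
                }
                start = idx + separator.length();
            }
            if (start < chunk.length()) {
                tokens.add(chunk.substring(start));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [token1, token2, token3, token4]
        System.out.println(tokenize("token1 token2 token3<SPLIT>token4", DEFAULT_SEPARATOR));
    }
}
```

Because the separator is matched literally and never escaped, any accidental occurrence of it in the text would produce an extra token boundary, which is why the note above requires it to appear only where a split is intended.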
-
Constructor Summary
Constructors
Constructor: TokenSampleStream(opennlp.tools.util.ObjectStream<String> sentences)
Description: Initializes a TokenSampleStream instance.
Constructor: TokenSampleStream(opennlp.tools.util.ObjectStream<String> samples, String separatorChars)
Description: Initializes a TokenSampleStream instance.
-
Method Summary
Methods inherited from class FilterObjectStream
close, reset
-
Constructor Details
-
TokenSampleStream
Initializes a TokenSampleStream instance.
- Parameters:
samples - A plain text line stream. Must not be null.
separatorChars - The characters to be considered separators. See TokenSample.DEFAULT_SEPARATOR_CHARS. Must not be null.
-
TokenSampleStream
-
Method Details
-
read
- Throws:
IOException
-