Package opennlp.tools.tokenize
Class TokenSampleStream
- java.lang.Object
  - opennlp.tools.util.FilterObjectStream<String,TokenSample>
    - opennlp.tools.tokenize.TokenSampleStream
- All Implemented Interfaces: AutoCloseable, ObjectStream<TokenSample>
public class TokenSampleStream extends FilterObjectStream<String,TokenSample>
This class is a stream filter which reads in string-encoded samples and creates TokenSample objects out of them. The input string is split into tokens wherever whitespace or the special separator characters occur.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4 are separated by the special character sequence. In this case, the default split sequence applies.
Note: The sequence must be unique in the input string and is not escaped.
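To make the encoding concrete, the following is a minimal, self-contained sketch of how one sample line could be broken into tokens on whitespace and on the split sequence. It is plain Java written for illustration only, not OpenNLP's own parsing code; the class name SplitFormatSketch and the method tokens are hypothetical, and the "<SPLIT>" value is taken from the sample line above.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitFormatSketch {

    /** Split sequence as it appears in the sample line of the class description. */
    public static final String DEFAULT_SEPARATOR = "<SPLIT>";

    /**
     * Illustrative only: breaks one encoded sample line into tokens,
     * first on whitespace and then on the separator sequence.
     */
    public static List<String> tokens(String line, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> result = new ArrayList<>();
        for (String chunk : line.trim().split("\\s+")) {
            int start = 0;
            int hit;
            // Cut the chunk at every occurrence of the separator.
            while ((hit = chunk.indexOf(separator, start)) >= 0) {
                if (hit > start) {
                    result.add(chunk.substring(start, hit));
                }
                start = hit + separator.length();
            }
            // Whatever follows the last separator is the final token of the chunk.
            if (start < chunk.length()) {
                result.add(chunk.substring(start));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [token1, token2, token3, token4]
        System.out.println(tokens("token1 token2 token3<SPLIT>token4", DEFAULT_SEPARATOR));
    }
}
```

Because the sequence is not escaped, any literal occurrence of it in the text would be treated as a token boundary, which is why it must not otherwise appear in the input.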
Constructor Summary
Constructors
Constructor - Description
TokenSampleStream(ObjectStream<String> sentences) - Initializes a TokenSampleStream instance.
TokenSampleStream(ObjectStream<String> samples, String separatorChars) - Initializes a TokenSampleStream instance.
Method Summary
Modifier and Type - Method - Description
TokenSample - read() - Returns the next object.
Methods inherited from class opennlp.tools.util.FilterObjectStream: close, reset
Constructor Detail
TokenSampleStream
public TokenSampleStream(ObjectStream<String> samples, String separatorChars)
Initializes a TokenSampleStream instance.
Parameters:
samples - A plain text line stream. Must not be null.
separatorChars - The characters to be considered separators. See TokenSample.DEFAULT_SEPARATOR_CHARS. Must not be null.
TokenSampleStream
public TokenSampleStream(ObjectStream<String> sentences)
Initializes a TokenSampleStream instance.
Parameters:
sentences - A plain text line stream. Must not be null.
Method Detail
read
public TokenSample read() throws IOException
Description copied from interface: ObjectStream
Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
Returns:
The next object, or null to signal that the stream is exhausted.
Throws:
IOException - Thrown if there is an error during reading.
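The read-until-null contract described for read() can be sketched as follows. To keep the example self-contained, it uses a hypothetical stand-in interface named Source rather than OpenNLP's ObjectStream; the names ReadLoopSketch, fromList, and drain are likewise assumptions made for illustration. The loop in drain should have the same shape when written against a real TokenSampleStream.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    /** Hypothetical stand-in mirroring the read() contract; not an OpenNLP type. */
    public interface Source<T> {
        T read() throws IOException;
    }

    /** Wraps a list so that read() yields each element once and then null. */
    public static <T> Source<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Calls read() until it returns null and collects every object returned. */
    public static <T> List<T> drain(Source<T> source) throws IOException {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = source.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Source<String> source = fromList(List.of("a b<SPLIT>c", "d e"));
        System.out.println(drain(source)); // each line exactly once
        System.out.println(source.read()); // null: the stream is exhausted
    }
}
```

A null return is the only end-of-stream signal in this sketch, so the caller checks the result of every read() call before using it.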