Package opennlp.tools.tokenize
Class TokenSampleStream
- All Implemented Interfaces:
AutoCloseable, ObjectStream<TokenSample>
This class is a stream filter which reads in string-encoded samples and creates TokenSample objects out of them.
The input string is split into tokens wherever whitespace or the special separator sequence occurs.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4 are separated by the special character sequence.
In this case, the default split sequence applies.
Note: The sequence must be unique in the input string and is not escaped.
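The splitting rule described above can be illustrated with a small standalone sketch. It does not use or reproduce the OpenNLP implementation; the class name SplitSketch and the method tokens are hypothetical, and the separator is passed in explicitly (the value "<SPLIT>" is taken from the sample above).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class SplitSketch {

    // Separator used in the documented sample; hard-coded here so the sketch stands alone.
    static final String DEFAULT_SEPARATOR = "<SPLIT>";

    // Splits a sample on whitespace first, then on every literal occurrence of the separator.
    static List<String> tokens(String sample, String separator) {
        Objects.requireNonNull(sample, "sample must not be null");
        Objects.requireNonNull(separator, "separator must not be null");
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> result = new ArrayList<>();
        for (String chunk : sample.trim().split("\\s+")) {
            int start = 0;
            int idx;
            // The separator is matched literally and is not escaped.
            while ((idx = chunk.indexOf(separator, start)) >= 0) {
                if (idx > start) {
                    result.add(chunk.substring(start, idx));
                }
                start = idx + separator.length();
            }
            if (start < chunk.length()) {
                result.add(chunk.substring(start));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [token1, token2, token3, token4]
        System.out.println(tokens("token1 token2 token3<SPLIT>token4", DEFAULT_SEPARATOR));
    }
}
```

Because the separator is matched literally, any text that happens to contain the sequence is split there as well, which is why the sequence has to be unique in the input.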
-
Constructor Summary
Constructors (Constructor: Description)
TokenSampleStream(ObjectStream<String> sentences): Initializes an instance.
TokenSampleStream(ObjectStream<String> samples, String separatorChars): Initializes an instance.
-
Method Summary
Methods inherited from class opennlp.tools.util.FilterObjectStream
close, reset
-
Constructor Details
-
TokenSampleStream
Initializes an instance.
- Parameters:
samples - A plain text line stream. Must not be null.
separatorChars - The characters to be considered separators. See TokenSample.DEFAULT_SEPARATOR_CHARS. Must not be null.
-
TokenSampleStream
Initializes an instance.
- Parameters:
sentences - A plain text line stream. Must not be null.
-
-
Method Details
-
read
Description copied from interface: ObjectStream
Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
- Returns:
The next object or null to signal that the stream is exhausted.
- Throws:
IOException - Thrown if there is an error during reading.
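The read contract above (call read repeatedly until it returns null) can be sketched as a simple loop. To keep the example runnable without the library on the classpath, it uses a hypothetical stand-in interface, SimpleStream, that models only read() and close(); the names ReadLoopSketch, over, drain and demo are likewise assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    // Hypothetical stand-in for ObjectStream<T>; only read() and close() are modeled.
    interface SimpleStream<T> extends AutoCloseable {
        T read() throws IOException;

        @Override
        void close() throws IOException;
    }

    // Wraps an in-memory list; read() returns null once the list is exhausted.
    static <T> SimpleStream<T> over(List<T> items) {
        Iterator<T> it = items.iterator();
        return new SimpleStream<T>() {
            @Override
            public T read() {
                return it.hasNext() ? it.next() : null;
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory source.
            }
        };
    }

    // Reads until null, so each object from the source is collected exactly once.
    static <T> List<T> drain(SimpleStream<T> stream) throws IOException {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            out.add(item);
        }
        return out;
    }

    // Convenience wrapper that closes the stream and converts IOException to an unchecked one.
    static List<String> demo() {
        try (SimpleStream<String> stream = over(List.of("a b", "c<SPLIT>d"))) {
            return drain(stream);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [a b, c<SPLIT>d]
        System.out.println(demo());
    }
}
```

The try-with-resources block mirrors the fact that the documented class implements AutoCloseable, so a stream opened this way is closed even if reading fails.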
-