opennlp.tools.tokenize
Class TokenSampleStream

java.lang.Object
  extended by opennlp.tools.util.FilterObjectStream<String,TokenSample>
      extended by opennlp.tools.tokenize.TokenSampleStream
All Implemented Interfaces:
ObjectStream<TokenSample>

public class TokenSampleStream
extends FilterObjectStream<String,TokenSample>

This class is a stream filter that reads string-encoded samples and creates TokenSample objects from them. Each input string is split into tokens wherever whitespace or the special separator character sequence occurs.

Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4 are separated by the special character sequence, in this case the default split sequence.

The separator sequence cannot be escaped, so it must not occur anywhere in the input string except where it marks a token boundary.
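The splitting rule described above can be illustrated with a small standalone sketch. It does not use OpenNLP and is not the library's implementation: the class name SplitSketch and its methods are hypothetical, and the default separator is assumed to be `<SPLIT>`. Passing a different separator mimics the two-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {
    // Assumption: the default split sequence is "<SPLIT>".
    static final String DEFAULT_SEPARATOR = "<SPLIT>";

    /** Splits a sample string on whitespace and on the separator sequence. */
    static List<String> tokens(String sample, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> result = new ArrayList<>();
        // First split on runs of whitespace, then split each chunk on the separator.
        for (String chunk : sample.trim().split("\\s+")) {
            int start = 0;
            int hit;
            while ((hit = chunk.indexOf(separator, start)) >= 0) {
                if (hit > start) {
                    result.add(chunk.substring(start, hit));
                }
                start = hit + separator.length();
            }
            if (start < chunk.length()) {
                result.add(chunk.substring(start));
            }
        }
        return result;
    }

    /** Rebuilds the plain text by dropping the separator markers. */
    static String text(String sample, String separator) {
        return sample.replace(separator, "");
    }

    public static void main(String[] args) {
        String sample = "token1 token2 token3<SPLIT>token4";
        System.out.println(tokens(sample, DEFAULT_SEPARATOR));
        System.out.println(text(sample, DEFAULT_SEPARATOR));
    }
}
```

Because the separator is matched literally and never escaped, any occurrence of it in the sample is treated as a token boundary, which is why it must not appear in the actual text.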


Constructor Summary
TokenSampleStream(ObjectStream<String> sentences)
           
TokenSampleStream(ObjectStream<String> sampleStrings, String separatorChars)
           
 
Method Summary
 TokenSample read()
          Returns the next object.
 
Methods inherited from class opennlp.tools.util.FilterObjectStream
close, reset
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenSampleStream

public TokenSampleStream(ObjectStream<String> sampleStrings,
                         String separatorChars)

TokenSampleStream

public TokenSampleStream(ObjectStream<String> sentences)

Method Detail

read

public TokenSample read()
                 throws IOException
Description copied from interface: ObjectStream
Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.

Returns:
the next object or null to signal that the stream is exhausted
Throws:
IOException
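The read contract above (call read() until it returns null) can be sketched without the library. The Source interface below is a hypothetical stand-in for ObjectStream, and DrainSketch, of, and drain are illustrative names only; with OpenNLP itself, the same loop would presumably be written against a TokenSampleStream.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DrainSketch {
    /** Hypothetical stand-in for ObjectStream<T>: read() returns null when exhausted. */
    interface Source<T> {
        T read() throws IOException;
    }

    /** Wraps an in-memory list so the sketch runs without any library. */
    static <T> Source<T> of(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Calls read() until it returns null, collecting each object exactly once. */
    static <T> List<T> drain(Source<T> source) throws IOException {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = source.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Source<String> samples = of(Arrays.asList("first sample", "second sample"));
        System.out.println(drain(samples));
        // Once exhausted, further reads keep returning null in this sketch.
        System.out.println(samples.read());
    }
}
```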


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.