Class EmptyLinePreprocessorStream

java.lang.Object
opennlp.tools.util.FilterObjectStream<String,String>
opennlp.tools.sentdetect.EmptyLinePreprocessorStream
All Implemented Interfaces:
AutoCloseable, ObjectStream<String>

@Internal public class EmptyLinePreprocessorStream extends FilterObjectStream<String,String>
An ObjectStream that cleans up empty lines in document streams whose documents are separated by empty lines.
- Skips empty lines at the start of the training data
- Collapses consecutive empty lines into a single empty line
- Replaces whitespace-only lines with empty lines
- TODO: Terminate the last document with an empty line if it is missing

Components that use empty lines to mark document boundaries should use this stream.
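
The three cleanup rules listed above can be illustrated with a minimal, self-contained sketch. It is not the OpenNLP source: the class name EmptyLineCleanupSketch and the method normalize are hypothetical, the sketch works on an in-memory list rather than an ObjectStream, and it assumes that a "white space line" is any line that is empty after trimming. The TODO item (terminating the last document) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the cleanup rules; not the OpenNLP implementation.
class EmptyLineCleanupSketch {

    // Returns a copy of the input with leading blank lines dropped,
    // runs of blank lines collapsed to one, and whitespace-only lines
    // replaced by empty strings.
    static List<String> normalize(List<String> lines) {
        List<String> result = new ArrayList<>();
        // Starting as "true" makes blank lines at the very beginning get skipped.
        boolean lastLineWasEmpty = true;
        for (String line : lines) {
            // Assumption: a line counts as blank if nothing is left after trimming.
            boolean blank = line.trim().isEmpty();
            if (blank) {
                if (!lastLineWasEmpty) {
                    result.add(""); // whitespace-only lines become empty lines
                    lastLineWasEmpty = true;
                }
                // Otherwise this is a repeated or leading blank line: skip it.
            } else {
                result.add(line);
                lastLineWasEmpty = false;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("", "  ", "First document.", "", "\t", "", "Second document.");
        for (String line : normalize(raw)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

In this sketch, the two leading blank lines are dropped and the three blank lines between the documents collapse into one empty separator line.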

Note: This class is not thread safe.

Note: Do not use this class; it is for internal use only.

  • Constructor Details

    • EmptyLinePreprocessorStream

      public EmptyLinePreprocessorStream(ObjectStream<String> in)
  • Method Details

    • read

      public String read() throws IOException
      Description copied from interface: ObjectStream
      Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
      Returns:
      The next object or null to signal that the stream is exhausted.
      Throws:
      IOException - Thrown if there is an error during reading.
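
The read-until-null contract described above can be sketched as a simple drain loop. To keep the example self-contained, it uses a hypothetical stand-in interface, LineSource, that mirrors only the read() signature shown here; it is not the OpenNLP ObjectStream interface, and ReadLoopSketch and drain are illustrative names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReadLoopSketch {

    // Hypothetical stand-in for the read() contract; not the OpenNLP interface.
    interface LineSource {
        String read() throws IOException;
    }

    // Calls read() repeatedly until it returns null, collecting each object once.
    static List<String> drain(LineSource source) throws IOException {
        List<String> items = new ArrayList<>();
        String item;
        while ((item = source.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        Iterator<String> lines = List.of("first", "", "second").iterator();
        // A null return signals that the source is exhausted.
        LineSource source = () -> lines.hasNext() ? lines.next() : null;
        System.out.println(drain(source).size());
    }
}
```

Note that an empty string is a valid returned object (a document boundary in this context); only null ends the loop.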