Class DocumentSampleStream

  • All Implemented Interfaces:
    AutoCloseable, ObjectStream<DocumentSample>

    public class DocumentSampleStream
    extends FilterObjectStream<String,DocumentSample>
    Reads string-encoded training samples, parses them, and outputs DocumentSample objects.

    Format:
    Each line contains one sample document.
    The category is the first string on the line, followed by a tab character and then the whitespace-separated document tokens.
    Sample line: category-string tab-char whitespace-separated-tokens line-break-char(s)
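
    The line format above can be illustrated with a small standalone sketch. It is not OpenNLP's own parsing code: SampleLineParser and ParsedSample are hypothetical names, and the rule that a line must carry a category plus at least one token is an assumption made for this example. Because a tab is itself whitespace, the sketch splits the whole line on whitespace runs and treats the first piece as the category.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenNLP: splits one training line into
// a category and its document tokens.
final class SampleLineParser {

    // Simple holder for the parsed pieces of one line.
    static final class ParsedSample {
        final String category;
        final String[] tokens;

        ParsedSample(String category, String[] tokens) {
            this.category = category;
            this.tokens = tokens;
        }
    }

    static ParsedSample parse(String line) {
        // Split on runs of whitespace; a tab counts as whitespace, so the
        // category/token separator is covered by the same rule.
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            // Assumption: a line needs a category and at least one token.
            throw new IllegalArgumentException(
                "Expected a category followed by at least one token: " + line);
        }
        return new ParsedSample(parts[0], Arrays.copyOfRange(parts, 1, parts.length));
    }

    public static void main(String[] args) {
        ParsedSample s = parse("sports\tthe home team won again");
        System.out.println(s.category + " -> " + Arrays.toString(s.tokens));
    }
}
```

    Splitting on any whitespace keeps the sketch short, but it also means a category string cannot itself contain spaces in this example.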

    • Constructor Detail

    • Method Detail

      • read

        public DocumentSample read()
                            throws IOException
        Description copied from interface: ObjectStream
        Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
        Returns:
        the next object or null to signal that the stream is exhausted
        Throws:
        IOException - if there is an error during reading
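
        The read-until-null contract described above can be sketched without any library dependency. SimpleStream below is a hypothetical stand-in that mirrors only the read() signature documented here; it is not the ObjectStream interface itself, and drain and fromList are illustrative helper names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ReadUntilNullDemo {

    // Hypothetical stand-in for the read() contract: returns the next
    // object, or null once the stream is exhausted.
    interface SimpleStream<T> {
        T read() throws IOException;
    }

    // Collects every object by calling read() until it returns null.
    static <T> List<T> drain(SimpleStream<T> stream) {
        List<T> items = new ArrayList<>();
        try {
            T item;
            while ((item = stream.read()) != null) {
                items.add(item);
            }
        } catch (IOException e) {
            // Rethrown unchecked only to keep this sketch's callers simple.
            throw new UncheckedIOException(e);
        }
        return items;
    }

    // Wraps an in-memory list so each element is returned once, then null.
    static <T> SimpleStream<T> fromList(List<T> source) {
        Iterator<T> it = source.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "sports\tthe team won", "politics\tthe vote passed");
        for (String line : drain(fromList(lines))) {
            System.out.println(line);
        }
    }
}
```

        With the real class, the same loop shape would apply to a DocumentSampleStream, with each non-null result being a DocumentSample.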