Class LeipzigLanguageSampleStream

    • Constructor Detail

      • LeipzigLanguageSampleStream

        public LeipzigLanguageSampleStream(File leipzigFolder,
                                           int sentencesPerSample,
                                           int samplesPerLanguage)
                                    throws IOException
        Parameters:
        leipzigFolder - The directory which contains files to process.
        sentencesPerSample - The number of sentences per sample.
        samplesPerLanguage - The maximum number of samples to process per language.
        Throws:
        IOException - Thrown if an I/O error occurs.
    • Method Detail

      • read

        public LanguageSample read()
                            throws IOException
        Description copied from interface: ObjectStream
        Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
        Specified by:
        read in interface ObjectStream<LanguageSample>
        Returns:
        The next object or null to signal that the stream is exhausted.
        Throws:
        IOException - Thrown if there is an error during reading.
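        The read() contract above (call repeatedly until null is returned) can be sketched with a small in-memory stream. SimpleStream, ReadLoopSketch, fromList and drain are hypothetical names introduced for illustration only; they mirror the described contract and are not part of the OpenNLP API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in that mirrors only the read() contract described above;
// it is NOT the OpenNLP ObjectStream interface.
interface SimpleStream<T> {
    T read() throws IOException;
}

class ReadLoopSketch {

    // Wraps a list so that read() hands out each element once, then null.
    static <T> SimpleStream<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Drains a stream by calling read() until it returns null.
    static <T> List<T> drain(SimpleStream<T> stream) throws IOException {
        List<T> seen = new ArrayList<>();
        T next;
        while ((next = stream.read()) != null) {
            seen.add(next);
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        SimpleStream<String> stream = fromList(List.of("deu", "eng", "fra"));
        // Each element is returned exactly once, in order.
        System.out.println(drain(stream));
        // This stand-in keeps returning null once it is exhausted.
        System.out.println(stream.read());
    }
}
```

        With a real ObjectStream<LanguageSample>, the same while loop would presumably apply unchanged, since only read() and the null sentinel are involved.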
      • reset

        public void reset()
                   throws IOException
        Description copied from interface: ObjectStream
        Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly. This method can be used to re-read the stream if multiple passes over the objects are required.

        The implementation of this method is optional.

        Specified by:
        reset in interface ObjectStream<LanguageSample>
        Throws:
        IOException - Thrown if an error occurs while resetting the stream.
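        The multi-pass use of reset() described above can be illustrated with a minimal sketch. ResettableStream, ListStream, ResetSketch, drain and passes are hypothetical names that only imitate the read()/reset() contract; they are assumptions for illustration and not OpenNLP classes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring the read()/reset() contract; NOT the OpenNLP interface.
interface ResettableStream<T> {
    T read() throws IOException;
    void reset() throws IOException;
}

// In-memory stream over a fixed list of items.
class ListStream<T> implements ResettableStream<T> {
    private final List<T> items;
    private int position = 0;

    ListStream(List<T> items) {
        this.items = List.copyOf(items);
    }

    @Override
    public T read() {
        // Return the next item, or null once every item has been handed out.
        return position < items.size() ? items.get(position++) : null;
    }

    @Override
    public void reset() {
        // Reposition at the beginning so the same sequence is produced again.
        position = 0;
    }
}

class ResetSketch {

    // Reads until null and collects everything that was seen.
    static <T> List<T> drain(ResettableStream<T> stream) throws IOException {
        List<T> seen = new ArrayList<>();
        T next;
        while ((next = stream.read()) != null) {
            seen.add(next);
        }
        return seen;
    }

    // Performs several full passes, calling reset() after each one.
    static <T> List<List<T>> passes(ResettableStream<T> stream, int count) throws IOException {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add(drain(stream));
            stream.reset();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        ResettableStream<String> stream = new ListStream<>(List.of("deu", "eng"));
        // Every pass yields the same sequence in the same order.
        System.out.println(passes(stream, 2));
    }
}
```

        Because the documentation marks reset() as optional, code that needs several passes may want to confirm that the concrete stream supports it before relying on this pattern.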