Package related to identifying sentence boundaries.
Class Description
Default implementation of the EndOfSentenceScanner.
Generate event contexts for maxent decisions for sentence detection.
ObjectStream to clean up empty lines for empty line separated document streams.
- Skips empty line at training data start
- Transforms multiple empty lines in a row into one
- Replaces white space lines with empty lines
- TODO: Terminates last document with empty line if it is missing
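The clean-up rules listed above can be illustrated with a minimal, self-contained sketch. It does not use the package's stream classes; `EmptyLineCleanupSketch` and `clean` are hypothetical names chosen for this example, and it works on an in-memory list of lines rather than an ObjectStream. The TODO item (terminating the last document with an empty line) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the package.
final class EmptyLineCleanupSketch {

    // Applies the three listed rules to a list of lines.
    static List<String> clean(List<String> lines) {
        List<String> out = new ArrayList<>();
        // Start as if an empty line was just seen, so leading empty lines are skipped.
        boolean lastWasEmpty = true;
        for (String line : lines) {
            // Whitespace-only lines are treated as empty lines.
            boolean empty = line.trim().isEmpty();
            if (empty) {
                // Emit at most one empty line per run of empty lines.
                if (!lastWasEmpty) {
                    out.add("");
                }
                lastWasEmpty = true;
            } else {
                out.add(line);
                lastWasEmpty = false;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("", "Doc one.", "   ", "", "Doc two.");
        System.out.println(clean(raw));
    }
}
```

With the sample input in `main`, the two documents end up separated by exactly one empty line, and the leading empty line is dropped.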
This stream should be used by the components that use empty lines to mark document boundaries.
The Newline SentenceDetector assumes that sentences are line delimited and recognizes one sentence per non-empty line.
Interface for SentenceDetectorME context generators.
A cross validator for sentence detectors.
The interface for sentence detectors, which find the sentence boundaries in a text.
The factory that provides SentenceDetector default implementations and resources.
A sentence detector for splitting up raw text into sentences.
A SentenceSample contains a document with begin indexes of the individual sentences.
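To make the idea of "a document with begin indexes of the individual sentences" concrete, here is a small sketch that derives such begin indexes under the line-delimited assumption described for the newline-based detector (one sentence per non-empty line). It is not the package's implementation: `LineSentenceSketch` and `sentenceStarts` are hypothetical names, and taking the first non-whitespace character of each line as the sentence start is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the package.
final class LineSentenceSketch {

    // Returns the begin index of each sentence, assuming one sentence per non-empty line.
    static List<Integer> sentenceStarts(String document) {
        List<Integer> starts = new ArrayList<>();
        int lineStart = 0;
        while (lineStart <= document.length()) {
            int lineEnd = document.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = document.length(); // last line has no trailing newline
            }
            // Skip leading whitespace so the index points at the first visible character.
            int first = lineStart;
            while (first < lineEnd && Character.isWhitespace(document.charAt(first))) {
                first++;
            }
            // Lines that are empty or whitespace-only contribute no sentence.
            if (first < lineEnd) {
                starts.add(first);
            }
            lineStart = lineEnd + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        String document = "First one.\n\nSecond one.\n";
        System.out.println(sentenceStarts(document));
    }
}
```

Storing only begin indexes next to the document text keeps the original characters intact; each sentence can then be recovered by slicing from one index up to the next.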