Interface EndOfSentenceScanner

All Known Implementing Classes:
DefaultEndOfSentenceScanner

public interface EndOfSentenceScanner
Scans CharSequence, StringBuffer, and char[] input for the offsets of sentence-ending characters.

Implementations of this interface can use regular expressions, hand-coded DFAs, or other scanning techniques to locate end-of-sentence offsets.
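
As an illustration of the regular-expression approach, here is a minimal sketch of a scanner with the same four method signatures as this interface. The class name RegexEosScanner and the candidate set ('.', '!', '?') are assumptions made for this example, not part of the library. The class is kept standalone so it compiles without the library on the classpath; a real implementation would declare "implements EndOfSentenceScanner".

```java
import java.nio.CharBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration only. In a real project it would
// declare "implements EndOfSentenceScanner".
public class RegexEosScanner {

    // Assumed candidate characters; an actual implementation may use others.
    private static final Set<Character> EOS_CHARS = Set.of('.', '!', '?');

    // Character class matching exactly the candidates above.
    private static final Pattern EOS_PATTERN = Pattern.compile("[.!?]");

    public Set<Character> getEOSCharacters() {
        return EOS_CHARS;
    }

    public List<Integer> getPositions(CharSequence s) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = EOS_PATTERN.matcher(s);
        while (m.find()) {
            // start() is the zero-based offset of the matched character.
            positions.add(m.start());
        }
        return positions;
    }

    public List<Integer> getPositions(StringBuffer buf) {
        // The cast selects the CharSequence overload and avoids self-recursion.
        return getPositions((CharSequence) buf);
    }

    public List<Integer> getPositions(char[] cbuf) {
        // CharBuffer.wrap exposes the array as a CharSequence without copying.
        return getPositions((CharSequence) CharBuffer.wrap(cbuf));
    }
}
```

In this sketch the StringBuffer and char[] overloads simply delegate to the CharSequence overload, so all three report the same offsets for the same text.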

  • Method Details

    • getEOSCharacters

      Set<Character> getEOSCharacters()
      Returns:
      A set of characters that can indicate the end of a sentence.
    • getPositions

      List<Integer> getPositions(CharSequence s)
      Scans the specified character sequence for sentence-ending characters and returns their offsets.
      Parameters:
      s - A CharSequence to be scanned.
      Returns:
      A List of Integer offsets, one for each sentence-ending character found in s.
    • getPositions

      List<Integer> getPositions(StringBuffer buf)
      Scans buf for sentence-ending characters and returns their offsets.
      Parameters:
      buf - A StringBuffer to be scanned.
      Returns:
      A List of Integer offsets, one for each sentence-ending character found in buf.
    • getPositions

      List<Integer> getPositions(char[] cbuf)
      Scans cbuf for sentence-ending characters and returns their offsets.
      Parameters:
      cbuf - A char[] to be scanned.
      Returns:
      A List of Integer offsets, one for each sentence-ending character found in cbuf.
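
To show how the offsets returned by getPositions might be consumed, the following sketch cuts a text just after each reported offset. The class OffsetSplitter and its method splitAfter are hypothetical helpers, not part of the library, and the sketch assumes the offsets arrive in ascending order and lie within the text. The offsets mark candidate sentence-ending characters only; a period inside an abbreviation is also a candidate, so a real sentence detector would filter them before splitting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the library.
public class OffsetSplitter {

    // Cuts text after each offset. Assumes offsets are ascending and in range.
    public static List<String> splitAfter(String text, List<Integer> positions) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int pos : positions) {
            // Include the sentence-ending character itself in the piece.
            String part = text.substring(start, pos + 1).trim();
            if (!part.isEmpty()) {
                parts.add(part);
            }
            start = pos + 1;
        }
        // Keep any trailing text that has no terminating character.
        String tail = text.substring(start).trim();
        if (!tail.isEmpty()) {
            parts.add(tail);
        }
        return parts;
    }
}
```

For example, given the text "Hi. Ok? Yes" and the offsets 2 and 6, splitAfter yields the pieces "Hi.", "Ok?", and "Yes".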