Interface EndOfSentenceScanner

  • All Known Implementing Classes:
    DefaultEndOfSentenceScanner

    public interface EndOfSentenceScanner
    Scans a CharSequence, StringBuffer, or char[] for the offsets of sentence-ending characters.

    Implementations of this interface can use regular expressions, hand-coded DFAs, or other scanning techniques to locate end-of-sentence offsets.
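
As an illustration of the regular-expression approach mentioned above, the following is a minimal stand-alone sketch. The class name RegexEosOffsets, the method find, and the character class [.!?] are assumptions made for this example; they are not part of this interface or of any documented implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the documented API.
public class RegexEosOffsets {

    // Assumed set of sentence-ending characters; a real scanner may use a different set.
    private static final Pattern EOS = Pattern.compile("[.!?]");

    // Collects the start offset of every match of the pattern in the text.
    public static List<Integer> find(CharSequence text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher matcher = EOS.matcher(text);
        while (matcher.find()) {
            offsets.add(matcher.start());
        }
        return offsets;
    }
}
```

Every matching character is reported as a candidate offset, including each dot of an ellipsis; deciding which candidates actually end a sentence is left to the caller.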

    • Method Detail

      • getEndOfSentenceCharacters

        @Deprecated
        char[] getEndOfSentenceCharacters()
        Deprecated.
        Returns:
        an array of characters which can indicate the end of a sentence.
      • getEOSCharacters

        Set<Character> getEOSCharacters()
        Returns:
        a set of characters which can indicate the end of a sentence.
      • getPositions

        List<Integer> getPositions(CharSequence s)
        Scans the specified character sequence for sentence-ending characters and returns their offsets.
        Parameters:
        s - A CharSequence to be scanned.
        Returns:
        A List of Integer offsets, one for each sentence-ending character found in s.
      • getPositions

        List<Integer> getPositions(StringBuffer buf)
        Scans buf for sentence-ending characters and returns their offsets.
        Parameters:
        buf - A StringBuffer to be scanned.
        Returns:
        A List of Integer offsets, one for each sentence-ending character found in buf.
      • getPositions

        List<Integer> getPositions(char[] cbuf)
        Scans cbuf for sentence-ending characters and returns their offsets.
        Parameters:
        cbuf - A char[] to be scanned.
        Returns:
        A List of Integer offsets, one for each sentence-ending character found in cbuf.
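
The three getPositions overloads and getEOSCharacters can be sketched with a simple character loop. This is a stand-alone illustration under stated assumptions: the class SimpleEosScanner is hypothetical, it mirrors the documented method signatures without implementing the real interface (so that it compiles without the library), and the characters passed to its constructor are chosen only for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class mirroring the non-deprecated methods documented above.
public class SimpleEosScanner {

    private final Set<Character> eosChars = new LinkedHashSet<>();

    // The caller supplies the sentence-ending characters (an assumption of this sketch).
    public SimpleEosScanner(char... chars) {
        for (char c : chars) {
            eosChars.add(c);
        }
    }

    public Set<Character> getEOSCharacters() {
        return eosChars;
    }

    // Records the index of every character that belongs to the configured set.
    public List<Integer> getPositions(CharSequence s) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            if (eosChars.contains(s.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    // The cast selects the CharSequence overload and avoids calling this method again.
    public List<Integer> getPositions(StringBuffer buf) {
        return getPositions((CharSequence) buf);
    }

    // Copies the array into a String so the CharSequence overload can be reused.
    public List<Integer> getPositions(char[] cbuf) {
        return getPositions(new String(cbuf));
    }

    public static void main(String[] args) {
        SimpleEosScanner scanner = new SimpleEosScanner('.', '!', '?');
        System.out.println(scanner.getPositions("Hi. Who? Me!")); // prints [2, 7, 11]
    }
}
```

Delegating the StringBuffer and char[] overloads to the CharSequence overload keeps the scanning logic in one place, at the cost of one copy for the char[] case.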