Class DefaultSDContextGenerator

    • Constructor Detail

      • DefaultSDContextGenerator

        public DefaultSDContextGenerator(char[] eosCharacters)
        Creates a new instance with no induced abbreviations.
        Parameters:
        eosCharacters - The characters to be used to detect sentence endings.
      • DefaultSDContextGenerator

        public DefaultSDContextGenerator(Set<String> inducedAbbreviations,
                                         char[] eosCharacters)
        Creates a new SDContextGenerator instance that uses the given set of induced abbreviations.
        Parameters:
        inducedAbbreviations - a Set of Strings representing induced abbreviations in the training data. Example: "Mr."
        eosCharacters - The characters to be used to detect sentence endings.
    • Method Detail

      • getContext

        public String[] getContext(CharSequence sb,
                                   int position)
        Description copied from interface: SDContextGenerator
        Returns an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer.
        Specified by:
        getContext in interface SDContextGenerator
        Parameters:
        sb - The character sequence for which sentence boundaries are being determined.
        position - An index into sb where a sentence boundary may occur.
        Returns:
        an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer.
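
To make the contract above concrete, here is a minimal, self-contained sketch of a context generator with the same constructor and getContext signatures. It is an illustration only: the names ToyContextGenerator and ToySDContextGenerator, and the feature strings (eos=, prefix=, suffix=, abbrev, nextcap), are hypothetical and are not the features the real DefaultSDContextGenerator emits. The sketch also assumes that the character at position is one of eosCharacters and rejects any other input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the SDContextGenerator interface; only the
// getContext signature is taken from the documentation above.
interface ToyContextGenerator {
    String[] getContext(CharSequence sb, int position);
}

// Toy implementation for illustration. The features are invented for this
// sketch and do NOT reproduce the features generated by OpenNLP.
class ToySDContextGenerator implements ToyContextGenerator {
    private final Set<String> inducedAbbreviations;
    private final char[] eosCharacters;

    // Mirrors the one-argument constructor: no induced abbreviations.
    ToySDContextGenerator(char[] eosCharacters) {
        this(Set.of(), eosCharacters);
    }

    // Mirrors the two-argument constructor.
    ToySDContextGenerator(Set<String> inducedAbbreviations, char[] eosCharacters) {
        this.inducedAbbreviations = inducedAbbreviations;
        this.eosCharacters = eosCharacters.clone();
    }

    private boolean isEos(char c) {
        for (char eos : eosCharacters) {
            if (eos == c) {
                return true;
            }
        }
        return false;
    }

    @Override
    public String[] getContext(CharSequence sb, int position) {
        // Assumption of this sketch: callers only pass candidate positions.
        if (!isEos(sb.charAt(position))) {
            throw new IllegalArgumentException("No end-of-sentence character at " + position);
        }

        // Scan left and right to the surrounding whitespace to find the token
        // that contains the candidate boundary character.
        int start = position;
        while (start > 0 && !Character.isWhitespace(sb.charAt(start - 1))) {
            start--;
        }
        int end = position + 1;
        while (end < sb.length() && !Character.isWhitespace(sb.charAt(end))) {
            end++;
        }

        String prefix = sb.subSequence(start, position).toString();
        String suffix = sb.subSequence(position + 1, end).toString();
        String prefixWithEos = sb.subSequence(start, position + 1).toString();

        List<String> features = new ArrayList<>();
        features.add("eos=" + sb.charAt(position));
        features.add("prefix=" + prefix);
        features.add("suffix=" + suffix);
        if (inducedAbbreviations.contains(prefixWithEos)) {
            features.add("abbrev"); // e.g. "Mr." is a known abbreviation
        }

        // Skip whitespace and check whether the next token starts in upper case.
        int next = end;
        while (next < sb.length() && Character.isWhitespace(sb.charAt(next))) {
            next++;
        }
        if (next < sb.length() && Character.isUpperCase(sb.charAt(next))) {
            features.add("nextcap");
        }
        return features.toArray(new String[0]);
    }
}
```

With the text "Mr. Smith left. He returned.", the abbreviation set {"Mr."} and the end-of-sentence characters '.', '!' and '?', this toy generator returns [eos=., prefix=Mr, suffix=, abbrev, nextcap] for position 2 and [eos=., prefix=left, suffix=, nextcap] for position 14. The abbrev feature is the only difference a downstream classifier would have for telling the two periods apart in this sketch.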