Class SentenceDetectorME

java.lang.Object
opennlp.tools.sentdetect.SentenceDetectorME
All Implemented Interfaces:
opennlp.tools.ml.Probabilistic, opennlp.tools.sentdetect.SentenceDetector

public class SentenceDetectorME extends Object implements opennlp.tools.sentdetect.SentenceDetector, opennlp.tools.ml.Probabilistic
A sentence detector for splitting up raw text into sentences.

A maximum entropy model is used to evaluate end-of-sentence characters in a string to determine if they signify the end of a sentence.

  • Field Details

  • Constructor Details

  • Method Details

    • sentDetect

      public String[] sentDetect(CharSequence s)
      Detects sentences in the given input CharSequence.
      Specified by:
      sentDetect in interface opennlp.tools.sentdetect.SentenceDetector
      Parameters:
      s - The CharSequence to be processed.
      Returns:
      A string array containing individual sentences as elements.
    • sentPosDetect

      public opennlp.tools.util.Span[] sentPosDetect(CharSequence s)
      Detects the position of the first words of sentences in a CharSequence.
      Specified by:
      sentPosDetect in interface opennlp.tools.sentdetect.SentenceDetector
      Parameters:
      s - The CharSequence to be processed.
      Returns:
      A Span array containing the start and end offsets of every sentence.
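The link between sentPosDetect and sentDetect can be illustrated without the library: each returned span carries a start offset and an exclusive end offset into the input, and the sentence text is the substring between them. The sketch below is only a stand-in under stated assumptions. It uses plain {start, end} int pairs in place of opennlp.tools.util.Span, the offsets are written by hand rather than produced by a model, and the names SpanSketch and cover are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP.
final class SpanSketch {

    // Returns the text covered by each {start, end} pair (end is exclusive).
    static List<String> cover(CharSequence text, int[][] spans) {
        List<String> sentences = new ArrayList<>();
        for (int[] span : spans) {
            sentences.add(text.subSequence(span[0], span[1]).toString());
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "First sentence. Second sentence.";
        // Hand-written offsets; a sentence detector would normally supply these.
        int[][] spans = { {0, 15}, {16, 32} };
        for (String sentence : cover(text, spans)) {
            System.out.println(sentence);
        }
    }
}
```

With the real API, the pairs would presumably come from each Span's start and end values, which keeps the original offsets available for highlighting or alignment with other annotations.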
    • probs

      public double[] probs()
      Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
      Specified by:
      probs in interface opennlp.tools.ml.Probabilistic
      Returns:
      An array with one probability for each sentence returned by the most recent call to sentDetect(CharSequence). If not applicable, an empty array is returned.
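One way to use these probabilities is to flag sentence boundaries the model was unsure about. The sketch below assumes, as the getSentenceProbabilities() description suggests, that the array holds one probability per detected sentence in the same order, and it treats an empty array as "nothing to report". It takes plain arrays so it runs without the library. The names ConfidenceSketch and lowConfidence and the 0.5 threshold are hypothetical choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP.
final class ConfidenceSketch {

    // Returns the sentences whose probability is below the given threshold.
    static List<String> lowConfidence(String[] sentences, double[] probs, double threshold) {
        List<String> flagged = new ArrayList<>();
        if (probs.length == 0) {
            return flagged; // "not applicable" case: no probabilities available
        }
        if (probs.length != sentences.length) {
            throw new IllegalArgumentException("expected one probability per sentence");
        }
        for (int i = 0; i < sentences.length; i++) {
            if (probs[i] < threshold) {
                flagged.add(sentences[i]);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Values written by hand; in practice they would come from
        // sentDetect(...) followed by probs() on the same detector instance.
        String[] sentences = { "It works.", "See Dr. Smith" };
        double[] probs = { 0.98, 0.41 };
        System.out.println(lowConfidence(sentences, probs, 0.5));
    }
}
```

Because the probabilities refer to the most recent detection call, they should be read before the same detector instance processes another input.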
    • getSentenceProbabilities

      @Deprecated(forRemoval=true, since="2.5.5") public double[] getSentenceProbabilities()
      Deprecated, for removal: This API element is subject to removal in a future version.
      Use probs() instead.
      Returns:
      The probability for each sentence returned for the most recent call to sentDetect(CharSequence). If not applicable, an empty array is returned.
    • train

      public static SentenceModel train(String languageCode, opennlp.tools.util.ObjectStream<opennlp.tools.sentdetect.SentenceSample> samples, SentenceDetectorFactory sdFactory, opennlp.tools.util.TrainingParameters mlParams) throws IOException
      Starts a training of a SentenceModel with the given parameters.
      Parameters:
      languageCode - The ISO language code of the language the model is trained for. Must not be null.
      samples - The ObjectStream of SentenceSample used as input for training.
      sdFactory - The SentenceDetectorFactory for creating related objects as defined via mlParams.
      mlParams - The TrainingParameters for the context of the training process.
      Returns:
      A valid, trained SentenceModel instance.
      Throws:
      IOException - Thrown if IO errors occurred.