Package opennlp.tools.sentdetect
Class SentenceDetectorME
- java.lang.Object
-
- opennlp.tools.sentdetect.SentenceDetectorME
-
- All Implemented Interfaces:
SentenceDetector
public class SentenceDetectorME extends Object implements SentenceDetector
A sentence detector for splitting raw text into sentences. A maximum entropy model is used to evaluate end-of-sentence characters in a string to determine if they signify the end of a sentence.
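The candidate-evaluation idea described above can be sketched with plain Java. This is a simplified stand-in, not the OpenNLP implementation: the class `EosCandidateSketch`, the abbreviation list, and the hand-written `splitProbability` scorer are all hypothetical and only take the place of the trained maximum entropy model. The sketch scans for end-of-sentence characters, asks the scorer how likely each one is to end a sentence, and splits only where the score exceeds 0.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class EosCandidateSketch {
    // Hypothetical abbreviation list; a real model learns such cues from training data.
    private static final Set<String> ABBREVIATIONS = Set.of("Dr", "Mr", "Mrs", "e.g", "etc");

    // Characters treated as possible sentence ends in this sketch.
    static boolean isCandidate(char c) {
        return c == '.' || c == '!' || c == '?';
    }

    // Hand-written scorer standing in for the model (an assumption, not the real scoring).
    static double splitProbability(CharSequence text, int index) {
        // Collect the token that directly precedes the candidate character.
        int start = index;
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        String token = text.subSequence(start, index).toString();
        if (ABBREVIATIONS.contains(token)) {
            return 0.1; // "Dr." rarely ends a sentence
        }
        int next = index + 1;
        if (next < text.length() && !Character.isWhitespace(text.charAt(next))) {
            return 0.2; // no whitespace follows, as in "3.14"
        }
        return 0.9;
    }

    // Splits the text at every candidate whose score exceeds 0.5.
    static List<String> split(CharSequence text) {
        List<String> sentences = new ArrayList<>();
        int sentenceStart = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isCandidate(text.charAt(i)) && splitProbability(text, i) > 0.5) {
                String sentence = text.subSequence(sentenceStart, i + 1).toString().trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
                sentenceStart = i + 1;
            }
        }
        // Keep any trailing text that has no accepted end-of-sentence character.
        String tail = text.subSequence(sentenceStart, text.length()).toString().trim();
        if (!tail.isEmpty()) {
            sentences.add(tail);
        }
        return sentences;
    }
}
```

Calling `EosCandidateSketch.split("Dr. Smith arrived. He was late!")` keeps "Dr." attached to its sentence because the scorer rates that period as an unlikely split.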
-
-
Constructor Summary
Constructor Description
SentenceDetectorME(String language)
Initializes the sentence detector by downloading a default model.
SentenceDetectorME(SentenceModel model)
Initializes the current instance.
SentenceDetectorME(SentenceModel model, Factory factory)
Deprecated. Use a SentenceDetectorFactory to extend SentenceDetector functionality.
-
Method Summary
Modifier and Type Method Description
double[] getSentenceProbabilities()
Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
String[] sentDetect(CharSequence s)
Detects sentences in the given input CharSequence.
Span[] sentPosDetect(CharSequence s)
Detects the position of the first words of sentences in a CharSequence.
static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations)
Deprecated.
static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations, TrainingParameters mlParams)
Deprecated.
static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams)
-
-
-
Field Detail
-
SPLIT
public static final String SPLIT
Constant indicating a sentence split.
- See Also:
- Constant Field Values
-
NO_SPLIT
public static final String NO_SPLIT
Constant indicating no sentence split.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
SentenceDetectorME
public SentenceDetectorME(String language) throws IOException
Initializes the sentence detector by downloading a default model.
- Parameters:
language - The language of the sentence detector.
- Throws:
IOException - Thrown if the model cannot be downloaded or saved.
-
SentenceDetectorME
public SentenceDetectorME(SentenceModel model)
Initializes the current instance.
- Parameters:
model - The SentenceModel.
-
SentenceDetectorME
public SentenceDetectorME(SentenceModel model, Factory factory)
Deprecated. Use a SentenceDetectorFactory to extend SentenceDetector functionality.
-
-
Method Detail
-
sentDetect
public String[] sentDetect(CharSequence s)
Detects sentences in the given input CharSequence.
- Specified by:
sentDetect in interface SentenceDetector
- Parameters:
s - The CharSequence to be processed.
- Returns:
A string array containing individual sentences as elements.
-
sentPosDetect
public Span[] sentPosDetect(CharSequence s)
Detects the position of the first words of sentences in a CharSequence.
- Specified by:
sentPosDetect in interface SentenceDetector
- Parameters:
s - The CharSequence to be processed.
- Returns:
A Span array with one span per detected sentence, giving that sentence's position in the input.
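The relationship between position-based and string-based results can be illustrated with a small stand-alone sketch. It assumes that each span holds a start offset (inclusive) and an end offset (exclusive) into the input, and that a sentence string is simply the text a span covers. `SpanSketch` and its nested `CharSpan` are hypothetical names introduced for illustration; they are not the library's Span class.

```java
import java.util.List;

final class SpanSketch {
    // Hypothetical minimal span: start is inclusive, end is exclusive (assumed convention).
    static final class CharSpan {
        final int start;
        final int end;

        CharSpan(int start, int end) {
            if (start < 0 || end < start) {
                throw new IllegalArgumentException("invalid span: " + start + ".." + end);
            }
            this.start = start;
            this.end = end;
        }

        // Returns the part of the text that this span covers.
        String coveredText(CharSequence text) {
            return text.subSequence(start, end).toString();
        }
    }

    // Turns sentence positions into sentence strings, one string per span.
    static String[] toSentences(CharSequence text, List<CharSpan> spans) {
        String[] sentences = new String[spans.size()];
        for (int i = 0; i < spans.size(); i++) {
            sentences[i] = spans.get(i).coveredText(text);
        }
        return sentences;
    }
}
```

Working with positions rather than strings is useful when the original offsets matter, for example when highlighting sentences in the source text or aligning them with other annotations.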
-
getSentenceProbabilities
public double[] getSentenceProbabilities()
Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
- Returns:
The probability for each sentence returned by the most recent call to sentDetect(CharSequence). If not applicable, an empty array is returned.
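One possible use of the per-sentence probabilities is to flag low-confidence splits for review. The sketch below assumes the probability array is index-aligned with the sentence array from the same call, which is how the description above reads. `ProbabilitySketch`, `belowThreshold`, and the threshold value are hypothetical, and the arrays are passed in directly so the example does not depend on a loaded model.

```java
import java.util.ArrayList;
import java.util.List;

final class ProbabilitySketch {
    // Returns the sentences whose probability falls below the given threshold.
    // Assumes probabilities[i] belongs to sentences[i].
    static List<String> belowThreshold(String[] sentences, double[] probabilities, double threshold) {
        if (sentences.length != probabilities.length) {
            throw new IllegalArgumentException(
                "expected one probability per sentence, got "
                    + probabilities.length + " for " + sentences.length);
        }
        List<String> uncertain = new ArrayList<>();
        for (int i = 0; i < sentences.length; i++) {
            if (probabilities[i] < threshold) {
                uncertain.add(sentences[i]);
            }
        }
        return uncertain;
    }
}
```

The length check matters because an empty probability array is a documented outcome, so a caller should not index into it blindly.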
-
train
public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations, TrainingParameters mlParams) throws IOException
Deprecated.
- Throws:
IOException
-
train
public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams) throws IOException
- Throws:
IOException
-
train
@Deprecated public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations) throws IOException
Deprecated.
- Throws:
IOException
-
-