Package opennlp.tools.sentdetect
Class SentenceDetectorME
java.lang.Object
opennlp.tools.sentdetect.SentenceDetectorME
- All Implemented Interfaces:
SentenceDetector
A sentence detector for splitting up raw text into sentences.
A maximum entropy model is used to evaluate end-of-sentence characters in a string to determine if they signify the end of a sentence.
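The idea of scanning for candidate end-of-sentence characters and asking a decision function about each one can be sketched as follows. This is a toy illustration only: the `EosSketch` class, the `EosDecider` interface, and the hand-written `naiveRule` are hypothetical stand-ins for the trained maximum entropy model, and they are not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration: a hand-written rule stands in for the trained model. */
final class EosSketch {

    /** Hypothetical stand-in for the model's decision at one candidate position. */
    interface EosDecider {
        boolean isSentenceEnd(CharSequence text, int index);
    }

    /** Splits text at every '.', '!' or '?' that the decider accepts. */
    static List<String> split(CharSequence text, EosDecider decider) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean candidate = c == '.' || c == '!' || c == '?';
            if (candidate && decider.isSentenceEnd(text, i)) {
                String s = text.subSequence(start, i + 1).toString().trim();
                if (!s.isEmpty()) {
                    sentences.add(s);
                }
                start = i + 1;
            }
        }
        // Keep any trailing text that has no closing end-of-sentence character.
        String rest = text.subSequence(start, text.length()).toString().trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }

    /** Naive rule: a candidate ends a sentence unless it directly follows "Dr" or "Mr". */
    static boolean naiveRule(CharSequence text, int index) {
        if (index >= 2) {
            String prev = text.subSequence(index - 2, index).toString();
            if (prev.equals("Dr") || prev.equals("Mr")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // prints [Dr. Smith arrived., He was late!]
        System.out.println(split("Dr. Smith arrived. He was late!", EosSketch::naiveRule));
    }
}
```

A trained model replaces the fixed rule with a learned decision based on the context around each candidate character, which is why abbreviations and similar cases need not be enumerated by hand.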
-
Field Summary
Fields
-
Constructor Summary
Constructors
SentenceDetectorME(String language) - Initializes the sentence detector by downloading a default model.
SentenceDetectorME(SentenceModel model) - Initializes the current instance.
SentenceDetectorME(SentenceModel model, Dictionary abbDict) - Instantiates a SentenceDetectorME with an existing SentenceModel.
SentenceDetectorME(SentenceModel model, Factory factory) - Deprecated.
-
Method Summary
double[] getSentenceProbabilities() - Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
String[] sentDetect(CharSequence s) - Detects sentences in the given input CharSequence.
Span[] sentPosDetect(CharSequence s) - Detects the position of the first words of sentences in a CharSequence.
static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams) - Starts the training of a SentenceModel with the given parameters.
-
Field Details
-
SPLIT
Constant indicating a sentence split.
-
NO_SPLIT
Constant indicating no sentence split.
-
-
Constructor Details
-
SentenceDetectorME
public SentenceDetectorME(String language) throws IOException
Initializes the sentence detector by downloading a default model.
Parameters:
language - The language of the sentence detector.
Throws:
IOException - Thrown if the model cannot be downloaded or saved.
-
SentenceDetectorME
public SentenceDetectorME(SentenceModel model)
Initializes the current instance.
Parameters:
model - The SentenceModel to be used.
-
SentenceDetectorME
public SentenceDetectorME(SentenceModel model, Dictionary abbDict)
Instantiates a SentenceDetectorME with an existing SentenceModel.
Parameters:
model - The SentenceModel to be used.
abbDict - The Dictionary to be used. It must fit the language of the model.
-
SentenceDetectorME
public SentenceDetectorME(SentenceModel model, Factory factory)
Deprecated. Use a SentenceDetectorFactory to extend SentenceDetector functionality.
-
-
Method Details
-
sentDetect
public String[] sentDetect(CharSequence s)
Detects sentences in the given input CharSequence.
Specified by:
sentDetect in interface SentenceDetector
Parameters:
s - The CharSequence to be processed.
Returns:
A string array containing individual sentences as elements.
-
sentPosDetect
public Span[] sentPosDetect(CharSequence s)
Detects the position of the first words of sentences in a CharSequence.
Specified by:
sentPosDetect in interface SentenceDetector
Parameters:
s - The CharSequence to be processed.
Returns:
A Span array with one span per detected sentence, marking its position in the input.
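Because positions are returned instead of strings, the caller can map each span back onto the original input. A minimal sketch of that mapping, assuming each span carries an inclusive start offset and an exclusive end offset; `SpanSketch` and its nested `Offsets` class are hypothetical stand-ins used so the example compiles without the library, not the `Span` type itself:

```java
import java.util.ArrayList;
import java.util.List;

final class SpanSketch {

    /** Hypothetical stand-in for a span: start is inclusive, end is exclusive (assumed). */
    static final class Offsets {
        final int start;
        final int end;

        Offsets(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Cuts the text covered by each span out of the original input. */
    static List<String> coveredText(CharSequence input, List<Offsets> spans) {
        List<String> result = new ArrayList<>();
        for (Offsets o : spans) {
            result.add(input.subSequence(o.start, o.end).toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "First sentence. Second one.";
        List<Offsets> spans = new ArrayList<>();
        spans.add(new Offsets(0, 15));   // covers "First sentence."
        spans.add(new Offsets(16, 27));  // covers "Second one."
        // prints [First sentence., Second one.]
        System.out.println(coveredText(input, spans));
    }
}
```

Working with offsets keeps the original text untouched, which is useful when later processing steps need character positions relative to the full input.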
-
getSentenceProbabilities
public double[] getSentenceProbabilities()
Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
Returns:
The probability for each sentence returned for the most recent call to sentDetect(CharSequence). If not applicable, an empty array is returned.
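Since one probability is reported per returned sentence, the two arrays can be read side by side. The sketch below assumes the probability at index i belongs to the sentence at index i; `ProbabilitySketch`, `confident`, and the threshold value are hypothetical names chosen for illustration, and the input arrays are hand-made rather than produced by a real detector.

```java
import java.util.ArrayList;
import java.util.List;

final class ProbabilitySketch {

    /** Returns the sentences whose probability is at least the given threshold. */
    static List<String> confident(String[] sentences, double[] probs, double threshold) {
        // Assumption: probs[i] belongs to sentences[i], so both arrays have equal length.
        if (sentences.length != probs.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < sentences.length; i++) {
            if (probs[i] >= threshold) {
                kept.add(sentences[i]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] sentences = {"A.", "B.", "C."};
        double[] probs = {0.99, 0.40, 0.75};
        // prints [A., C.]
        System.out.println(confident(sentences, probs, 0.7));
    }
}
```

The length check also guards against the documented empty-array case, where probabilities are not available for the sentences at hand.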
-
train
public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams) throws IOException
Starts the training of a SentenceModel with the given parameters.
Parameters:
languageCode - The ISO language code to train the model. Must not be null.
samples - The ObjectStream of SentenceSample used as input for training.
sdFactory - The SentenceDetectorFactory for creating related objects as defined via mlParams.
mlParams - The TrainingParameters for the context of the training process.
Returns:
A valid, trained SentenceModel instance.
Throws:
IOException - Thrown if I/O errors occurred.