Class LanguageDetectorME
- All Implemented Interfaces:
Serializable, LanguageDetector
An implementation of LanguageDetector.
This will process the entire string when called with
predictLanguage(CharSequence) or
predictLanguages(CharSequence).
If you want this to stop early, use probingPredictLanguages(CharSequence)
or probingPredictLanguages(CharSequence, LanguageDetectorConfig).
When run in probing mode, the detector starts at the beginning of the char
sequence and runs language detection on successive chunks of text. It stops
and reports the results when the end of the string is reached, or when all
of the following hold: there have been
LanguageDetectorConfig.getMinConsecImprovements() consecutive predictions
for the same best language, the confidence increased over those predictions,
and the difference in confidence between the highest-confidence language
and the second-highest-confidence language is greater than
LanguageDetectorConfig.getMinDiff().
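The stop rule above can be read as a predicate over the history of probing steps. The following is a minimal sketch of one such reading, not OpenNLP's actual implementation: ProbingStopSketch, its Step type, and shouldStop are hypothetical names, and the sketch assumes that "the confidence increases" means a strict increase between each pair of consecutive steps in the window.

```java
import java.util.List;

/** Hypothetical sketch of the probing stop rule; not part of OpenNLP. */
final class ProbingStopSketch {

    /** Top-two outcome of one probing step over the text seen so far. */
    static final class Step {
        final String bestLang;          // language ranked first at this step
        final double bestConfidence;    // confidence of that language
        final double secondConfidence;  // confidence of the runner-up

        Step(String bestLang, double bestConfidence, double secondConfidence) {
            this.bestLang = bestLang;
            this.bestConfidence = bestConfidence;
            this.secondConfidence = secondConfidence;
        }
    }

    /**
     * Returns true if the last minConsecImprovements steps agree on the best
     * language with strictly rising confidence, and the latest margin between
     * first and second place is greater than minDiff.
     */
    static boolean shouldStop(List<Step> history, int minConsecImprovements, double minDiff) {
        int n = history.size();
        if (minConsecImprovements < 1 || n < minConsecImprovements) {
            return false; // not enough steps yet to fill the window
        }
        Step last = history.get(n - 1);
        // The margin between first and second place must exceed minDiff.
        if (last.bestConfidence - last.secondConfidence <= minDiff) {
            return false;
        }
        // Within the window, the best language must not change and its
        // confidence must rise from each step to the next.
        for (int i = n - minConsecImprovements + 1; i < n; i++) {
            Step prev = history.get(i - 1);
            Step cur = history.get(i);
            if (!cur.bestLang.equals(prev.bestLang)
                    || cur.bestConfidence <= prev.bestConfidence) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would append one Step per processed chunk and leave the loop as soon as shouldStop returns true or the text runs out.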
The authors wish to thank Ken Krugler and Yalder for the inspiration for many of the design components of this detector.
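Constructor Summary
Constructors
- LanguageDetectorME(LanguageDetectorModel model): Initializes an instance with a specific LanguageDetectorModel.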
Method Summary
- String[] getSupportedLanguages(): Retrieves an array of language codes that are supported by a LanguageDetector.
- Language predictLanguage(CharSequence content): Predicts the Language for the full content length.
- Language[] predictLanguages(CharSequence content): Predicts the languages for the full content length.
- ProbingLanguageDetectionResult probingPredictLanguages(CharSequence content): Stops processing early if the stopping criteria specified in LanguageDetectorConfig.DEFAULT_LANGUAGE_DETECTOR_CONFIG are met.
- ProbingLanguageDetectionResult probingPredictLanguages(CharSequence content, LanguageDetectorConfig config): Stops processing early if the stopping criteria specified in the given config are met.
- static LanguageDetectorModel train(ObjectStream<LanguageSample> samples, TrainingParameters mlParams, LanguageDetectorFactory factory): Starts a training of a LanguageDetectorModel with the given parameters.
Constructor Details
LanguageDetectorME
Initializes an instance with a specific LanguageDetectorModel. Default feature generation is used.
- Parameters:
model - the LanguageDetectorModel to be used.
Method Details
predictLanguages
Description copied from interface: LanguageDetector
Predicts the languages for the full content length.
- Specified by:
predictLanguages in interface LanguageDetector
- Parameters:
content - The textual content to detect potential languages from.
- Returns:
The predicted languages.
predictLanguage
Description copied from interface: LanguageDetector
Predicts the Language for the full content length.
- Specified by:
predictLanguage in interface LanguageDetector
- Parameters:
content - The textual content to detect the language from.
- Returns:
The language with the highest confidence.
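The relationship between predictLanguages and predictLanguage can be pictured as ranking all candidates by confidence and taking the first one. The sketch below illustrates that idea only; it does not use or reproduce the library's code. BestLanguageSketch and its Scored type are hypothetical stand-ins, and whether the library sorts its candidates in exactly this way is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration of "best language = highest confidence". */
final class BestLanguageSketch {

    /** Stand-in for a (language code, confidence) pair. */
    static final class Scored {
        final String lang;
        final double confidence;

        Scored(String lang, double confidence) {
            this.lang = lang;
            this.confidence = confidence;
        }
    }

    /** Returns a copy of the candidates sorted by descending confidence. */
    static Scored[] rank(Scored[] candidates) {
        Scored[] sorted = candidates.clone(); // leave the caller's array untouched
        Arrays.sort(sorted,
                Comparator.comparingDouble((Scored s) -> s.confidence).reversed());
        return sorted;
    }

    /** Returns the candidate with the highest confidence. */
    static Scored best(Scored[] candidates) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no candidates to choose from");
        }
        return rank(candidates)[0];
    }
}
```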
getSupportedLanguages
- Specified by:
getSupportedLanguages in interface LanguageDetector
- Returns:
An array of language codes that are supported by a LanguageDetector.
probingPredictLanguages
This will stop processing early if the stopping criteria specified in LanguageDetectorConfig.DEFAULT_LANGUAGE_DETECTOR_CONFIG are met.
- Parameters:
content - The textual content to process.
- Returns:
A computed ProbingLanguageDetectionResult.
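To make the early-stop behaviour concrete, here is a minimal sketch of a chunked probing loop. It is an illustration under stated assumptions rather than the library's code: ChunkedProbeSketch, probe, chunkSize, and the stopAfter predicate are hypothetical names, and handing the whole growing prefix to the predicate at each step is a simplification of however the detector accumulates evidence internally.

```java
import java.util.function.Predicate;

/** Hypothetical sketch of chunk-wise probing with an early exit. */
final class ChunkedProbeSketch {

    /**
     * Walks the content in steps of chunkSize characters and asks stopAfter,
     * after each step, whether the text seen so far is enough.
     * Returns the number of characters examined.
     */
    static int probe(CharSequence content, int chunkSize, Predicate<CharSequence> stopAfter) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int end = 0;
        while (end < content.length()) {
            // Extend the examined prefix by one chunk, clipped to the content length.
            end = Math.min(end + chunkSize, content.length());
            if (stopAfter.test(content.subSequence(0, end))) {
                break; // stopping criteria met before the end of the text
            }
        }
        return end;
    }
}
```

The returned count is smaller than the content length only when the predicate fired early, which is the situation the probing methods are designed for.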
probingPredictLanguages
public ProbingLanguageDetectionResult probingPredictLanguages(CharSequence content, LanguageDetectorConfig config)
This will stop processing early if the stopping criteria specified in the given config are met.
- Parameters:
content - The textual content to process.
config - The LanguageDetectorConfig to customize detection.
- Returns:
A computed ProbingLanguageDetectionResult.
train
public static LanguageDetectorModel train(ObjectStream<LanguageSample> samples, TrainingParameters mlParams, LanguageDetectorFactory factory) throws IOException
Starts a training of a LanguageDetectorModel with the given parameters.
- Parameters:
samples - The ObjectStream of LanguageSample used as input for training.
mlParams - The TrainingParameters for the context of the training.
factory - The LanguageDetectorFactory for creating related objects defined via mlParams.
- Returns:
A valid, trained LanguageDetectorModel instance.
- Throws:
IOException - Thrown if IO errors occurred.