public final class TokenizerModel extends BaseModel
The TokenizerModel is the model used by a learnable Tokenizer.
See Also: TokenizerME
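As an illustration of how such a model is typically consumed, the sketch below loads a model through the InputStream constructor and hands it to TokenizerME. It assumes OpenNLP is on the classpath. The class name TokenizerModelLoadingSketch, the helper loadModel, and the model path passed as args[0] are hypothetical and not part of the API.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

final class TokenizerModelLoadingSketch {

    /** Loads a tokenizer model from a file chosen by the caller (hypothetical helper). */
    static TokenizerModel loadModel(File modelFile) throws IOException {
        // InvalidFormatException is a subclass of IOException,
        // so a single throws clause covers both declared exceptions.
        try (InputStream in = new FileInputStream(modelFile)) {
            return new TokenizerModel(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is an assumed path to a trained tokenizer model file.
        TokenizerModel model = loadModel(new File(args[0]));
        TokenizerME tokenizer = new TokenizerME(model);
        for (String token : tokenizer.tokenize("An example sentence, to be tokenized.")) {
            System.out.println(token);
        }
    }
}
```

The File and URL constructors could be used in the same way. The stream variant is shown here because it leaves opening and closing the resource to the caller.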
Fields inherited from class BaseModel: TRAINING_CUTOFF_PROPERTY, TRAINING_EVENTHASH_PROPERTY, TRAINING_ITERATIONS_PROPERTY
Constructor and Description |
---|
TokenizerModel(File modelFile): Initializes the current instance. |
TokenizerModel(InputStream in): Initializes the current instance. |
TokenizerModel(MaxentModel tokenizerModel, Map<String,String> manifestInfoEntries, TokenizerFactory tokenizerFactory): Initializes the current instance. |
TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
TokenizerModel(String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
TokenizerModel(URL modelURL): Initializes the current instance. |
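The deprecation notes above point to the three-argument constructor. A minimal sketch of that migration follows. It assumes the four-argument TokenizerFactory constructor (language code, abbreviation dictionary, alphanumeric flag, alphanumeric pattern) and the opennlp.tools.ml.model package location of MaxentModel as found in OpenNLP 1.6; other releases may differ. The class name TokenizerModelMigrationSketch and the method build are hypothetical.

```java
import java.util.Map;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.ml.model.MaxentModel;
import opennlp.tools.tokenize.TokenizerFactory;
import opennlp.tools.tokenize.TokenizerModel;

final class TokenizerModelMigrationSketch {

    /**
     * Builds a model through the non-deprecated constructor from the same
     * inputs the deprecated five-argument constructor accepted.
     */
    static TokenizerModel build(String language,
                                MaxentModel maxentModel,
                                Dictionary abbreviations,
                                boolean useAlphaNumericOptimization,
                                Map<String, String> manifestInfoEntries) {
        // Assumption: a null pattern lets the factory fall back to its default
        // alphanumeric pattern.
        TokenizerFactory factory = new TokenizerFactory(
                language, abbreviations, useAlphaNumericOptimization, null);
        return new TokenizerModel(maxentModel, manifestInfoEntries, factory);
    }
}
```

The language, abbreviation dictionary, and optimization flag move into the factory, so the model constructor only needs the statistical model and the manifest entries.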
Modifier and Type | Method and Description |
---|---|
Dictionary | getAbbreviations() |
TokenizerFactory | getFactory() |
MaxentModel | getMaxentModel() |
static void | main(String[] args) |
boolean | useAlphaNumericOptimization() |
Methods inherited from class BaseModel: getArtifact, getLanguage, getManifestProperty, getVersion, isLoadedFromSerialized, serialize
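The accessors listed above, together with the inherited getLanguage and serialize, can be combined to inspect or copy a loaded model. The following is a sketch only: the class name TokenizerModelInspectionSketch and the helpers describe and roundTrip are hypothetical, and the null checks reflect an assumption that a model may carry no abbreviation dictionary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import opennlp.tools.tokenize.TokenizerFactory;
import opennlp.tools.tokenize.TokenizerModel;

final class TokenizerModelInspectionSketch {

    /** Formats a few properties of an already loaded model. */
    static String describe(TokenizerModel model) {
        TokenizerFactory factory = model.getFactory();
        return "language=" + model.getLanguage()
                + ", alphaNumericOptimization=" + model.useAlphaNumericOptimization()
                + ", hasAbbreviations=" + (model.getAbbreviations() != null)
                + ", factory=" + (factory == null ? "none" : factory.getClass().getName());
    }

    /** Serializes the model to memory and reads it back through the InputStream constructor. */
    static TokenizerModel roundTrip(TokenizerModel model) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        model.serialize(out);
        return new TokenizerModel(new ByteArrayInputStream(out.toByteArray()));
    }
}
```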
public TokenizerModel(MaxentModel tokenizerModel, Map<String,String> manifestInfoEntries, TokenizerFactory tokenizerFactory)
Parameters:
tokenizerModel - the model
manifestInfoEntries - the manifest
tokenizerFactory - the factory

public TokenizerModel(String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
abbreviations - the dictionary containing the abbreviations
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata to be written into the manifest

public TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata to be written into the manifest

public TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not

public TokenizerModel(InputStream in) throws IOException, InvalidFormatException
Parameters:
in - the input stream to load the model from
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerModel(File modelFile) throws IOException, InvalidFormatException
Parameters:
modelFile - the file containing the tokenizer model
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerModel(URL modelURL) throws IOException, InvalidFormatException
Parameters:
modelURL - the URL pointing to the tokenizer model
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerFactory getFactory()
public MaxentModel getMaxentModel()
public Dictionary getAbbreviations()
public boolean useAlphaNumericOptimization()
public static void main(String[] args) throws IOException
Throws:
IOException
Copyright © 2015 The Apache Software Foundation. All rights reserved.