public class TokenizerFactory extends BaseToolFactory
The factory that provides Tokenizer default implementations and
 resources. Users can extend this class if their application requires
 overriding the TokenContextGenerator, Dictionary, etc.

| Constructor and Description | 
|---|
| TokenizerFactory() - Creates a TokenizerFactory that provides the default implementation of the resources. |
| TokenizerFactory(String languageCode, Dictionary abbreviationDictionary, boolean useAlphaNumericOptimization, Pattern alphaNumericPattern) - Creates a TokenizerFactory. |

| Modifier and Type | Method and Description | 
|---|---|
| static TokenizerFactory | create(String subclassName, String languageCode, Dictionary abbreviationDictionary, boolean useAlphaNumericOptimization, Pattern alphaNumericPattern) - Factory method the framework uses to create a new TokenizerFactory. |
| Map<String,Object> | createArtifactMap() - Creates a Map with pairs of keys and objects. |
| Map<String,String> | createManifestEntries() - Creates the manifest entries that will be added to the model manifest. |
| Dictionary | getAbbreviationDictionary() - Gets the abbreviation dictionary. |
| Pattern | getAlphaNumericPattern() - Gets the alpha numeric pattern. |
| TokenContextGenerator | getContextGenerator() - Gets the context generator. |
| String | getLanguageCode() - Retrieves the language code. |
| boolean | isUseAlphaNumericOptmization() - Gets whether to use alphanumeric optimization. |
| void | validateArtifactMap() - Validates the parsed artifacts. |
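
The documentation for validateArtifactMap() advises subclasses to invoke super.validateArtifactMap at the beginning of their override. The following is a minimal sketch of that call order only. Every type in it (SketchBaseFactory, SketchTokenizerFactory, SketchFormatException) and the artifact key are hypothetical stand-ins, not OpenNLP classes, so the snippet runs without the library on the classpath and makes no claim about how BaseToolFactory is implemented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ValidateOrderSketch {
    public static void main(String[] args) {
        SketchTokenizerFactory factory = new SketchTokenizerFactory();
        factory.put(SketchTokenizerFactory.ABBREVIATIONS_KEY, new HashSet<String>());
        try {
            factory.validateArtifactMap();
            System.out.println("valid");
            // Replace the artifact with a value of the wrong type.
            factory.put(SketchTokenizerFactory.ABBREVIATIONS_KEY, "not a set");
            factory.validateArtifactMap();
        } catch (SketchFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for a checked format exception.
class SketchFormatException extends Exception {
    SketchFormatException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a base factory that owns the parsed artifacts.
class SketchBaseFactory {
    protected final Map<String, Object> artifacts = new HashMap<>();

    public void validateArtifactMap() throws SketchFormatException {
        // Base-level check: every registered artifact must be non-null.
        for (Map.Entry<String, Object> entry : artifacts.entrySet()) {
            if (entry.getValue() == null) {
                throw new SketchFormatException("Missing artifact: " + entry.getKey());
            }
        }
    }
}

// Hypothetical subclass that adds its own check after the base check.
class SketchTokenizerFactory extends SketchBaseFactory {
    static final String ABBREVIATIONS_KEY = "abbreviations"; // hypothetical key

    void put(String key, Object value) {
        artifacts.put(key, value);
    }

    @Override
    public void validateArtifactMap() throws SketchFormatException {
        // Run the inherited checks first, as the documentation recommends.
        super.validateArtifactMap();
        Object abbreviations = artifacts.get(ABBREVIATIONS_KEY);
        if (abbreviations != null && !(abbreviations instanceof Set)) {
            throw new SketchFormatException("Abbreviations artifact must be a Set");
        }
    }
}
```

Calling the superclass first means the subclass check can rely on the base-level guarantees (here, that no artifact is null) before it inspects its own entries.
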
Methods inherited from class BaseToolFactory: create, create, createArtifactSerializersMap

public TokenizerFactory()
Creates a TokenizerFactory that provides the default implementation
 of the resources.

public TokenizerFactory(String languageCode, Dictionary abbreviationDictionary, boolean useAlphaNumericOptimization, Pattern alphaNumericPattern)
Creates a TokenizerFactory. Use this constructor to
 programmatically create a factory.
Parameters:
languageCode - the language of the natural text
abbreviationDictionary - an abbreviations dictionary
useAlphaNumericOptimization - if true, alpha numerics are skipped
alphaNumericPattern - null or a custom alphanumeric pattern (default is "^[A-Za-z0-9]+$", provided by Factory.DEFAULT_ALPHANUMERIC)

public void validateArtifactMap()
                         throws InvalidFormatException
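Description copied from class: BaseToolFactory
Validates the parsed artifacts. If something is not valid, subclasses should throw an InvalidFormatException.
 Note:
 Subclasses should generally invoke super.validateArtifactMap at the beginning
 of this method.
Overrides:
validateArtifactMap in class BaseToolFactory
Throws:
InvalidFormatException

public Map<String,Object> createArtifactMap()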
Description copied from class: BaseToolFactory
Creates a Map with pairs of keys and objects. The model
 implementation should call this method from the constructor that creates a model
 programmatically.
 
 The base implementation will return a HashMap that should be
 populated by subclasses.
Overrides:
createArtifactMap in class BaseToolFactory

public Map<String,String> createManifestEntries()
Description copied from class: BaseToolFactory
Creates the manifest entries that will be added to the model manifest.
Overrides:
createManifestEntries in class BaseToolFactory

public static TokenizerFactory create(String subclassName, String languageCode, Dictionary abbreviationDictionary, boolean useAlphaNumericOptimization, Pattern alphaNumericPattern) throws InvalidFormatException
Factory method the framework uses to create a new TokenizerFactory.
Parameters:
subclassName - the name of the class implementing the TokenizerFactory
languageCode - the language code the tokenizer should use
abbreviationDictionary - an optional dictionary containing abbreviations, or null if not present
useAlphaNumericOptimization - indicates whether the alpha numeric optimization should be enabled or disabled
alphaNumericPattern - the pattern the alpha numeric optimization should use
Throws:
InvalidFormatException - if one of the input parameters doesn't comply with the expected format

public Pattern getAlphaNumericPattern()
public boolean isUseAlphaNumericOptmization()
public Dictionary getAbbreviationDictionary()
public String getLanguageCode()
public TokenContextGenerator getContextGenerator()
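
The constructor documentation quotes "^[A-Za-z0-9]+$" as the default alphanumeric pattern. The sketch below uses only java.util.regex to show which whitespace-separated chunks that expression matches in full. The class name, the helper purelyAlphaNumeric, and the sample sentence are illustrative assumptions. So is the reading that "skipped" means matching chunks are not examined for further splitting, which this page does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class AlphaNumericPatternSketch {

    // The default expression quoted in the constructor documentation.
    static final Pattern DEFAULT_ALPHANUMERIC = Pattern.compile("^[A-Za-z0-9]+$");

    // Returns the whitespace-separated chunks that the pattern matches in full.
    // Assumption: these are the chunks the optimization would leave unsplit.
    static List<String> purelyAlphaNumeric(String text, Pattern pattern) {
        List<String> matches = new ArrayList<>();
        for (String chunk : text.trim().split("\\s+")) {
            if (pattern.matcher(chunk).matches()) {
                matches.add(chunk);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // "Mr." and "$20" contain non-alphanumeric characters, so they are not matched.
        System.out.println(purelyAlphaNumeric("Mr. Smith paid $20 in 2017", DEFAULT_ALPHANUMERIC));
        // prints [Smith, paid, in, 2017]
    }
}
```

A custom Pattern passed as alphaNumericPattern could be tried against sample text in the same way before it is handed to the factory.
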
Copyright © 2017 The Apache Software Foundation. All rights reserved.