Package opennlp.tools.util.featuregen
This package contains classes for generating sequence features.
Class descriptions

An interface for generating features for named entity identification and for updating document-level contexts.
The AdditionalContextFeatureGenerator generates the context from the passed-in additional context.
The AggregatedFeatureGenerator aggregates a set of AdaptiveFeatureGenerators and calls them to generate the features.
Generates Brown cluster features for token bigrams.
Class to load a Brown cluster document in the format word\tword_class\tprob.
Generates Brown clustering features for token bigrams.
Generates Brown clustering features for token classes.
Generates Brown clustering features for the current token.
Obtains the paths listed in the pathLengths array from the Brown class.
Generates BrownCluster features for the current token and token class.
Generates BrownCluster features for the current token.
Caches features of the aggregated generators.
The CharacterNgramFeatureGenerator uses character n-grams to generate features about each token.
The DictionaryFeatureGenerator uses the DictionaryNameFinder to generate features for detected names based on the InSpanGenerator.
The FeatureGeneratorResourceProvider provides access to the resources available in the model.
This class provides common utilities for feature generation.
Creates a set of feature generators based on a provided XML descriptor.
Generates features if the tokens are recognized by the provided TokenNameFinder.
The definition feature maps the underlying distribution of outcomes.
Adds the token POS tag as a feature.
This AdaptiveFeatureGenerator generates features indicating the outcome associated with a previously occurring word.
This AdaptiveFeatureGenerator generates features indicating the outcome associated with two previously occurring words.
This feature generator creates sentence begin and end features.
Recognizes predefined patterns in strings.
Generates features for the class of the token.
Generates a feature which contains the token itself.
Partitions tokens into sub-tokens based on character classes and generates class features for each of the sub-tokens and combinations of those sub-tokens.
Adds trigram features based on tokens and token classes.
Generates previous and next features for a given AdaptiveFeatureGenerator.
Defines a word cluster generator factory; it reads an element containing 'w2vwordcluster' as a tag name. These clusters are typically produced by word2vec or Clark POS induction systems.
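
Several of the descriptions above refer to generators that wrap or combine other generators (aggregation, previous/next windows, character n-grams). The following is a minimal, self-contained sketch of that composition pattern. It does not use the OpenNLP classes: `SketchFeatureGenerator`, `TokenGenerator`, `CharNgramGenerator`, `WindowGenerator`, `AggregatedGenerator`, and the feature prefixes (`w=`, `ng=`, `p1`, `n1`) are all hypothetical names chosen for illustration, and the real library's signatures and feature strings may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FeatureGenSketch {

    /** Hypothetical stand-in for a feature generator; not the OpenNLP interface. */
    interface SketchFeatureGenerator {
        void createFeatures(List<String> features, String[] tokens, int index);
    }

    /** Emits a single feature containing the lowercased token itself. */
    static class TokenGenerator implements SketchFeatureGenerator {
        public void createFeatures(List<String> features, String[] tokens, int index) {
            features.add("w=" + tokens[index].toLowerCase(Locale.ROOT));
        }
    }

    /** Emits one feature per character n-gram of the current token. */
    static class CharNgramGenerator implements SketchFeatureGenerator {
        private final int minLength;
        private final int maxLength;

        CharNgramGenerator(int minLength, int maxLength) {
            this.minLength = minLength;
            this.maxLength = maxLength;
        }

        public void createFeatures(List<String> features, String[] tokens, int index) {
            String token = tokens[index].toLowerCase(Locale.ROOT);
            for (int n = minLength; n <= maxLength; n++) {
                for (int start = 0; start + n <= token.length(); start++) {
                    features.add("ng=" + token.substring(start, start + n));
                }
            }
        }
    }

    /** Runs the wrapped generator on the current token and on its neighbours. */
    static class WindowGenerator implements SketchFeatureGenerator {
        private final SketchFeatureGenerator inner;
        private final int prevWindow;
        private final int nextWindow;

        WindowGenerator(SketchFeatureGenerator inner, int prevWindow, int nextWindow) {
            this.inner = inner;
            this.prevWindow = prevWindow;
            this.nextWindow = nextWindow;
        }

        public void createFeatures(List<String> features, String[] tokens, int index) {
            inner.createFeatures(features, tokens, index);
            for (int i = 1; i <= prevWindow; i++) {
                if (index - i >= 0) {
                    addPrefixed(features, tokens, index - i, "p" + i);
                }
            }
            for (int i = 1; i <= nextWindow; i++) {
                if (index + i < tokens.length) {
                    addPrefixed(features, tokens, index + i, "n" + i);
                }
            }
        }

        // Collects the inner generator's features at a neighbour position
        // and marks them with the relative offset.
        private void addPrefixed(List<String> features, String[] tokens, int pos, String prefix) {
            List<String> neighbour = new ArrayList<>();
            inner.createFeatures(neighbour, tokens, pos);
            for (String f : neighbour) {
                features.add(prefix + f);
            }
        }
    }

    /** Calls each wrapped generator in turn and collects all features. */
    static class AggregatedGenerator implements SketchFeatureGenerator {
        private final SketchFeatureGenerator[] generators;

        AggregatedGenerator(SketchFeatureGenerator... generators) {
            this.generators = generators;
        }

        public void createFeatures(List<String> features, String[] tokens, int index) {
            for (SketchFeatureGenerator g : generators) {
                g.createFeatures(features, tokens, index);
            }
        }
    }

    /** Builds a small pipeline: windowed token features plus 2- and 3-grams. */
    public static List<String> featuresFor(String[] tokens, int index) {
        SketchFeatureGenerator gen = new AggregatedGenerator(
                new WindowGenerator(new TokenGenerator(), 1, 1),
                new CharNgramGenerator(2, 3));
        List<String> features = new ArrayList<>();
        gen.createFeatures(features, tokens, index);
        return features;
    }

    public static void main(String[] args) {
        String[] tokens = {"Visit", "Oslo", "today"};
        // prints [w=oslo, p1w=visit, n1w=today, ng=os, ng=sl, ng=lo, ng=osl, ng=slo]
        System.out.println(featuresFor(tokens, 1));
    }
}
```

In this sketch the window wrapper only prefixes whatever its inner generator produces, so any generator can be given previous/next context without changes to its own code.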