Package opennlp.tools.util.featuregen
Class CharacterNgramFeatureGenerator
java.lang.Object
opennlp.tools.util.featuregen.CharacterNgramFeatureGenerator
All Implemented Interfaces:
AdaptiveFeatureGenerator
The CharacterNgramFeatureGenerator uses character ngrams to generate features about each token.
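As an illustration of what "character ngrams" of a token means, here is a minimal plain-Java sketch. It does not use or reproduce the OpenNLP implementation: the class name `CharNgramSketch`, the method `charNgrams`, the output ordering, and the choice to skip empty strings are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP.
final class CharNgramSketch {

    // Returns every substring of `token` whose length lies in
    // [minLength, maxLength], shortest lengths first and left to right
    // within each length. Empty strings are skipped (an assumption).
    static List<String> charNgrams(String token, int minLength, int maxLength) {
        if (minLength < 0 || maxLength < 0) {
            throw new IllegalArgumentException("lengths must not be negative");
        }
        List<String> ngrams = new ArrayList<>();
        for (int n = Math.max(1, minLength); n <= maxLength; n++) {
            for (int start = 0; start + n <= token.length(); start++) {
                ngrams.add(token.substring(start, start + n));
            }
        }
        return ngrams;
    }

    public static void main(String[] args) {
        System.out.println(charNgrams("cat", 2, 3)); // prints [ca, at, cat]
    }
}
```

A token shorter than the minimum length yields no ngrams at all in this sketch, which is worth keeping in mind for single-character tokens.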
Constructor Summary
Constructors
Constructor: CharacterNgramFeatureGenerator()
Description: Initializes a CharacterNgramFeatureGenerator with default values for ngrams.
Constructor: CharacterNgramFeatureGenerator(int minLength, int maxLength)
Description: Initializes a CharacterNgramFeatureGenerator with the specified parameters.
Method Summary
Modifier and Type: void
Method: createFeatures(List<String> features, String[] tokens, int index, String[] preds)
Description: Adds the appropriate features for the token at the specified index with the specified array of previousOutcomes to the specified list of features.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface opennlp.tools.util.featuregen.AdaptiveFeatureGenerator
clearAdaptiveData, updateAdaptiveData
Constructor Details
CharacterNgramFeatureGenerator
public CharacterNgramFeatureGenerator(int minLength, int maxLength)
Initializes a CharacterNgramFeatureGenerator with the specified parameters.
Parameters:
minLength - The minimum length to use. Must not be negative.
maxLength - The maximum length to use. Must not be negative.
CharacterNgramFeatureGenerator
public CharacterNgramFeatureGenerator()
Initializes a CharacterNgramFeatureGenerator with default values for ngrams. The minimum length is set to 2 and the maximum length to 5.
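To get a feel for the default range of 2 to 5, the following sketch counts how many ngram occurrences a token of a given length contributes. It assumes every substring occurrence is counted separately (no de-duplication), which may differ from what the library actually emits; `NgramCountSketch` and `ngramCount` are hypothetical names.

```java
// Hypothetical helper, not part of OpenNLP.
final class NgramCountSketch {

    static final int DEFAULT_MIN = 2; // default minimum length stated in the documentation
    static final int DEFAULT_MAX = 5; // default maximum length stated in the documentation

    // Counts substring occurrences of each length n in [minLength, maxLength]:
    // a token with L characters has max(0, L - n + 1) substrings of length n.
    // Length 0 is skipped, so a minLength of 0 behaves like 1 (an assumption).
    static int ngramCount(int tokenLength, int minLength, int maxLength) {
        int count = 0;
        for (int n = Math.max(1, minLength); n <= maxLength; n++) {
            count += Math.max(0, tokenLength - n + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        // A 5-character token: 4 + 3 + 2 + 1 occurrences for n = 2, 3, 4, 5.
        System.out.println(ngramCount(5, DEFAULT_MIN, DEFAULT_MAX)); // prints 10
    }
}
```

Under these assumptions the count grows roughly linearly with token length, at about (maxLength - minLength + 1) features per additional character for long tokens.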
Method Details
createFeatures
public void createFeatures(List<String> features, String[] tokens, int index, String[] preds)
Description copied from interface: AdaptiveFeatureGenerator
Adds the appropriate features for the token at the specified index with the specified array of previousOutcomes to the specified list of features.
Specified by:
createFeatures in interface AdaptiveFeatureGenerator
Parameters:
features - The list of features to be added to.
tokens - The tokens of the sentence or other text unit being processed.
index - The index of the token which is currently being processed.
preds - The outcomes for the tokens prior to the specified index.
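The calling pattern described by these parameters can be sketched without the library as follows. The interface `FeatureGeneratorSketch` is a hypothetical stand-in that only mirrors the documented `createFeatures` signature; the `ng=` feature prefix, the lowercasing, and the duplicate check are assumptions chosen for illustration and are not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the OpenNLP implementation.
final class CreateFeaturesSketch {

    // Stand-in with the same parameter list as the documented method.
    interface FeatureGeneratorSketch {
        void createFeatures(List<String> features, String[] tokens, int index, String[] preds);
    }

    // Builds a generator that appends one feature per distinct character
    // ngram of tokens[index]. The preds array is accepted but not used here.
    static FeatureGeneratorSketch ngramGenerator(int minLength, int maxLength) {
        return (features, tokens, index, preds) -> {
            String token = tokens[index].toLowerCase(Locale.ROOT); // assumed normalization
            for (int n = Math.max(1, minLength); n <= maxLength; n++) {
                for (int start = 0; start + n <= token.length(); start++) {
                    String feature = "ng=" + token.substring(start, start + n); // assumed prefix
                    if (!features.contains(feature)) {
                        features.add(feature);
                    }
                }
            }
        };
    }

    public static void main(String[] args) {
        List<String> features = new ArrayList<>();
        String[] tokens = {"The", "Cat"};
        String[] preds = {"other"}; // outcome for the token before index 1
        ngramGenerator(2, 3).createFeatures(features, tokens, 1, preds);
        System.out.println(features); // prints [ng=ca, ng=at, ng=cat]
    }
}
```

Note that the method appends to the caller's list rather than returning a new one, so features already added by other generators for the same token are preserved.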