Package opennlp.tools.postag
Class POSTaggerME
java.lang.Object
opennlp.tools.postag.POSTaggerME
- All Implemented Interfaces:
- POSTagger
A part-of-speech tagger implementation that uses maximum entropy.

It tries to predict whether each word is a noun, a verb, or any other part of speech (POS), depending on its surrounding context.
- 
Field Summary
Fields (Modifier and Type, Field, Description):
- static final int DEFAULT_BEAM_SIZE - The default beam size value is 3.
- 
Constructor Summary
Constructors (Constructor, Description):
- POSTaggerME(String language) - Initializes a POSTaggerME by downloading a default model for a given language.
- POSTaggerME(String language, POSTagFormat format) - Initializes a POSTaggerME by downloading a default model for a given language.
- POSTaggerME(POSModel model) - Initializes a POSTaggerME with the provided model.
- POSTaggerME(POSModel model, POSTagFormat format) - Initializes a POSTaggerME with the provided model.
- 
Method Summary
Methods (Modifier and Type, Method, Description):
- static Dictionary buildNGramDictionary(ObjectStream<POSSample> samples, int cutoff) - Constructs an nGram dictionary from an ObjectStream of samples.
- String[] getAllPosTags()
- String[] getOrderedTags(List<String> words, List<String> tags, int index)
- String[] getOrderedTags(List<String> words, List<String> tags, int index, double[] tprobs)
- static void populatePOSDictionary(ObjectStream<POSSample> samples, MutableTagDictionary dict, int cutoff) - Populates a POSDictionary from an ObjectStream of samples.
- double[] probs()
- void probs(double[] probs) - Populates the specified probs array with the probabilities for each tag of the last tagged sentence.
- String[][] tag(int numTaggings, String[] sentence) - Returns at most the specified numTaggings for the specified sentence.
- String[] tag(String[] sentence) - Assigns POS tags to the tokens of a sentence.
- String[] tag(String[] sentence, Object[] additionalContext) - Assigns POS tags to the tokens of a sentence.
- Sequence[] topKSequences(String[] sentence) - Finds the top-k tag sequences for the sentence.
- Sequence[] topKSequences(String[] sentence, Object[] additionalContext) - Finds the top-k tag sequences for the sentence.
- static POSModel train(String languageCode, ObjectStream<POSSample> samples, TrainingParameters mlParams, POSTaggerFactory posFactory) - Starts a training of a POSModel with the given parameters.
- 
Field Details
- 
DEFAULT_BEAM_SIZE
public static final int DEFAULT_BEAM_SIZE
The default beam size value is 3.
 
 
- 
- 
Constructor Details
- 
POSTaggerME
Initializes a POSTaggerME by downloading a default model for a given language.
- Parameters:
- language - An ISO-conformant language code.
- Throws:
- IOException - Thrown if the model could not be downloaded or saved.
 
- 
POSTaggerME
Initializes a POSTaggerME by downloading a default model for a given language.
- Parameters:
- language - An ISO-conformant language code.
- format - A valid POSTagFormat.
- Throws:
- IOException - Thrown if the model could not be downloaded or saved.
 
- 
POSTaggerME
Initializes a POSTaggerME with the provided model.
- Parameters:
- model - A valid POSModel.
 
- 
POSTaggerME
Initializes a POSTaggerME with the provided model.
- Parameters:
- model - A valid POSModel.
- format - A valid POSTagFormat.
 
 
- 
- 
Method Details
- 
getAllPosTags
- Returns:
- An array of all possible part-of-speech tags from the tagger.
 
- 
tag
Assigns POS tags to the tokens of a sentence.
- 
tag
Assigns POS tags to the tokens of a sentence.
- 
tag
Returns at most the specified numTaggings for the specified sentence.
- Parameters:
- numTaggings - The number of taggings to be returned.
- sentence - An array of tokens which make up a sentence.
- Returns:
- At most the specified number of taggings for the specified sentence.
 
- 
topKSequences
Finds the top-k tag sequences for the sentence.
- Specified by:
- topKSequences in interface POSTagger
- Parameters:
- sentence - The sentence of tokens to be tagged.
- Returns:
- An array of sequences for each token provided in sentence.
 
- 
topKSequences
Finds the top-k tag sequences for the sentence.
- Specified by:
- topKSequences in interface POSTagger
- Parameters:
- sentence - The sentence of tokens to be tagged.
- additionalContext - The context to provide additional information with.
- Returns:
- An array of sequences for each token provided in sentence.
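
The top-k sequences and the beam size of 3 documented for DEFAULT_BEAM_SIZE suggest a beam search over tag sequences. The following is a minimal, self-contained sketch of that general technique, not the library's implementation: the class BeamSearchSketch, the Hypothesis holder, and the Scorer callback are hypothetical names introduced for illustration, and the scores in main are toy values rather than output of a maximum entropy model.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: none of these types belong to OpenNLP.
class BeamSearchSketch {

    /** A partial tag sequence and its accumulated log-probability. */
    static final class Hypothesis {
        final List<String> tags;
        final double logProb;

        Hypothesis(List<String> tags, double logProb) {
            this.tags = tags;
            this.logProb = logProb;
        }
    }

    /** Hypothetical scorer: log P(tag | token position, previous tag). */
    interface Scorer {
        double logProb(int index, String previousTag, String tag);
    }

    /** Returns at most beamSize complete tag sequences, best first. */
    static List<Hypothesis> search(int length, String[] tagSet, Scorer scorer, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));
        for (int i = 0; i < length; i++) {
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                String prev = h.tags.isEmpty() ? null : h.tags.get(h.tags.size() - 1);
                for (String tag : tagSet) {
                    List<String> tags = new ArrayList<>(h.tags);
                    tags.add(tag);
                    expanded.add(new Hypothesis(tags, h.logProb + scorer.logProb(i, prev, tag)));
                }
            }
            // Keep only the beamSize best partial sequences before moving on.
            expanded.sort(Comparator.comparingDouble((Hypothesis h) -> h.logProb).reversed());
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam;
    }

    public static void main(String[] args) {
        String[] tagSet = {"DET", "NOUN", "VERB"};
        // Toy scores: position 0 favors DET, and NOUN is likely after DET.
        Scorer toy = (index, prev, tag) -> {
            if (index == 0) return Math.log(tag.equals("DET") ? 0.8 : 0.1);
            if ("DET".equals(prev)) return Math.log(tag.equals("NOUN") ? 0.8 : 0.1);
            return Math.log(1.0 / 3.0);
        };
        for (Hypothesis h : search(2, tagSet, toy, 3)) {
            System.out.println(h.tags + " " + Math.exp(h.logProb));
        }
    }
}
```

A larger beam explores more alternatives at each token at a proportional cost in time; a beam of 1 reduces to greedy tagging.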
 
- 
probs
public void probs(double[] probs)
Populates the specified probs array with the probabilities for each tag of the last tagged sentence.
- Parameters:
- probs - An array to put the probabilities into.
 
- 
probs
public double[] probs()
- Returns:
- An array with the probabilities for each tag of the last tagged sentence.
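
Because probs() reports on the last tagged sentence, it is typically read right after a call to tag. The sketch below shows one way to pair the two results and flag low-confidence tokens. To stay self-contained it does not use the real class: the Tagger interface is a hypothetical stand-in that mirrors only the tag(String[]) and probs() signatures documented here, the toy implementation returns fixed values, and the sketch assumes the probability array holds one entry per token.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Tagger is a stand-in, not an OpenNLP type.
class LowConfidenceSketch {

    /** Hypothetical stand-in exposing only the two documented signatures. */
    interface Tagger {
        String[] tag(String[] sentence);
        double[] probs();
    }

    /** Returns "token/TAG" for every token whose tag probability is below threshold. */
    static List<String> lowConfidence(Tagger tagger, String[] sentence, double threshold) {
        String[] tags = tagger.tag(sentence);
        // probs() refers to the last tagged sentence, so read it right after tag().
        double[] probs = tagger.probs();
        List<String> flagged = new ArrayList<>();
        for (int i = 0; i < sentence.length; i++) {
            if (probs[i] < threshold) {
                flagged.add(sentence[i] + "/" + tags[i]);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Toy tagger with fixed answers; a real tagger would compute these.
        Tagger toy = new Tagger() {
            public String[] tag(String[] sentence) {
                return new String[] {"PRON", "VERB", "NOUN"};
            }
            public double[] probs() {
                return new double[] {0.99, 0.55, 0.91};
            }
        };
        String[] sentence = {"She", "saw", "stars"};
        System.out.println(lowConfidence(toy, sentence, 0.6));
    }
}
```

Since the probabilities belong to the most recent tag call, a tagger instance shared between threads could report values for another caller's sentence; using one instance per thread avoids that in this sketch.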
 
- 
getOrderedTags
- 
getOrderedTags
- 
train
public static POSModel train(String languageCode, ObjectStream<POSSample> samples, TrainingParameters mlParams, POSTaggerFactory posFactory) throws IOException
Starts a training of a POSModel with the given parameters.
- Parameters:
- languageCode - The ISO language code to train the model for. Must not be null.
- samples - The ObjectStream of POSSample used as input for training.
- mlParams - The TrainingParameters for the context of the training process.
- posFactory - The POSTaggerFactory for creating related objects as defined via mlParams.
- Returns:
- A valid, trained POSModel instance.
- Throws:
- IOException - Thrown if IO errors occurred.
 
- 
buildNGramDictionary
public static Dictionary buildNGramDictionary(ObjectStream<POSSample> samples, int cutoff) throws IOException
Constructs an nGram dictionary from an ObjectStream of samples.
- Parameters:
- samples - The ObjectStream to process.
- cutoff - A non-negative cut-off value.
- Returns:
- A valid Dictionary instance holding nGrams.
- Throws:
- IOException - Thrown if IO errors occurred during dictionary construction.
 
- 
populatePOSDictionary
public static void populatePOSDictionary(ObjectStream<POSSample> samples, MutableTagDictionary dict, int cutoff) throws IOException
Populates a POSDictionary from an ObjectStream of samples.
- Parameters:
- samples - The ObjectStream to process.
- dict - The MutableTagDictionary to use during population.
- cutoff - A non-negative cut-off value.
- Throws:
- IOException - Thrown if IO errors occurred during dictionary construction.
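
The role of a cut-off when populating a tag dictionary can be illustrated with a small stand-alone sketch. It assumes the cut-off is an inclusive minimum count per word and tag pair; the library's exact rule, and any extra filtering it applies, may differ. The class TagDictionarySketch and its build method are hypothetical and use plain arrays and maps instead of ObjectStream, POSSample, and MutableTagDictionary.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: not the OpenNLP implementation.
class TagDictionarySketch {

    /**
     * Collects, for every word, the tags seen with it at least cutoff times.
     * sentences[s] and tags[s] are parallel arrays describing one sample.
     */
    static Map<String, Set<String>> build(String[][] sentences, String[][] tags, int cutoff) {
        // First pass: count how often each tag occurs with each word.
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (int s = 0; s < sentences.length; s++) {
            for (int i = 0; i < sentences[s].length; i++) {
                counts.computeIfAbsent(sentences[s][i], k -> new TreeMap<>())
                      .merge(tags[s][i], 1, Integer::sum);
            }
        }
        // Second pass: keep only the word and tag pairs that reach the cut-off.
        Map<String, Set<String>> dict = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> word : counts.entrySet()) {
            Set<String> kept = new TreeSet<>();
            for (Map.Entry<String, Integer> tag : word.getValue().entrySet()) {
                if (tag.getValue() >= cutoff) {
                    kept.add(tag.getKey());
                }
            }
            if (!kept.isEmpty()) {
                dict.put(word.getKey(), kept);
            }
        }
        return dict;
    }

    public static void main(String[] args) {
        String[][] sentences = {{"the", "run"}, {"the", "run"}, {"a", "run"}};
        String[][] tags = {{"DET", "NOUN"}, {"DET", "VERB"}, {"DET", "NOUN"}};
        System.out.println(build(sentences, tags, 2));
    }
}
```

Under this assumption, a higher cut-off drops rare and possibly noisy word and tag pairs, while a cut-off of 0 keeps every observed pair.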
 
 
-