opennlp.tools.postag
Class POSTaggerME

java.lang.Object
  extended by opennlp.tools.postag.POSTaggerME
All Implemented Interfaces:
POSTagger

public class POSTaggerME
extends Object
implements POSTagger

A part-of-speech tagger that uses maximum entropy. It predicts whether a word is a noun, a verb, or one of 70 other POS tags, based on the word's surrounding context.


Field Summary
static int DEFAULT_BEAM_SIZE
          The default beam size, 3.
 
Constructor Summary
POSTaggerME(opennlp.model.AbstractModel model, Dictionary dict)
          Deprecated. 
POSTaggerME(opennlp.model.AbstractModel model, Dictionary dict, TagDictionary tagdict)
          Deprecated. 
POSTaggerME(opennlp.model.AbstractModel model, POSContextGenerator cg)
          Deprecated. 
POSTaggerME(opennlp.model.AbstractModel model, POSContextGenerator cg, TagDictionary tagdict)
          Deprecated. 
POSTaggerME(opennlp.model.AbstractModel model, TagDictionary tagdict)
          Deprecated. 
POSTaggerME(int beamSize, opennlp.model.AbstractModel model, POSContextGenerator cg, TagDictionary tagdict)
          Deprecated. 
POSTaggerME(POSModel model)
          Initializes the current instance with the provided model and the default beam size of 3.
POSTaggerME(POSModel model, int beamSize, int cacheSize)
          Initializes the current instance with the provided model, beam size, and cache size.
POSTaggerME(POSModel model, int beamSize, int cacheSize, SequenceValidator<String> sequenceValidator)
          Deprecated. Use POSTaggerME(POSModel, int, int) instead. The model knows which SequenceValidator to use.
 
Method Summary
static Dictionary buildNGramDictionary(ObjectStream<POSSample> samples, int cutoff)
           
 int getNumTags()
          Returns the number of different tags predicted by this model.
 String[] getOrderedTags(List<String> words, List<String> tags, int index)
           
 String[] getOrderedTags(List<String> words, List<String> tags, int index, double[] tprobs)
           
static void populatePOSDictionary(ObjectStream<POSSample> samples, MutableTagDictionary dict, int cutoff)
           
 double[] probs()
          Returns an array with the probabilities for each tag of the last tagged sentence.
 void probs(double[] probs)
          Populates the specified array with the probabilities for each tag of the last tagged sentence.
 String[][] tag(int numTaggings, String[] sentence)
          Returns at most the specified number of taggings for the specified sentence.
 List<String> tag(List<String> sentence)
          Deprecated. 
 String tag(String sentence)
          Deprecated. 
 String[] tag(String[] sentence)
          Assigns POS tags to the tokens of the sentence.
 String[] tag(String[] sentence, Object[] additionaContext)
           
 Sequence[] topKSequences(List<String> sentence)
          Deprecated. 
 Sequence[] topKSequences(String[] sentence)
           
 Sequence[] topKSequences(String[] sentence, Object[] additionaContext)
           
static POSModel train(String languageCode, ObjectStream<POSSample> samples, ModelType modelType, POSDictionary tagDictionary, Dictionary ngramDictionary, int cutoff, int iterations)
          Deprecated. Use train(String, ObjectStream, TrainingParameters, POSTaggerFactory) instead, passing in a POSTaggerFactory and a TrainingParameters.
static POSModel train(String languageCode, ObjectStream<POSSample> samples, TrainingParameters trainParams, POSDictionary tagDictionary, Dictionary ngramDictionary)
          Deprecated. Use train(String, ObjectStream, TrainingParameters, POSTaggerFactory) instead, passing in a POSTaggerFactory.
static POSModel train(String languageCode, ObjectStream<POSSample> samples, TrainingParameters trainParams, POSTaggerFactory posFactory)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_BEAM_SIZE

public static final int DEFAULT_BEAM_SIZE
See Also:
Constant Field Values
Constructor Detail

POSTaggerME

public POSTaggerME(POSModel model,
                   int beamSize,
                   int cacheSize,
                   SequenceValidator<String> sequenceValidator)
Deprecated. Use POSTaggerME(POSModel, int, int) instead. The model knows which SequenceValidator to use.

Constructor that overrides the SequenceValidator from the model.


POSTaggerME

public POSTaggerME(POSModel model,
                   int beamSize,
                   int cacheSize)
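Initializes the current instance with the provided model, beam size, and cache size.

Parameters:
model - The model used for tagging.
beamSize - The number of alternate taggings considered when tagging.
cacheSize - The size of the cache used during tagging.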
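
The beam size bounds how many alternate partial taggings are kept while a sentence is decoded. The following is a minimal, self-contained sketch of that idea, not OpenNLP's implementation: the class BeamSearchSketch, the two-tag set, and the toy probabilities are all hypothetical and chosen only to show that a beam of 1 (greedy decoding) can miss a sequence that a wider beam finds.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy beam search over tag sequences; all names and numbers are illustrative assumptions. */
final class BeamSearchSketch {

    /** A partial tag sequence together with its accumulated log probability. */
    static final class Hypothesis {
        final List<String> tags;
        final double logProb;

        Hypothesis(List<String> tags, double logProb) {
            this.tags = tags;
            this.logProb = logProb;
        }
    }

    static final String[] TAGS = {"N", "V"};

    /** Toy P(tag | previous tag); a real tagger would query its trained model here. */
    static double prob(String prev, String tag) {
        if (prev == null) return tag.equals("N") ? 0.6 : 0.4;
        if (prev.equals("N")) return tag.equals("N") ? 0.55 : 0.45;
        return tag.equals("N") ? 0.9 : 0.1;
    }

    /** Returns the best tag sequence of the given length found with the given beam size. */
    static List<String> decode(int length, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));
        for (int i = 0; i < length; i++) {
            // Extend every surviving hypothesis with every possible tag.
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                String prev = h.tags.isEmpty() ? null : h.tags.get(h.tags.size() - 1);
                for (String tag : TAGS) {
                    List<String> next = new ArrayList<>(h.tags);
                    next.add(tag);
                    expanded.add(new Hypothesis(next, h.logProb + Math.log(prob(prev, tag))));
                }
            }
            // Keep only the beamSize most probable hypotheses.
            expanded.sort(Comparator.comparingDouble((Hypothesis h) -> -h.logProb));
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam.get(0).tags;
    }
}
```

With these toy numbers, decode(2, 1) commits to "N" at the first position and returns [N, N] (probability 0.33), while decode(2, 2) also keeps "V" alive and returns [V, N] (probability 0.36). A larger beam explores more alternatives at a higher cost per token.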

POSTaggerME

public POSTaggerME(POSModel model)
Initializes the current instance with the provided model and the default beam size of 3.

Parameters:
model - The model used for tagging.

POSTaggerME

@Deprecated
public POSTaggerME(opennlp.model.AbstractModel model,
                   TagDictionary tagdict)
Deprecated. 

Creates a new tagger with the specified model and tag dictionary.

Parameters:
model - The model used for tagging.
tagdict - The tag dictionary used for specifying a set of valid tags.

POSTaggerME

@Deprecated
public POSTaggerME(opennlp.model.AbstractModel model,
                   Dictionary dict)
Deprecated. 

Creates a new tagger with the specified model and n-gram dictionary.

Parameters:
model - The model used for tagging.
dict - The n-gram dictionary used for feature generation.

POSTaggerME

@Deprecated
public POSTaggerME(opennlp.model.AbstractModel model,
                   Dictionary dict,
                   TagDictionary tagdict)
Deprecated. 

Creates a new tagger with the specified model, n-gram dictionary, and tag dictionary.

Parameters:
model - The model used for tagging.
dict - The n-gram dictionary used for feature generation.
tagdict - The dictionary which specifies the valid set of tags for some words.

POSTaggerME

@Deprecated
public POSTaggerME(opennlp.model.AbstractModel model,
                   POSContextGenerator cg)
Deprecated. 

Creates a new tagger with the specified model and context generator.

Parameters:
model - The model used for tagging.
cg - The context generator used for feature creation.

POSTaggerME

@Deprecated
public POSTaggerME(opennlp.model.AbstractModel model,
                   POSContextGenerator cg,
                   TagDictionary tagdict)
Deprecated. 

Creates a new tagger with the specified model, context generator, and tag dictionary.

Parameters:
model - The model used for tagging.
cg - The context generator used for feature creation.
tagdict - The dictionary which specifies the valid set of tags for some words.

POSTaggerME

@Deprecated
public POSTaggerME(int beamSize,
                   opennlp.model.AbstractModel model,
                   POSContextGenerator cg,
                   TagDictionary tagdict)
Deprecated. 

Creates a new tagger with the specified beam size, model, context generator, and tag dictionary.

Parameters:
beamSize - The number of alternate taggings considered when tagging.
model - The model used for tagging.
cg - The context generator used for feature creation.
tagdict - The dictionary which specifies the valid set of tags for some words.
Method Detail

getNumTags

public int getNumTags()
Returns the number of different tags predicted by this model.

Returns:
the number of different tags predicted by this model.

tag

@Deprecated
public List<String> tag(List<String> sentence)
Deprecated. 

Description copied from interface: POSTagger
Assigns POS tags to the tokens of the sentence.

Specified by:
tag in interface POSTagger
Parameters:
sentence - The sentence of tokens to be tagged.
Returns:
a list of POS tags, one for each token in the sentence.

tag

public String[] tag(String[] sentence)
Description copied from interface: POSTagger
Assigns POS tags to the tokens of the sentence.

Specified by:
tag in interface POSTagger
Parameters:
sentence - The sentence of tokens to be tagged.
Returns:
an array of POS tags, one for each token in the sentence.

tag

public String[] tag(String[] sentence,
                    Object[] additionaContext)
Specified by:
tag in interface POSTagger

tag

public String[][] tag(int numTaggings,
                      String[] sentence)
Returns at most the specified number of taggings for the specified sentence.

Parameters:
numTaggings - The maximum number of taggings to be returned.
sentence - An array of tokens which make up a sentence.
Returns:
At most the specified number of taggings for the specified sentence.

topKSequences

@Deprecated
public Sequence[] topKSequences(List<String> sentence)
Deprecated. 

Specified by:
topKSequences in interface POSTagger

topKSequences

public Sequence[] topKSequences(String[] sentence)
Specified by:
topKSequences in interface POSTagger

topKSequences

public Sequence[] topKSequences(String[] sentence,
                                Object[] additionaContext)
Specified by:
topKSequences in interface POSTagger

probs

public void probs(double[] probs)
Populates the specified array with the probabilities for each tag of the last tagged sentence.

Parameters:
probs - An array to put the probabilities into.

probs

public double[] probs()
Returns an array with the probabilities for each tag of the last tagged sentence.

Returns:
an array with the probabilities for each tag of the last tagged sentence.
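
Because the probabilities refer to the last tagged sentence, they are typically read right after a tagging call and paired with that call's tokens and tags. The sketch below shows one way to do that pairing. It assumes the probability array holds one entry per token, in token order; the class TagConfidenceSketch and its methods are hypothetical helpers and not part of OpenNLP.

```java
import java.util.Locale;

/** Helpers for pairing tokens, tags, and per-tag probabilities; names are illustrative assumptions. */
final class TagConfidenceSketch {

    /** Returns the index of the token whose tag has the lowest probability, or -1 for an empty array. */
    static int leastConfident(double[] probs) {
        int worst = -1;
        for (int i = 0; i < probs.length; i++) {
            if (worst < 0 || probs[i] < probs[worst]) {
                worst = i;
            }
        }
        return worst;
    }

    /** Formats each token as token/TAG(probability), separated by single spaces. */
    static String format(String[] tokens, String[] tags, double[] probs) {
        if (tokens.length != tags.length || tokens.length != probs.length) {
            throw new IllegalArgumentException("tokens, tags, and probs must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "%s/%s(%.2f)", tokens[i], tags[i], probs[i]));
        }
        return sb.toString();
    }
}
```

In real use, the three arrays would come from the sentence passed to tag(String[]), the array it returns, and a subsequent call to probs() on the same tagger instance, made before any other sentence is tagged.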

tag

@Deprecated
public String tag(String sentence)
Deprecated. 

Description copied from interface: POSTagger
Assigns POS tags to a sentence of space-delimited tokens.

Specified by:
tag in interface POSTagger
Parameters:
sentence - The sentence of space-delimited tokens to be tagged.
Returns:
a string of space-delimited POS tags, one for each token in the sentence.
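
Since this string-based method is deprecated, callers may prefer to split the sentence themselves and use the array-based tag method. The sketch below follows only the contract documented above (space-delimited tokens in, space-delimited tags out); the exact output of the deprecated method should be checked before relying on an equivalent. The class StringTaggingSketch and the toy tagger are hypothetical; the UnaryOperator stands in for a call to tag(String[]).

```java
import java.util.function.UnaryOperator;

/** Wraps an array-based tagger behind a string-in, string-out helper; names are illustrative assumptions. */
final class StringTaggingSketch {

    /** Splits the sentence on whitespace, tags the tokens, and joins the tags with single spaces. */
    static String tagLine(String sentence, UnaryOperator<String[]> arrayTagger) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] tokens = trimmed.split("\\s+");
        String[] tags = arrayTagger.apply(tokens);
        return String.join(" ", tags);
    }

    /** Hypothetical stand-in tagger: capitalized tokens get NNP, everything else NN. */
    static String[] toyTagger(String[] tokens) {
        String[] tags = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            tags[i] = Character.isUpperCase(tokens[i].charAt(0)) ? "NNP" : "NN";
        }
        return tags;
    }
}
```

With a real tagger instance, the second argument could be a lambda such as tokens -> tagger.tag(tokens), where tagger is a POSTaggerME created from a loaded model.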

getOrderedTags

public String[] getOrderedTags(List<String> words,
                               List<String> tags,
                               int index)

getOrderedTags

public String[] getOrderedTags(List<String> words,
                               List<String> tags,
                               int index,
                               double[] tprobs)

train

public static POSModel train(String languageCode,
                             ObjectStream<POSSample> samples,
                             TrainingParameters trainParams,
                             POSTaggerFactory posFactory)
                      throws IOException
Throws:
IOException

train

public static POSModel train(String languageCode,
                             ObjectStream<POSSample> samples,
                             TrainingParameters trainParams,
                             POSDictionary tagDictionary,
                             Dictionary ngramDictionary)
                      throws IOException
Deprecated. Use train(String, ObjectStream, TrainingParameters, POSTaggerFactory) instead, passing in a POSTaggerFactory.

Throws:
IOException

train

@Deprecated
public static POSModel train(String languageCode,
                             ObjectStream<POSSample> samples,
                             ModelType modelType,
                             POSDictionary tagDictionary,
                             Dictionary ngramDictionary,
                             int cutoff,
                             int iterations)
                      throws IOException
Deprecated. Use train(String, ObjectStream, TrainingParameters, POSTaggerFactory) instead, passing in a POSTaggerFactory and a TrainingParameters.

Throws:
IOException

buildNGramDictionary

public static Dictionary buildNGramDictionary(ObjectStream<POSSample> samples,
                                              int cutoff)
                                       throws IOException
Throws:
IOException

populatePOSDictionary

public static void populatePOSDictionary(ObjectStream<POSSample> samples,
                                         MutableTagDictionary dict,
                                         int cutoff)
                                  throws IOException
Throws:
IOException


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.