LemmatizerME (Apache OpenNLP Tools 1.8.0 API)

java.lang.Object
- opennlp.tools.lemmatizer.LemmatizerME

All Implemented Interfaces:

Lemmatizer
```
public class LemmatizerME
extends Object
implements Lemmatizer
```
A probabilistic lemmatizer. It tries to predict the induced permutation class for each word from its surrounding context. Based on: Grzegorz Chrupała. 2008. Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. PhD dissertation, Dublin City University. http://grzegorz.chrupala.me/papers/phd-single.pdf

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEFAULT_BEAM_SIZE`
`static int`	`LEMMA_NUMBER`

Constructor Summary

Constructors
Constructor and Description
`LemmatizerME(LemmatizerModel model)` Initializes the current instance with the provided model and the default beam size of 3.

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`static String[]`	`decodeLemmas(String[] toks, String[] preds)` Decodes the lemma from the word and the induced lemma class.
`static String[]`	`encodeLemmas(String[] toks, String[] lemmas)`
`List<List<String>>`	`lemmatize(List<String> toks, List<String> tags)` Generates the lemmas for each word and postag, returning a list of every possible lemma for each token and postag pair.
`String[]`	`lemmatize(String[] toks, String[] tags)` Generates lemmas for the word and postag returning the result in an array.
`String[][]`	`predictLemmas(int numLemmas, String[] toks, String[] tags)` Predict all possible lemmas (using a default upper bound).
`String[]`	`predictSES(String[] toks, String[] tags)` Predict Short Edit Script (automatically induced lemma class).
`double[]`	`probs()` Returns an array with the probabilities of the last decoded sequence.
`void`	`probs(double[] probs)` Populates the specified array with the probabilities of the last decoded sequence.
`Sequence[]`	`topKLemmaClasses(String[] sentence, String[] tags)`
`Sequence[]`	`topKLemmaClasses(String[] sentence, String[] tags, double minSequenceScore)`
`Sequence[]`	`topKSequences(String[] sentence, String[] tags)`
`Sequence[]`	`topKSequences(String[] sentence, String[] tags, double minSequenceScore)`
`static LemmatizerModel`	`train(String languageCode, ObjectStream<LemmaSample> samples, TrainingParameters trainParams, LemmatizerFactory posFactory)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LEMMA_NUMBER
```
public static final int LEMMA_NUMBER
```
    See Also:
    
    Constant Field Values
  - DEFAULT_BEAM_SIZE
```
public static final int DEFAULT_BEAM_SIZE
```
    See Also:
    
    Constant Field Values
- Constructor Detail
  - LemmatizerME
```
public LemmatizerME(LemmatizerModel model)
```
    Initializes the current instance with the provided model and the default beam size of 3.
    
    Parameters:
    
    model - the model
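
    The beam size limits how many partial label sequences are kept alive while a sentence is decoded. The following is a minimal sketch of a generic beam search, assuming per-position log-probabilities that do not depend on earlier labels; it is not OpenNLP's implementation, and `BeamSearchSketch`, `Hypothesis` and `search` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of beam search; NOT taken from OpenNLP.
class BeamSearchSketch {

    /** A partial label sequence and its accumulated log-probability. */
    static final class Hypothesis {
        final List<Integer> labels;
        final double score;

        Hypothesis(List<Integer> labels, double score) {
            this.labels = labels;
            this.score = score;
        }
    }

    /**
     * Returns up to beamSize best label sequences, best first.
     * logProbs[i][k] is the (assumed) log-probability of label k at position i.
     */
    static List<Hypothesis> search(double[][] logProbs, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));

        for (double[] position : logProbs) {
            // Extend every surviving hypothesis with every label.
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (int label = 0; label < position.length; label++) {
                    List<Integer> labels = new ArrayList<>(h.labels);
                    labels.add(label);
                    expanded.add(new Hypothesis(labels, h.score + position[label]));
                }
            }
            // Keep only the beamSize highest-scoring extensions.
            expanded.sort(Comparator.comparingDouble((Hypothesis h) -> h.score).reversed());
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam;
    }
}
```

    With a beam size of 3, at most three candidate sequences survive each step, so the cost grows linearly with sentence length instead of exponentially, at the price of possibly pruning the globally best sequence.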
- Method Detail
  - lemmatize
```
public String[] lemmatize(String[] toks,
                          String[] tags)
```
    Description copied from interface: Lemmatizer
    
    Generates lemmas for the word and postag returning the result in an array.
    
    Specified by:
    
    lemmatize in interface Lemmatizer
    
    Parameters:
    
    toks - an array of the tokens
    
    tags - an array of the pos tags
    
    Returns:
    
    an array of possible lemmas for each token in the sequence.
  - lemmatize
```
public List<List<String>> lemmatize(List<String> toks,
                                    List<String> tags)
```
    Description copied from interface: Lemmatizer
    
    Generates the lemmas for each word and postag, returning a list of every possible lemma for each token and postag pair.
    
    Specified by:
    
    lemmatize in interface Lemmatizer
    
    Parameters:
    
    toks - a list of the tokens
    
    tags - a list of the pos tags
    
    Returns:
    
    a list of every possible lemma for each token in the sequence.
  - predictSES
```
public String[] predictSES(String[] toks,
                           String[] tags)
```
    Predict Short Edit Script (automatically induced lemma class).
    
    Parameters:
    
    toks - the array of tokens
    
    tags - the array of pos tags
    
    Returns:
    
    an array containing the lemma classes
  - predictLemmas
```
public String[][] predictLemmas(int numLemmas,
                                String[] toks,
                                String[] tags)
```
    Predict all possible lemmas (using a default upper bound).
    
    Parameters:
    
    numLemmas - the default number of lemmas
    
    toks - the tokens
    
    tags - the postags
    
    Returns:
    
    a two-dimensional array containing all possible lemmas for each token and postag pair
  - decodeLemmas
```
public static String[] decodeLemmas(String[] toks,
                                    String[] preds)
```
    Decodes the lemma from the word and the induced lemma class.
    
    Parameters:
    
    toks - the array of tokens
    
    preds - the predicted lemma classes
    
    Returns:
    
    the array of decoded lemmas
  - encodeLemmas
```
public static String[] encodeLemmas(String[] toks,
                                    String[] lemmas)
```
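
    The lemma classes consumed by decodeLemmas are short edit scripts that turn a token into its lemma, and encodeLemmas presumably performs the inverse mapping. The following is a minimal sketch of that round trip, assuming a deliberately simplified class format ("strip N trailing characters, then append a suffix"); it is not the encoding OpenNLP actually uses, and `SuffixLemmaClassSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical illustration only: NOT the edit-script format used by OpenNLP.
class SuffixLemmaClassSketch {

    /** Encodes token -> lemma as "N:suffix": drop N trailing chars, then append suffix. */
    static String encode(String token, String lemma) {
        int common = 0;
        int max = Math.min(token.length(), lemma.length());
        while (common < max && token.charAt(common) == lemma.charAt(common)) {
            common++;
        }
        return (token.length() - common) + ":" + lemma.substring(common);
    }

    /** Applies an "N:suffix" class to a token to recover a lemma. */
    static String decode(String token, String lemmaClass) {
        int sep = lemmaClass.indexOf(':');
        int strip = Math.min(Integer.parseInt(lemmaClass.substring(0, sep)), token.length());
        return token.substring(0, token.length() - strip) + lemmaClass.substring(sep + 1);
    }

    /** Array form, mirroring the parallel-array shape of the documented methods. */
    static String[] encodeAll(String[] toks, String[] lemmas) {
        String[] classes = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            classes[i] = encode(toks[i], lemmas[i]);
        }
        return classes;
    }

    static String[] decodeAll(String[] toks, String[] classes) {
        String[] lemmas = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            lemmas[i] = decode(toks[i], classes[i]);
        }
        return lemmas;
    }

    public static void main(String[] args) {
        String[] classes = encodeAll(new String[] {"walking", "cities"},
                                     new String[] {"walk", "city"});
        System.out.println(Arrays.toString(classes)); // prints [3:, 3:y]
        // The same classes applied to different tokens:
        System.out.println(Arrays.toString(
            decodeAll(new String[] {"talking", "parties"}, classes))); // prints [talk, party]
    }
}
```

    Because a class describes the transformation rather than the word itself, a class learned from one token can be applied to tokens the model has never seen, which is what makes predicting classes instead of lemmas attractive.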
  - topKSequences
```
public Sequence[] topKSequences(String[] sentence,
                                String[] tags)
```
  - topKSequences
```
public Sequence[] topKSequences(String[] sentence,
                                String[] tags,
                                double minSequenceScore)
```
  - probs
```
public void probs(double[] probs)
```
    Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize. The specified array should be at least as large as the number of tokens in the previous call to lemmatize.
    
    Parameters:
    
    probs - An array used to hold the probabilities of the last decoded sequence.
  - probs
```
public double[] probs()
```
    Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize.
    
    Returns:
    
    An array with the same number of probabilities as tokens were sent to lemmatize when it was last called.
  - train
```
public static LemmatizerModel train(String languageCode,
                                    ObjectStream<LemmaSample> samples,
                                    TrainingParameters trainParams,
                                    LemmatizerFactory posFactory)
                             throws IOException
```
    Throws:
    
    IOException
  - topKLemmaClasses
```
public Sequence[] topKLemmaClasses(String[] sentence,
                                   String[] tags)
```
  - topKLemmaClasses
```
public Sequence[] topKLemmaClasses(String[] sentence,
                                   String[] tags,
                                   double minSequenceScore)
```
