public class LemmatizerME extends Object implements Lemmatizer
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_BEAM_SIZE |
static int | LEMMA_NUMBER |
Constructor and Description |
---|
LemmatizerME(LemmatizerModel model) Initializes the current instance with the provided model and the default beam size of 3. |
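To make the role of the beam size concrete, here is a minimal, self-contained sketch of beam search over per-token candidate labels. It is an illustration only and is not the search LemmatizerME runs internally. The class `BeamSketch`, its `Hypothesis` type, and the `search` method are hypothetical names, and the per-token probabilities are assumed to be independent of earlier choices, which a real sequence model would not assume.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of OpenNLP.
final class BeamSketch {

    /** A partial label sequence with its accumulated log-probability. */
    static final class Hypothesis {
        final List<String> labels;
        final double logProb;

        Hypothesis(List<String> labels, double logProb) {
            this.labels = labels;
            this.logProb = logProb;
        }
    }

    /**
     * labels[t][c] is candidate c for token t, and probs[t][c] is its probability.
     * After each token, only the beamSize best partial sequences are kept.
     */
    static List<Hypothesis> search(String[][] labels, double[][] probs, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));

        for (int t = 0; t < labels.length; t++) {
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (int c = 0; c < labels[t].length; c++) {
                    List<String> next = new ArrayList<>(h.labels);
                    next.add(labels[t][c]);
                    expanded.add(new Hypothesis(next, h.logProb + Math.log(probs[t][c])));
                }
            }
            // Sort best-first, then prune to the beam size.
            expanded.sort(Comparator.comparingDouble((Hypothesis h) -> -h.logProb));
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam;
    }
}
```

With a beam size of 1 this reduces to greedy decoding; a larger beam keeps more alternatives alive at each token at the cost of more work per step.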
Modifier and Type | Method and Description |
---|---|
static String[] | decodeLemmas(String[] toks, String[] preds) Decodes the lemma from the word and the induced lemma class. |
static String[] | encodeLemmas(String[] toks, String[] lemmas) |
List<List<String>> | lemmatize(List<String> toks, List<String> tags) Generates lemmas for the word and postag, returning the result in a list of every possible lemma for each token and postag. |
String[] | lemmatize(String[] toks, String[] tags) Generates lemmas for the word and postag, returning the result in an array. |
String[][] | predictLemmas(int numLemmas, String[] toks, String[] tags) Predict all possible lemmas (using a default upper bound). |
String[] | predictSES(String[] toks, String[] tags) Predict Short Edit Script (automatically induced lemma class). |
double[] | probs() Returns an array with the probabilities of the last decoded sequence. |
void | probs(double[] probs) Populates the specified array with the probabilities of the last decoded sequence. |
Sequence[] | topKLemmaClasses(String[] sentence, String[] tags) |
Sequence[] | topKLemmaClasses(String[] sentence, String[] tags, double minSequenceScore) |
Sequence[] | topKSequences(String[] sentence, String[] tags) |
Sequence[] | topKSequences(String[] sentence, String[] tags, double minSequenceScore) |
static LemmatizerModel | train(String languageCode, ObjectStream<LemmaSample> samples, TrainingParameters trainParams, LemmatizerFactory posFactory) |
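The summary above describes lemmas as being decoded from a token plus an induced lemma class (a Short Edit Script). The sketch below illustrates that general idea with a deliberately simplified suffix-edit encoding. It is not the encoding LemmatizerME actually uses: the class `SuffixEditSketch`, the `strip|append` class format, and both methods are assumptions made for illustration.

```java
// Hypothetical illustration; not the edit-script format used by OpenNLP.
final class SuffixEditSketch {

    /**
     * Encodes how to turn a token into its lemma as "strip|append":
     * the number of trailing characters to remove, then the suffix to add.
     */
    static String encode(String token, String lemma) {
        int prefix = 0;
        int max = Math.min(token.length(), lemma.length());
        // Find the length of the longest common prefix.
        while (prefix < max && token.charAt(prefix) == lemma.charAt(prefix)) {
            prefix++;
        }
        int strip = token.length() - prefix;
        return strip + "|" + lemma.substring(prefix);
    }

    /** Applies a "strip|append" class to a token to recover a lemma. */
    static String decode(String token, String lemmaClass) {
        int sep = lemmaClass.indexOf('|');
        int strip = Integer.parseInt(lemmaClass.substring(0, sep));
        String append = lemmaClass.substring(sep + 1);
        return token.substring(0, token.length() - strip) + append;
    }

    public static void main(String[] args) {
        String cls = encode("cities", "city");
        System.out.println(cls);
        // The same class applies to other tokens with the same inflection pattern.
        System.out.println(decode("parties", cls));
    }
}
```

The point of predicting a class rather than the lemma itself is that one class (here, "remove three characters and add y") covers many tokens, so the model can lemmatize words it never saw during training.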
public static final int LEMMA_NUMBER
public static final int DEFAULT_BEAM_SIZE
public LemmatizerME(LemmatizerModel model)
model
- the model
public String[] lemmatize(String[] toks, String[] tags)
Generates lemmas for the word and postag, returning the result in an array.
Specified by: lemmatize in interface Lemmatizer
toks
- an array of the tokens
tags
- an array of the pos tags
public List<List<String>> lemmatize(List<String> toks, List<String> tags)
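Generates lemmas for the word and postag, returning the result in a list of every possible lemma for each token and postag.
Specified by: lemmatize in interface Lemmatizer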
toks
- a list of the tokens
tags
- a list of the pos tags
public String[] predictSES(String[] toks, String[] tags)
toks
- the array of tokens
tags
- the array of pos tags
public String[][] predictLemmas(int numLemmas, String[] toks, String[] tags)
numLemmas
- the default number of lemmas
toks
- the tokens
tags
- the postags
public static String[] decodeLemmas(String[] toks, String[] preds)
toks
- the array of tokens
preds
- the predicted lemma classes
public Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)
public void probs(double[] probs)
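Populates the specified array with the probabilities of the last decoded sequence, as determined by the previous call to lemmatize. The
specified array should be at least as large as the number of tokens in the
previous call to lemmatize.
probs
- An array used to hold the probabilities of the last decoded sequence.
public double[] probs()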
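Returns an array with the probabilities of the last decoded sequence, as determined by the previous call to lemmatize. The array holds one probability for each token passed to lemmatize when it was last called.
public static LemmatizerModel train(String languageCode, ObjectStream<LemmaSample> samples, TrainingParameters trainParams, LemmatizerFactory posFactory) throws IOException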
IOException
Copyright © 2018 The Apache Software Foundation. All rights reserved.