Package | Description |
---|---|
opennlp.tools.dictionary |
Package related to parsing and storing dictionaries.
|
opennlp.tools.dictionary.serializer | |
opennlp.tools.formats |
Experimental package related to converting various corpora to OpenNLP Format.
|
opennlp.tools.languagemodel |
Package related to language models.
|
opennlp.tools.ngram |
Package related to computing and storing n-gram frequencies.
|
opennlp.tools.util |
Package containing utility data structures and algorithms used by multiple other packages.
|
Modifier and Type | Method and Description |
---|---|
Iterator<StringList> |
Dictionary.iterator()
Retrieves an Iterator over all tokens.
|
Modifier and Type | Method and Description |
---|---|
boolean |
Dictionary.contains(StringList tokens)
Checks if this dictionary has the given entry.
|
void |
Dictionary.put(StringList tokens)
Adds the tokens to the dictionary as one new entry.
|
void |
Dictionary.remove(StringList tokens)
Removes the given tokens from the current instance.
|
Constructor and Description |
---|
Index(Iterator<StringList> tokenLists)
Initializes the current instance with the given StringList Iterator.
|
Modifier and Type | Method and Description |
---|---|
StringList |
Entry.getTokens()
Retrieves the tokens.
|
Constructor and Description |
---|
Entry(StringList tokens,
Attributes attributes)
Initializes the current instance.
|
Modifier and Type | Method and Description |
---|---|
StringList |
NameFinderCensus90NameStream.read() |
Modifier and Type | Method and Description |
---|---|
StringList |
NGramLanguageModel.predictNextTokens(StringList tokens) |
StringList |
LanguageModel.predictNextTokens(StringList tokens)
Predicts the most probable output sequence of tokens, given an input sequence of tokens.
|
Modifier and Type | Method and Description |
---|---|
double |
NGramLanguageModel.calculateProbability(StringList sample) |
double |
LanguageModel.calculateProbability(StringList tokens)
Calculate the probability of a series of tokens (e.g.
|
StringList |
NGramLanguageModel.predictNextTokens(StringList tokens) |
StringList |
LanguageModel.predictNextTokens(StringList tokens)
Predicts the most probable output sequence of tokens, given an input sequence of tokens.
|
Modifier and Type | Method and Description |
---|---|
static StringList |
NGramUtils.getNMinusOneTokenFirst(StringList ngram)
Gets the (n-1)-gram of a given ngram, that is, the same ngram without its last word.
|
static StringList |
NGramUtils.getNMinusOneTokenLast(StringList ngram)
Gets the (n-1)-gram of a given ngram, that is, the same ngram without its first word.
|
Modifier and Type | Method and Description |
---|---|
static Collection<StringList> |
NGramUtils.getNGrams(StringList sequence,
int size)
Gets the ngrams of the given size from an input sequence of tokens.
|
Iterator<StringList> |
NGramModel.iterator()
Retrieves an Iterator over all StringList entries.
|
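
The ngram extraction and (n-1)-gram helpers listed above can be pictured as a sliding window over the token sequence. The following is a minimal sketch in plain Java, not the OpenNLP implementation: `NGramWindowSketch` and its methods are hypothetical names, `List<String>` stands in for `StringList`, and edge cases (for example a size larger than the sequence) may be handled differently by the real `NGramUtils` methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; List<String> stands in for OpenNLP's StringList.
class NGramWindowSketch {

    // Returns every contiguous run of `size` tokens, in order of appearance.
    // Assumption: an empty result is returned when size is not positive
    // or exceeds the sequence length.
    static List<List<String>> ngrams(List<String> sequence, int size) {
        List<List<String>> result = new ArrayList<>();
        if (size <= 0) {
            return result;
        }
        for (int start = 0; start + size <= sequence.size(); start++) {
            result.add(new ArrayList<>(sequence.subList(start, start + size)));
        }
        return result;
    }

    // Same ngram without its last token.
    static List<String> withoutLast(List<String> ngram) {
        if (ngram.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(ngram.subList(0, ngram.size() - 1));
    }

    // Same ngram without its first token.
    static List<String> withoutFirst(List<String> ngram) {
        if (ngram.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(ngram.subList(1, ngram.size()));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        // prints [[the, quick], [quick, brown], [brown, fox]]
        System.out.println(ngrams(tokens, 2));
    }
}
```

A sequence of length m yields m - size + 1 windows, so the four tokens in `main` produce three bigrams.
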
Modifier and Type | Method and Description |
---|---|
void |
NGramModel.add(StringList ngram)
Adds one NGram; if it already exists, its count increases by one.
|
void |
NGramModel.add(StringList ngram,
int minLength,
int maxLength)
Adds NGrams up to the specified length to the current instance.
|
static double |
NGramUtils.calculateLaplaceSmoothingProbability(StringList ngram,
Iterable<StringList> set,
Double k)
Calculates the probability of an ngram in a vocabulary using the Laplace smoothing algorithm.
|
static double |
NGramUtils.calculateMissingNgramProbabilityMass(StringList ngram,
Double discount,
Iterable<StringList> set)
Calculates the probability of an ngram in a vocabulary using the missing probability mass algorithm.
|
static double |
NGramUtils.calculateNgramMLProbability(StringList ngram,
Iterable<StringList> set)
Calculates the probability of an ngram in a vocabulary using maximum likelihood estimation.
|
boolean |
NGramModel.contains(StringList tokens)
Checks if the given tokens are contained by the current instance.
|
int |
NGramModel.getCount(StringList ngram)
Retrieves the count of the given ngram.
|
static Collection<StringList> |
NGramUtils.getNGrams(StringList sequence,
int size)
Gets the ngrams of the given size from an input sequence of tokens.
|
static StringList |
NGramUtils.getNMinusOneTokenFirst(StringList ngram)
Gets the (n-1)-gram of a given ngram, that is, the same ngram without its last word.
|
static StringList |
NGramUtils.getNMinusOneTokenLast(StringList ngram)
Gets the (n-1)-gram of a given ngram, that is, the same ngram without its first word.
|
void |
NGramModel.remove(StringList tokens)
Removes the specified tokens from the NGram model; they are just dropped.
|
void |
NGramModel.setCount(StringList ngram,
int count)
Sets the count of an existing ngram.
|
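
The counting behaviour described for `NGramModel` (add increments a count, a ranged add registers every sub-ngram between two lengths, `setCount` applies to an existing ngram) can be sketched with a map from token lists to counts. This is an illustrative stand-in written in plain Java, not the OpenNLP class: `NGramCountSketch` is a hypothetical name, `List<String>` replaces `StringList`, and the choice to throw `NoSuchElementException` when setting the count of an unknown ngram is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical illustration; List<String> stands in for OpenNLP's StringList.
class NGramCountSketch {

    private final Map<List<String>, Integer> counts = new HashMap<>();

    // Adds one ngram; an existing ngram has its count increased by one.
    void add(List<String> ngram) {
        counts.merge(new ArrayList<>(ngram), 1, Integer::sum);
    }

    // Adds every contiguous sub-ngram whose length lies in [minLength, maxLength].
    void add(List<String> tokens, int minLength, int maxLength) {
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("need 1 <= minLength <= maxLength");
        }
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= tokens.size(); start++) {
                add(tokens.subList(start, start + length));
            }
        }
    }

    // Returns the stored count, or 0 for an unknown ngram.
    int getCount(List<String> ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    // Overwrites the count of an ngram that is already present.
    // Assumption of this sketch: unknown ngrams are rejected.
    void setCount(List<String> ngram, int count) {
        if (counts.replace(ngram, count) == null) {
            throw new NoSuchElementException("ngram not present: " + ngram);
        }
    }

    boolean contains(List<String> ngram) {
        return counts.containsKey(ngram);
    }

    // Drops the ngram together with its count.
    void remove(List<String> ngram) {
        counts.remove(ngram);
    }

    public static void main(String[] args) {
        NGramCountSketch model = new NGramCountSketch();
        model.add(Arrays.asList("a", "b", "c"), 1, 2);
        model.add(Arrays.asList("a", "b"));
        // prints 2
        System.out.println(model.getCount(Arrays.asList("a", "b")));
    }
}
```

Keys are copied into fresh lists before they are stored, so later changes to the caller's list do not corrupt the map.
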
Modifier and Type | Method and Description |
---|---|
static double |
NGramUtils.calculateBigramMLProbability(String x0,
String x1,
Collection<StringList> set)
Calculates the probability of a bigram in a vocabulary using maximum likelihood estimation.
|
static double |
NGramUtils.calculateBigramPriorSmoothingProbability(String x0,
String x1,
Collection<StringList> set,
Double k)
Calculates the probability of a bigram in a vocabulary using the prior Laplace smoothing algorithm.
|
static double |
NGramUtils.calculateLaplaceSmoothingProbability(StringList ngram,
Iterable<StringList> set,
Double k)
Calculates the probability of an ngram in a vocabulary using the Laplace smoothing algorithm.
|
static double |
NGramUtils.calculateMissingNgramProbabilityMass(StringList ngram,
Double discount,
Iterable<StringList> set)
Calculates the probability of an ngram in a vocabulary using the missing probability mass algorithm.
|
static double |
NGramUtils.calculateNgramMLProbability(StringList ngram,
Iterable<StringList> set)
Calculates the probability of an ngram in a vocabulary using maximum likelihood estimation.
|
static double |
NGramUtils.calculateTrigramLinearInterpolationProbability(String x0,
String x1,
String x2,
Collection<StringList> set,
Double lambda1,
Double lambda2,
Double lambda3)
Calculates the probability of a trigram in a vocabulary using a linear interpolation algorithm.
|
static double |
NGramUtils.calculateTrigramMLProbability(String x0,
String x1,
String x2,
Iterable<StringList> set)
Calculates the probability of a trigram in a vocabulary using maximum likelihood estimation.
|
static double |
NGramUtils.calculateUnigramMLProbability(String word,
Collection<StringList> set)
Calculates the probability of a unigram in a vocabulary using maximum likelihood estimation.
|
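
To make the estimator names in the table above concrete, here is a sketch of the textbook definitions of maximum likelihood estimation, add-k (Laplace) smoothing, and linear interpolation over a small corpus. It is an assumption that these textbook forms convey the general idea; the exact formulas used by `NGramUtils` may differ (in particular for the "prior" smoothing variant), and `NGramProbabilitySketch` with all of its method names is hypothetical. `List<String>` stands in for `StringList`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Textbook estimators for illustration only; not the NGramUtils implementation.
class NGramProbabilitySketch {

    // Counts how often `gram` occurs as a contiguous run across all sentences.
    static int count(List<List<String>> corpus, List<String> gram) {
        int occurrences = 0;
        for (List<String> sentence : corpus) {
            for (int start = 0; start + gram.size() <= sentence.size(); start++) {
                if (sentence.subList(start, start + gram.size()).equals(gram)) {
                    occurrences++;
                }
            }
        }
        return occurrences;
    }

    // Unigram MLE: count(word) / total number of tokens.
    static double unigramMl(List<List<String>> corpus, String word) {
        int total = 0;
        for (List<String> sentence : corpus) {
            total += sentence.size();
        }
        return total == 0 ? 0.0 : (double) count(corpus, Arrays.asList(word)) / total;
    }

    // Conditional MLE for a bigram or longer:
    // count(gram) / count(gram without its last token); 0 if the context is unseen.
    static double conditionalMl(List<List<String>> corpus, String... gram) {
        if (gram.length < 2) {
            throw new IllegalArgumentException("need at least a bigram");
        }
        List<String> full = Arrays.asList(gram);
        int context = count(corpus, full.subList(0, full.size() - 1));
        return context == 0 ? 0.0 : (double) count(corpus, full) / context;
    }

    // Add-k smoothing for a bigram: (count(x0 x1) + k) / (count(x0) + k * |V|).
    static double bigramAddK(List<List<String>> corpus, String x0, String x1, double k) {
        Set<String> vocabulary = new HashSet<>();
        for (List<String> sentence : corpus) {
            vocabulary.addAll(sentence);
        }
        double denominator = count(corpus, Arrays.asList(x0)) + k * vocabulary.size();
        if (denominator == 0.0) {
            return 0.0;
        }
        return (count(corpus, Arrays.asList(x0, x1)) + k) / denominator;
    }

    // Linear interpolation of trigram, bigram, and unigram MLE estimates.
    // The lambdas are expected to be non-negative and to sum to 1.
    static double trigramInterpolated(List<List<String>> corpus,
                                      String x0, String x1, String x2,
                                      double lambda1, double lambda2, double lambda3) {
        return lambda1 * conditionalMl(corpus, x0, x1, x2)
             + lambda2 * conditionalMl(corpus, x1, x2)
             + lambda3 * unigramMl(corpus, x2);
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
            Arrays.asList("the", "cat", "sat"),
            Arrays.asList("the", "dog", "sat"));
        System.out.println(conditionalMl(corpus, "the", "cat"));
        System.out.println(bigramAddK(corpus, "the", "sat", 1.0));
    }
}
```

In the two-sentence corpus from `main`, "the" occurs twice and "the cat" once, so the unsmoothed bigram estimate is 1/2. With k = 1 and a vocabulary of four words, the unseen bigram "the sat" receives (0 + 1) / (2 + 4) = 1/6 instead of zero, which is the point of smoothing.
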
Modifier and Type | Method and Description |
---|---|
boolean |
StringList.compareToIgnoreCase(StringList tokens)
Compares this token list to the given one, ignoring the case of the tokens.
|
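
A case-insensitive comparison of two token lists, as described for `compareToIgnoreCase`, can be sketched as a position-by-position check. This is a hypothetical plain-Java illustration with `List<String>` standing in for `StringList`; the requirement that both lists have the same length is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; List<String> stands in for OpenNLP's StringList.
class IgnoreCaseCompareSketch {

    // True when both lists have the same length and every pair of tokens
    // at the same position is equal ignoring case.
    static boolean equalsIgnoreCase(List<String> left, List<String> right) {
        if (left.size() != right.size()) {
            return false;
        }
        for (int i = 0; i < left.size(); i++) {
            if (!left.get(i).equalsIgnoreCase(right.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // prints true
        System.out.println(equalsIgnoreCase(
            Arrays.asList("New", "York"), Arrays.asList("new", "YORK")));
    }
}
```
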
Copyright © 2017 The Apache Software Foundation. All rights reserved.