Package opennlp.tools.ngram
Class NGramModel
java.lang.Object
opennlp.tools.ngram.NGramModel
- All Implemented Interfaces:
Iterable<StringList>
- Direct Known Subclasses:
NGramLanguageModel
The NGramModel can be used to create ngrams and character ngrams.
Constructor Summary
Constructors
Constructor                   Description
NGramModel()                  Instantiates an empty NGramModel instance.
NGramModel(InputStream in)    Instantiates an NGramModel via an InputStream reference.
-
Method Summary
Modifier and Type     Method                                                 Description
void                  add(CharSequence chars, int minLength, int maxLength)  Adds character NGrams to the current instance.
void                  add(StringList ngram)                                  Adds one NGram; if it already exists, its count increases by one.
void                  add(StringList ngram, int minLength, int maxLength)    Adds NGrams up to the specified length to the current instance.
boolean               contains(StringList tokens)                            Checks if the given tokens are contained by the current instance.
void                  cutoff(int cutoffUnder, int cutoffOver)                Deletes all ngrams that appear fewer than cutoffUnder times or more than cutoffOver times.
boolean               equals(Object obj)
int                   getCount(StringList ngram)                             Retrieves the count of the given ngram.
int                   hashCode()
Iterator<StringList>  iterator()                                             Retrieves an Iterator over all StringList entries.
int                   numberOfGrams()                                        Retrieves the total count of all NGrams.
void                  remove(StringList tokens)                              Removes the specified tokens from the NGram model; they are just dropped.
void                  serialize(OutputStream out)                            Writes the ngram instance to the given OutputStream.
void                  setCount(StringList ngram, int count)                  Sets the count of an existing ngram.
int                   size()                                                 Retrieves the number of StringList entries in the current instance.
Dictionary            toDictionary()                                         Creates a dictionary which contains all StringLists which are in the current NGramModel.
Dictionary            toDictionary(boolean caseSensitive)                    Creates a dictionary which contains all StringLists which are in the current NGramModel.
String                toString()
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
NGramModel
public NGramModel()
Instantiates an empty NGramModel instance.
-
NGramModel
public NGramModel(InputStream in) throws IOException
Instantiates an NGramModel via an InputStream reference.
- Parameters:
in - the serialized model stream
- Throws:
IOException - Thrown if errors occurred reading from in.
-
-
Method Details
-
getCount
public int getCount(StringList ngram)
Retrieves the count of the given ngram.
- Parameters:
ngram - an ngram
- Returns:
count of the ngram or 0 if it is not contained
-
setCount
public void setCount(StringList ngram, int count)
Sets the count of an existing ngram.
- Parameters:
ngram - the existing ngram
count - the count to set
-
add
public void add(StringList ngram)
Adds one NGram; if it already exists, its count increases by one.
- Parameters:
ngram - the ngram to add
-
add
public void add(StringList ngram, int minLength, int maxLength)
Adds NGrams up to the specified length to the current instance.
- Parameters:
ngram - the tokens to build the uni-grams, bi-grams, tri-grams, ... from
minLength - minimal length
maxLength - maximal length
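The windowing that this description implies can be sketched in plain Java. This is an illustration only, not the OpenNLP implementation: the class TokenNGramSketch and the method countNGrams are hypothetical names, a List<String> stands in for StringList, and the argument check is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of OpenNLP.
class TokenNGramSketch {

    // Counts every contiguous token window whose length lies in minLength..maxLength.
    static Map<List<String>, Integer> countNGrams(List<String> tokens, int minLength, int maxLength) {
        if (minLength < 1 || minLength > maxLength) {
            // Assumed validation; the page above does not state how invalid lengths are handled.
            throw new IllegalArgumentException("need 1 <= minLength <= maxLength");
        }
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= tokens.size(); start++) {
                // Copy the window so the map key is independent of the input list.
                List<String> gram = new ArrayList<>(tokens.subList(start, start + length));
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> counts =
                countNGrams(List.of("the", "cat", "sat", "the", "cat"), 1, 2);
        counts.forEach((gram, count) -> System.out.println(gram + " -> " + count));
    }
}
```

With minLength 1 and maxLength 2, the five tokens above yield three distinct uni-grams and three distinct bi-grams, and the bi-gram (the, cat) is counted twice.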
-
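add
public void add(CharSequence chars, int minLength, int maxLength)
Adds character NGrams to the current instance.
- Parameters:
chars - the characters to build the ngrams from
minLength - minimal length
maxLength - maximal length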
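Character ngrams follow the same sliding-window idea, applied to characters instead of tokens. The sketch below is a plain-Java illustration under stated assumptions, not the library's code: CharNGramSketch and countCharNGrams are hypothetical names, and the characters are kept exactly as given because this page does not say whether the model normalizes them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of OpenNLP.
class CharNGramSketch {

    // Counts every contiguous character window whose length lies in minLength..maxLength.
    static Map<String, Integer> countCharNGrams(CharSequence chars, int minLength, int maxLength) {
        if (minLength < 1 || minLength > maxLength) {
            // Assumed validation; not taken from the documentation above.
            throw new IllegalArgumentException("need 1 <= minLength <= maxLength");
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= chars.length(); start++) {
                String gram = chars.subSequence(start, start + length).toString();
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

For the input "abab" with lengths 2 to 3, this produces the grams "ab" (twice), "ba", "aba" and "bab".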
-
remove
public void remove(StringList tokens)
Removes the specified tokens from the NGram model; they are just dropped.
- Parameters:
tokens - the tokens to remove
-
contains
public boolean contains(StringList tokens)
Checks if the given tokens are contained by the current instance.
- Parameters:
tokens - the tokens to look up
- Returns:
true if the ngram is contained
-
size
public int size()
Retrieves the number of StringList entries in the current instance.
- Returns:
number of different grams
-
iterator
public Iterator<StringList> iterator()
Retrieves an Iterator over all StringList entries.
- Specified by:
iterator in interface Iterable<StringList>
- Returns:
iterator over all grams
-
numberOfGrams
public int numberOfGrams()
Retrieves the total count of all NGrams.
- Returns:
total count of all ngrams
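The difference between size() (number of different grams) and numberOfGrams() (total count of all ngrams) is easy to miss. The following plain-Java sketch illustrates it with a count map; NGramCountSketch is a hypothetical class that only mirrors the method names documented here and is not the OpenNLP implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of OpenNLP.
class NGramCountSketch {
    private final Map<List<String>, Integer> counts = new HashMap<>();

    // Adds one ngram; if it is already present, its count increases by one.
    void add(List<String> ngram) {
        counts.merge(List.copyOf(ngram), 1, Integer::sum);
    }

    // Count of the given ngram, or 0 if it is not contained.
    int getCount(List<String> ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    // Number of different grams.
    int size() {
        return counts.size();
    }

    // Total count of all grams, i.e. the sum of the individual counts.
    int numberOfGrams() {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total;
    }
}
```

Adding the gram (a, b) twice and the gram (c) once gives a size of 2 but a total of 3.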
-
cutoff
public void cutoff(int cutoffUnder, int cutoffOver)
Deletes all ngrams that appear fewer than cutoffUnder times or more than cutoffOver times.
- Parameters:
cutoffUnder - the minimum count an ngram needs to be kept
cutoffOver - the maximum count an ngram may have to be kept
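One reading of this description is that an ngram survives only if its count lies between cutoffUnder and cutoffOver. The sketch below shows that filter on a plain count map. It is an assumption-laden illustration rather than the library's code: CutoffSketch is a hypothetical class, and treating both bounds as inclusive is an interpretation of the wording above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of OpenNLP.
class CutoffSketch {

    // Removes every entry whose count is below cutoffUnder or above cutoffOver.
    // Both bounds are assumed to be inclusive for the entries that are kept.
    static void cutoff(Map<String, Integer> counts, int cutoffUnder, int cutoffOver) {
        counts.values().removeIf(count -> count < cutoffUnder || count > cutoffOver);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                new HashMap<>(Map.of("rare", 1, "common", 3, "stopword", 10));
        cutoff(counts, 2, 5);
        System.out.println(counts); // prints {common=3}
    }
}
```

A filter of this kind is typically used to drop both noise (very rare grams) and uninformative, very frequent grams in a single pass.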
-
toDictionary
public Dictionary toDictionary()
Creates a dictionary which contains all StringLists which are in the current NGramModel.
Entries which differ only in case are merged into one.
Calling this method is the same as calling toDictionary(boolean) with false.
- Returns:
a dictionary of the ngrams
-
toDictionary
public Dictionary toDictionary(boolean caseSensitive)
Creates a dictionary which contains all StringLists which are in the current NGramModel.
- Parameters:
caseSensitive - Specifies whether case distinctions should be kept in the creation of the dictionary.
- Returns:
a dictionary of the ngrams
-
serialize
public void serialize(OutputStream out) throws IOException
Writes the ngram instance to the given OutputStream.
- Parameters:
out - the OutputStream to write to
- Throws:
IOException - if an I/O error occurs during writing
-
equals
-
toString
-
hashCode
public int hashCode()
-