Class DictionaryLemmatizer
- All Implemented Interfaces:
- Lemmatizer
A Lemmatizer implementation that works by simple dictionary lookup in a
 Map built from a tab-separated file in which each line has the form:
 
 word\tpostag\tlemma (where \t denotes a tab character).
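
The lookup idea described above can be sketched with plain JDK classes. This is only an illustration of the file format, not the library's implementation: the class name `SimpleLemmaLookup`, the method `build`, the choice of a (word, postag) pair as map key, and the skipping of malformed lines are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a dictionary lookup; not the library's implementation.
class SimpleLemmaLookup {

    // Builds a map from (word, postag) to lemma out of tab-separated lines.
    static Map<List<String>, String> build(List<String> lines) {
        Map<List<String>, String> dict = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length == 3) { // assumption: silently skip malformed lines
                dict.put(List.of(fields[0], fields[1]), fields[2]);
            }
        }
        return dict;
    }

    public static void main(String[] args) {
        Map<List<String>, String> dict =
            build(List.of("houses\tNNS\thouse", "ran\tVBD\trun"));
        System.out.println(dict.get(List.of("houses", "NNS"))); // prints "house"
        System.out.println(dict.get(List.of("houses", "VBZ"))); // prints "null"
    }
}
```

Keying on the word and tag together means the same surface form can map to different lemmas depending on its part of speech.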
Constructor Summary
Constructors
- DictionaryLemmatizer(File dictionaryFile): Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary.
- DictionaryLemmatizer(File dictionaryFile, Charset charset): Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary.
- DictionaryLemmatizer(InputStream dictionaryStream): Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary.
- DictionaryLemmatizer(InputStream dictionaryStream, Charset charset): Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary.
- DictionaryLemmatizer(Path dictionaryPath): Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary.
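Method Summary
- getDictMap: Retrieves the Map containing the dictionary.
- lemmatize (array variant): Generates lemmas for the word and postag.
- lemmatize (list variant): Generates lemma tags for the word and postag.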
Constructor Details
DictionaryLemmatizer(InputStream dictionaryStream, Charset charset)
Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary. Each line of the input should have the form word\tpostag\tlemma, where \t is a tab character. Alternatively, if multiple lemmas are possible for a word-postag pair, the line should have the form word\tpostag\tlemma01#lemma02#lemma03.
- Parameters:
- dictionaryStream - The dictionary, referenced by an open InputStream.
- charset - The character encoding of the dictionary.
- Throws:
- IOException - Thrown if IO errors occur while reading from dictionaryStream.
 
DictionaryLemmatizer(InputStream dictionaryStream)
Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary. Each line of the input should have the form word\tpostag\tlemma, where \t is a tab character. Alternatively, if multiple lemmas are possible for a word-postag pair, the line should have the form word\tpostag\tlemma01#lemma02#lemma03.
- Parameters:
- dictionaryStream - The dictionary, referenced by an open InputStream.
- Throws:
- IOException - Thrown if IO errors occur while reading from dictionaryStream.
 
DictionaryLemmatizer(File dictionaryFile)
Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary. Each line of the input should have the form word\tpostag\tlemma, where \t is a tab character. Alternatively, if multiple lemmas are possible for a word-postag pair, the line should have the form word\tpostag\tlemma01#lemma02#lemma03.
- Parameters:
- dictionaryFile - The dictionary, referenced by a valid, readable File.
- Throws:
- IOException - Thrown if IO errors occur while reading from dictionaryFile.
 
DictionaryLemmatizer(File dictionaryFile, Charset charset)
Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary. Each line of the input should have the form word\tpostag\tlemma, where \t is a tab character. Alternatively, if multiple lemmas are possible for a word-postag pair, the line should have the form word\tpostag\tlemma01#lemma02#lemma03.
- Parameters:
- dictionaryFile - The dictionary, referenced by a valid, readable File.
- charset - The character encoding of the dictionary.
- Throws:
- IOException - Thrown if IO errors occur while reading from dictionaryFile.
 
DictionaryLemmatizer(Path dictionaryPath)
Initializes a DictionaryLemmatizer and related HashMap from the input tab-separated dictionary. Each line of the input should have the form word\tpostag\tlemma, where \t is a tab character. Alternatively, if multiple lemmas are possible for a word-postag pair, the line should have the form word\tpostag\tlemma01#lemma02#lemma03.
- Parameters:
- dictionaryPath - The dictionary, referenced by a valid, readable Path.
- Throws:
- IOException - Thrown if IO errors occur while reading from dictionaryPath.
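
To make the multi-lemma line format and the role of the charset parameter concrete, here is a minimal JDK-only sketch of reading such a dictionary from a stream. It is not the library's code: the class `MultiLemmaDictionarySketch`, the method `read`, and the decision to ignore lines without exactly three fields are assumptions of this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and error handling are assumptions.
class MultiLemmaDictionarySketch {

    // Reads word<TAB>postag<TAB>lemma01#lemma02 lines, decoding with the given charset.
    static Map<List<String>, List<String>> read(InputStream in, Charset charset)
            throws IOException {
        Map<List<String>, List<String>> dict = new HashMap<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\t");
                if (fields.length != 3) {
                    continue; // assumption: ignore lines without three fields
                }
                // '#' separates alternatives; a single lemma yields a one-element list.
                dict.put(List.of(fields[0], fields[1]),
                         Arrays.asList(fields[2].split("#")));
            }
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        String content = "saw\tNN\tsaw\nsaw\tVBD\tsee#saw\n";
        InputStream in =
            new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        Map<List<String>, List<String>> dict = read(in, StandardCharsets.UTF_8);
        System.out.println(dict.get(List.of("saw", "VBD"))); // prints "[see, saw]"
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which matters for dictionaries containing non-ASCII word forms.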
 
 
Method Details
getDictMap
- Returns:
- The Map containing the dictionary.
 
lemmatize
Description copied from interface: Lemmatizer
Generates lemmas for the word and postag.
- Specified by:
- lemmatize in interface Lemmatizer
- Parameters:
- tokens - An array of the tokens.
- postags - An array of the POS tags.
- Returns:
- An array of possible lemmas for each token in the tokens sequence.
 
lemmatize
Description copied from interface: Lemmatizer
Generates lemma tags for the word and postag.
- Specified by:
- lemmatize in interface Lemmatizer
- Parameters:
- tokens - A list of the tokens.
- posTags - A list of the POS tags.
- Returns:
- A list of every possible lemma for each token in the tokens sequence.
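
The difference between the two lemmatize variants can be illustrated with a small JDK-only sketch over parallel token and tag sequences. This is not the library's code: `LemmatizeSketch`, `firstLemmas`, `allLemmas`, the rule of taking the first listed lemma in the array variant, and the "O" placeholder for pairs missing from the dictionary are all assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two lookup styles; not the library's code.
class LemmatizeSketch {

    // Assumption: placeholder used when a (token, tag) pair is not in the dictionary.
    static final String UNKNOWN = "O";

    // Array variant: one lemma (the first one listed) per token.
    static String[] firstLemmas(Map<List<String>, List<String>> dict,
                                String[] tokens, String[] postags) {
        String[] lemmas = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            List<String> found = dict.get(List.of(tokens[i], postags[i]));
            lemmas[i] = (found == null || found.isEmpty()) ? UNKNOWN : found.get(0);
        }
        return lemmas;
    }

    // List variant: every listed lemma per token.
    static List<List<String>> allLemmas(Map<List<String>, List<String>> dict,
                                        List<String> tokens, List<String> posTags) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            List<String> found = dict.get(List.of(tokens.get(i), posTags.get(i)));
            result.add(found == null ? List.of(UNKNOWN) : found);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<List<String>, List<String>> dict = Map.of(
            List.of("saw", "VBD"), List.of("see", "saw"),
            List.of("dogs", "NNS"), List.of("dog"));
        String[] first = firstLemmas(dict,
            new String[] {"dogs", "saw", "xyz"}, new String[] {"NNS", "VBD", "NN"});
        System.out.println(String.join(" ", first)); // prints "dog see O"
        System.out.println(
            allLemmas(dict, List.of("dogs", "saw"), List.of("NNS", "VBD")));
        // prints "[[dog], [see, saw]]"
    }
}
```

Both variants return one entry per input token, so the result stays aligned with the token sequence.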
 
 