Package opennlp.tools.stemmer
Class PorterStemmer
java.lang.Object
opennlp.tools.stemmer.PorterStemmer
All Implemented Interfaces: Stemmer
A Stemmer, implementing the Porter Stemming Algorithm.
The Stemmer implementation transforms a word into its root form. The input word can be provided a character at a time (by calling add(char)), or at once by calling one of the various stem(..) methods.
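The two input modes described above can be sketched as follows. To keep the example runnable without OpenNLP on the classpath, it uses ToyStemmer, a hypothetical stand-in that only mirrors the method names documented on this page (add, stem, reset, toString, stem(String)) and strips a single trailing 's'. It is not the Porter algorithm. With the library available, the same call sequence should apply to a PorterStemmer instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StemmerProtocolSketch {

    /** Hypothetical stand-in mirroring the documented method names; NOT the Porter algorithm. */
    public static class ToyStemmer {
        private final StringBuilder buf = new StringBuilder();

        public void add(char ch) { buf.append(ch); }

        public void reset() { buf.setLength(0); }

        /** Toy rule (assumption for illustration): drop one trailing 's'. Returns true if the word changed. */
        public boolean stem() {
            int n = buf.length();
            if (n > 1 && buf.charAt(n - 1) == 's') {
                buf.setLength(n - 1);
                return true;
            }
            return false;
        }

        /** Whole-word mode: takes the word at once and returns the result as a String. */
        public String stem(String word) {
            reset();
            for (int i = 0; i < word.length(); i++) {
                add(word.charAt(i));
            }
            stem();
            return toString();
        }

        @Override
        public String toString() { return buf.toString(); }
    }

    /** Character-at-a-time mode: add(char) for each character, stem(), read the result, then reset(). */
    public static List<String> stemCharByChar(ToyStemmer stemmer, List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            for (char c : w.toCharArray()) {
                stemmer.add(c);
            }
            stemmer.stem();
            out.add(stemmer.toString());
            stemmer.reset(); // required before feeding the next word
        }
        return out;
    }

    public static void main(String[] args) {
        ToyStemmer stemmer = new ToyStemmer();
        System.out.println(stemCharByChar(stemmer, Arrays.asList("cats", "dog")));
        System.out.println(stemmer.stem("trees"));
    }
}
```

The character-at-a-time loop calls reset() after every word because the buffer otherwise still holds the previous word; the whole-word call needs no such step from the caller.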
Constructor Summary
PorterStemmer()
Method Summary
Modifier and Type Method: Description
void add(char ch): Add a character to the word being stemmed.
char[] getResultBuffer(): Returns a reference to a character buffer containing the results of the stemming process.
int getResultLength(): Returns the length of the word resulting from the stemming process.
void reset(): Resets the stemmer so it can stem another word.
boolean stem(): Stem the word placed into the Stemmer buffer through calls to add().
boolean stem(char[] word): Stem a word contained in a char[].
boolean stem(char[] word, int wordLen): Stem a word contained in a leading portion of a char[] array.
boolean stem(char[] wordBuffer, int offset, int wordLen): Stem a word contained in a portion of a char[] array.
boolean stem(int i0)
CharSequence stem(CharSequence word): Stem a word provided as a CharSequence.
String stem(String s): Stem a word provided as a String.
String toString(): After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).
Constructor Details
PorterStemmer
public PorterStemmer()
Method Details
reset
public void reset()
Resets the stemmer so it can stem another word. If you invoke the stemmer by calling add(char) and then stem(), you must call reset() before starting another word.
add
public void add(char ch)
Add a character to the word being stemmed. When you are finished adding characters, you can call stem() to process the word.
toString
public String toString()
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).
getResultLength
public int getResultLength()
Returns the length of the word resulting from the stemming process.
getResultBuffer
public char[] getResultBuffer()
Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
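Because getResultBuffer() returns a reference to a buffer rather than a copy, only the first getResultLength() characters should be treated as the result. The following is a minimal sketch of reading such a buffer, assuming the buffer may hold extra characters past that length. The helper names resultAsString and resultEquals are hypothetical and not part of PorterStemmer, and the values in main are stand-ins for what the two getters would return.

```java
public class ResultBufferSketch {

    /** Copies only the valid prefix of the buffer into a new String. */
    public static String resultAsString(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    /** Compares the valid prefix with an expected value without allocating a String. */
    public static boolean resultEquals(char[] buffer, int length, CharSequence expected) {
        if (expected.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (buffer[i] != expected.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in values: in real use these would come from
        // stemmer.getResultBuffer() and stemmer.getResultLength().
        char[] buffer = "caresses".toCharArray();
        int length = 6;

        System.out.println(resultAsString(buffer, length));
        System.out.println(resultEquals(buffer, length, "caress"));
    }
}
```

Comparing in place, as resultEquals does, is the kind of use where the buffer accessors avoid the String allocation that toString() implies.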
stem
public String stem(String s)
Stem a word provided as a String. Returns the result as a String.
stem
public CharSequence stem(CharSequence word)
Stem a word provided as a CharSequence. Returns the result as a CharSequence.
stem
public boolean stem(char[] word)
Stem a word contained in a char[]. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
stem
public boolean stem(char[] wordBuffer, int offset, int wordLen)
Stem a word contained in a portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
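The offset and length arguments make it possible to stem words in place inside a larger character array without copying each word out first. As a sketch, the hypothetical helper wordSpans below (not part of OpenNLP) finds {offset, length} pairs for runs of letters; each pair could then be passed as the offset and wordLen arguments of stem(char[] wordBuffer, int offset, int wordLen). Treating every run of letters as a word is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class WordSpanSketch {

    /** Returns {offset, length} pairs, one for each maximal run of letters in text. */
    public static List<int[]> wordSpans(char[] text) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a word"
        for (int i = 0; i <= text.length; i++) {
            boolean inWord = i < text.length && Character.isLetter(text[i]);
            if (inWord && start < 0) {
                start = i; // a new word begins here
            } else if (!inWord && start >= 0) {
                spans.add(new int[] {start, i - start}); // the word ended just before i
                start = -1;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        char[] text = "running dogs".toCharArray();
        for (int[] span : wordSpans(text)) {
            // With a PorterStemmer instance named stemmer, this span would be
            // passed as stemmer.stem(text, span[0], span[1]).
            System.out.println(span[0] + "," + span[1]);
        }
    }
}
```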
stem
public boolean stem(char[] word, int wordLen)
Stem a word contained in a leading portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
stem
public boolean stem()
Stem the word placed into the Stemmer buffer through calls to add(). Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
stem
public boolean stem(int i0)