Package opennlp.tools.doccat
Interface DocumentCategorizer
-
- All Known Implementing Classes:
DocumentCategorizerME
public interface DocumentCategorizer
Interface for classes which categorize documents.
-
-
Method Summary
Modifier and Type | Method | Description
double[] | categorize(String[] text) | Categorizes the given text, provided in separate tokens.
double[] | categorize(String[] text, Map<String,Object> extraInformation) | Categorizes the given text, provided as tokens, along with the provided extra information.
String | getAllResults(double[] results) | Gets the name of the category associated with the given probabilities.
String | getBestCategory(double[] outcome) | Gets the best category from previously generated outcome probabilities.
String | getCategory(int index) | Gets the category at a given index.
int | getIndex(String category) | Gets the index of a certain category.
int | getNumberOfCategories() | Gets the number of categories.
Map<String,Double> | scoreMap(String[] text) | Returns a map in which the key is the category name and the value is the score.
SortedMap<Double,Set<String>> | sortedScoreMap(String[] text) | Gets a map of the scores, sorted in ascending order, together with their associated categories.
-
-
-
Method Detail
-
categorize
double[] categorize(String[] text, Map<String,Object> extraInformation)
Categorizes the given text, provided as tokens, along with the provided extra information.
- Parameters:
text - the tokens of text to categorize
extraInformation - extra information
- Returns:
per-category probabilities
-
categorize
double[] categorize(String[] text)
Categorizes the given text, provided in separate tokens.
- Parameters:
text - the tokens of text to categorize
- Returns:
per-category probabilities
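The contract of categorize (tokens in, one probability per category out) can be illustrated with a toy stand-in. The sketch below does not use any OpenNLP class: ToyCategorizerSketch, its keyword table, and the add-one smoothing are all hypothetical and chosen only so that the returned array sums to 1, as a probability vector would.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in, NOT an OpenNLP class: a keyword-vote categorizer
// that mirrors the shape of categorize(String[] text).
final class ToyCategorizerSketch {
    private final List<String> categories;
    private final Map<String, Set<String>> keywords;

    ToyCategorizerSketch(List<String> categories, Map<String, Set<String>> keywords) {
        this.categories = List.copyOf(categories);
        this.keywords = Map.copyOf(keywords);
    }

    // Returns one probability per category, in the order of the category list.
    double[] categorize(String[] text) {
        double[] scores = new double[categories.size()];
        double total = 0.0;
        for (int i = 0; i < scores.length; i++) {
            Set<String> cues = keywords.getOrDefault(categories.get(i), Set.of());
            double count = 1.0; // add-one smoothing so no category gets zero
            for (String token : text) {
                if (cues.contains(token)) {
                    count++;
                }
            }
            scores[i] = count;
            total += count;
        }
        for (int i = 0; i < scores.length; i++) {
            scores[i] /= total; // normalize so the values sum to 1
        }
        return scores;
    }

    public static void main(String[] args) {
        ToyCategorizerSketch categorizer = new ToyCategorizerSketch(
                List.of("sports", "politics"),
                Map.of("sports", Set.of("goal", "match"), "politics", Set.of("vote")));
        // The caller tokenizes first; here a whitespace split is assumed.
        String[] tokens = "the match ended with a late goal".split("\\s+");
        System.out.println(Arrays.toString(categorizer.categorize(tokens))); // prints [0.75, 0.25]
    }
}
```

The point of the sketch is the calling convention: the caller supplies already-tokenized text, and the position of each value in the result corresponds to a category index.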
-
getBestCategory
String getBestCategory(double[] outcome)
Gets the best category from previously generated outcome probabilities.
- Parameters:
outcome - a vector of outcome probabilities
- Returns:
the best category String
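Selecting the best category from an outcome vector amounts to an argmax over the probabilities. The following is a minimal sketch under that assumption; BestCategorySketch and its explicit category list are hypothetical and not part of OpenNLP, and an implementation may break ties differently.

```java
import java.util.List;

// Hypothetical stand-in, NOT an OpenNLP class: shows the argmax selection
// that a getBestCategory implementation is assumed to perform.
final class BestCategorySketch {

    // Returns the category with the highest probability; ties resolve to the
    // lowest index. Assumes outcome[i] belongs to categories.get(i).
    static String bestCategory(List<String> categories, double[] outcome) {
        if (outcome.length == 0 || outcome.length != categories.size()) {
            throw new IllegalArgumentException("outcome must match the category list");
        }
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i;
            }
        }
        return categories.get(best);
    }

    public static void main(String[] args) {
        List<String> categories = List.of("sports", "politics", "tech");
        double[] outcome = {0.2, 0.1, 0.7};
        System.out.println(bestCategory(categories, outcome)); // prints tech
    }
}
```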
-
getIndex
int getIndex(String category)
Gets the index of a certain category.
- Parameters:
category - the category
- Returns:
an index
-
getCategory
String getCategory(int index)
Gets the category at a given index.
- Parameters:
index - the index
- Returns:
a category
-
getNumberOfCategories
int getNumberOfCategories()
Gets the number of categories.
- Returns:
the number of categories
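getIndex, getCategory, and getNumberOfCategories together describe a fixed, ordered list of category names. The sketch below illustrates that relationship with a hypothetical CategoryIndexSketch class backed by a plain list; it is not an OpenNLP class, and returning -1 for an unknown category is an assumption of this sketch rather than something the interface documents.

```java
import java.util.List;

// Hypothetical stand-in, NOT an OpenNLP class: an ordered category list
// exposing the three lookup methods described above.
final class CategoryIndexSketch {
    private final List<String> categories;

    CategoryIndexSketch(List<String> categories) {
        this.categories = List.copyOf(categories);
    }

    // Assumption of this sketch: an unknown category yields -1.
    int getIndex(String category) {
        return categories.indexOf(category);
    }

    String getCategory(int index) {
        return categories.get(index);
    }

    int getNumberOfCategories() {
        return categories.size();
    }

    public static void main(String[] args) {
        CategoryIndexSketch sketch = new CategoryIndexSketch(List.of("sports", "politics", "tech"));
        System.out.println(sketch.getIndex("tech"));          // prints 2
        System.out.println(sketch.getCategory(0));            // prints sports
        System.out.println(sketch.getNumberOfCategories());   // prints 3
    }
}
```

Under this model, getCategory(getIndex(c)) returns c for every known category, and valid indexes run from 0 to getNumberOfCategories() - 1.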
-
getAllResults
String getAllResults(double[] results)
Gets the name of the category associated with the given probabilities.
- Parameters:
results - the probabilities of each category
- Returns:
the name of the outcome
-
scoreMap
Map<String,Double> scoreMap(String[] text)
Returns a map in which the key is the category name and the value is the score.
- Parameters:
text - the input text to classify
- Returns:
a map with the category name as the key and its score as the value
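A map from category name to score, matching the declared Map<String,Double> return type, can be derived from a category list and its probability vector. The following is a minimal sketch of that pairing; ScoreMapSketch is hypothetical, takes the probabilities directly instead of classifying text, and uses a LinkedHashMap only to keep the category order stable for printing.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT an OpenNLP class: pairs each category name
// with its probability, as a Map<String, Double>.
final class ScoreMapSketch {

    // Assumes probabilities[i] is the score of categories.get(i).
    static Map<String, Double> scoreMap(List<String> categories, double[] probabilities) {
        if (categories.size() != probabilities.length) {
            throw new IllegalArgumentException("one probability per category is required");
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int i = 0; i < probabilities.length; i++) {
            scores.put(categories.get(i), probabilities[i]);
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Double> scores =
                scoreMap(List.of("sports", "politics", "tech"), new double[] {0.2, 0.1, 0.7});
        System.out.println(scores); // prints {sports=0.2, politics=0.1, tech=0.7}
    }
}
```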
-
sortedScoreMap
SortedMap<Double,Set<String>> sortedScoreMap(String[] text)
Gets a map of the scores, sorted in ascending order, together with their associated categories. Many categories can have the same score, hence the Set as value.
- Parameters:
text - the input text to classify
- Returns:
a map with the score as the key. The value is a Set of the categories with that score.
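The reason for the Set value becomes visible once two categories share a score. The sketch below groups categories by score in a TreeMap, which keeps keys in ascending order; SortedScoreMapSketch is hypothetical, is not an OpenNLP class, and takes the probabilities directly instead of classifying text.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in, NOT an OpenNLP class: groups categories by score,
// with scores in ascending order.
final class SortedScoreMapSketch {

    // Assumes probabilities[i] is the score of categories.get(i).
    static SortedMap<Double, Set<String>> sortedScoreMap(List<String> categories,
                                                         double[] probabilities) {
        if (categories.size() != probabilities.length) {
            throw new IllegalArgumentException("one probability per category is required");
        }
        SortedMap<Double, Set<String>> sorted = new TreeMap<>();
        for (int i = 0; i < probabilities.length; i++) {
            // Categories with an identical score end up in the same set.
            sorted.computeIfAbsent(probabilities[i], k -> new TreeSet<>())
                  .add(categories.get(i));
        }
        return sorted;
    }

    public static void main(String[] args) {
        SortedMap<Double, Set<String>> sorted = sortedScoreMap(
                List.of("sports", "politics", "tech", "health"),
                new double[] {0.25, 0.25, 0.4, 0.1});
        System.out.println(sorted); // prints {0.1=[health], 0.25=[politics, sports], 0.4=[tech]}
        System.out.println(sorted.get(sorted.lastKey())); // prints [tech]
    }
}
```

Because the ordering is ascending, the highest-scoring categories sit under lastKey() rather than firstKey().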
-
-