Package opennlp.tools.doccat
Interface DocumentCategorizer
All Known Implementing Classes: DocumentCategorizerME
public interface DocumentCategorizer
Interface for classes which categorize documents.
Method Summary
Modifier and Type, Method, Description
double[] categorize(String[] text)
    Categorizes the given text, provided as separate tokens.
double[] categorize(String[] text, Map<String,Object> extraInformation)
    Categorizes the given text, provided as tokens, along with the provided extra information.
String getAllResults(double[] results)
    Gets the name of the category associated with the given probabilities.
String getBestCategory(double[] outcome)
    Gets the best category from previously generated outcome probabilities.
String getCategory(int index)
    Gets the category at a given index.
int getIndex(String category)
    Gets the index of a given category.
int getNumberOfCategories()
    Gets the number of categories.
Map<String,Double> scoreMap(String[] text)
    Returns a map in which the key is the category name and the value is the score.
SortedMap<Double,Set<String>> sortedScoreMap(String[] text)
    Gets a map of the scores, sorted in ascending order, together with their associated categories.
Method Detail
-
categorize
double[] categorize(String[] text, Map<String,Object> extraInformation)
Categorizes the given text, provided as tokens, along with the provided extra information.
Parameters:
text - the tokens of the text to categorize
extraInformation - extra information
Returns:
per-category probabilities
-
categorize
double[] categorize(String[] text)
Categorizes the given text, provided as separate tokens.
Parameters:
text - the tokens of the text to categorize
Returns:
per-category probabilities
-
getBestCategory
String getBestCategory(double[] outcome)
Gets the best category from previously generated outcome probabilities.
Parameters:
outcome - a vector of outcome probabilities
Returns:
the best category, as a String
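The interface does not say how "best" is determined. The sketch below assumes it means the category with the highest probability, and that the position of each value in the outcome array matches the index used by getCategory(int). The class BestCategorySketch and its bestCategory helper are hypothetical and not part of OpenNLP.

```java
/** Hypothetical helper, not part of OpenNLP: picks the highest-probability category. */
final class BestCategorySketch {

    // Assumption: categories[i] is the name that getCategory(i) would return,
    // and outcome[i] is the probability assigned to that category.
    static String bestCategory(String[] categories, double[] outcome) {
        if (outcome.length == 0 || categories.length != outcome.length) {
            throw new IllegalArgumentException("expected one probability per category");
        }
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i; // on ties, the earliest index is kept
            }
        }
        return categories[best];
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] outcome = {0.15, 0.25, 0.60};
        System.out.println(bestCategory(categories, outcome)); // prints "tech"
    }
}
```

With a real implementation, the outcome array would normally come from categorize(String[]) on the same categorizer instance.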
-
getIndex
int getIndex(String category)
Gets the index of a given category.
Parameters:
category - the category
Returns:
an index
-
getCategory
String getCategory(int index)
Gets the category at a given index.
Parameters:
index - the index
Returns:
a category
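getIndex(String) and getCategory(int) read most naturally as inverse lookups over a fixed list of categories. The sketch below illustrates that relationship with a hypothetical stand-in class, CategoryIndexSketch, which is not part of OpenNLP. Returning -1 for an unknown category is a choice made by this sketch; the interface does not specify that case.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in, not part of OpenNLP: index and name lookups over a fixed category list. */
final class CategoryIndexSketch {

    private final List<String> categories;

    CategoryIndexSketch(String... categories) {
        this.categories = Arrays.asList(categories.clone());
    }

    // Assumption of this sketch: an unknown category yields -1.
    int getIndex(String category) {
        return categories.indexOf(category);
    }

    String getCategory(int index) {
        return categories.get(index);
    }

    int getNumberOfCategories() {
        return categories.size();
    }

    public static void main(String[] args) {
        CategoryIndexSketch sketch = new CategoryIndexSketch("sports", "politics", "tech");
        int index = sketch.getIndex("tech");
        System.out.println(index);                     // prints "2"
        System.out.println(sketch.getCategory(index)); // prints "tech"
    }
}
```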
-
getNumberOfCategories
int getNumberOfCategories()
Gets the number of categories.
Returns:
the number of categories
-
getAllResults
String getAllResults(double[] results)
Gets the name of the category associated with the given probabilities.
Parameters:
results - the probabilities of each category
Returns:
the name of the outcome
-
scoreMap
Map<String,Double> scoreMap(String[] text)
Returns a map in which the key is the category name and the value is the score.
Parameters:
text - the input text to classify
Returns:
a map with the category name as the key and its score as the value
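Given the return type Map<String,Double>, the result pairs each category name with its score. The sketch below shows one way such a map could be assembled from a per-category score array. ScoreMapSketch is a hypothetical helper, not OpenNLP code, and it assumes that scores[i] belongs to categories[i].

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of OpenNLP: pairs category names with their scores. */
final class ScoreMapSketch {

    // Assumption: scores[i] is the score of categories[i].
    static Map<String, Double> scoreMap(String[] categories, double[] scores) {
        if (categories.length != scores.length) {
            throw new IllegalArgumentException("expected one score per category");
        }
        Map<String, Double> result = new HashMap<>();
        for (int i = 0; i < categories.length; i++) {
            result.put(categories[i], scores[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] scores = {0.15, 0.25, 0.60};
        Map<String, Double> byCategory = scoreMap(categories, scores);
        System.out.println(byCategory.get("tech")); // prints "0.6"
    }
}
```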
-
sortedScoreMap
SortedMap<Double,Set<String>> sortedScoreMap(String[] text)
Gets a map of the scores, sorted in ascending order, together with their associated categories. Several categories can have the same score, hence the Set as the value.
Parameters:
text - the input text to classify
Returns:
a map with the score as the key. The value is a Set of the categories that have that score.
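The grouping of tied categories under a single score key can be sketched with a TreeMap, whose keys iterate in ascending order. SortedScoreMapSketch is a hypothetical helper, not OpenNLP code, and it assumes that scores[i] belongs to categories[i].

```java
import java.util.HashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper, not part of OpenNLP: groups categories by score, ascending. */
final class SortedScoreMapSketch {

    // Assumption: scores[i] is the score of categories[i].
    static SortedMap<Double, Set<String>> sortedScoreMap(String[] categories, double[] scores) {
        if (categories.length != scores.length) {
            throw new IllegalArgumentException("expected one score per category");
        }
        SortedMap<Double, Set<String>> result = new TreeMap<>();
        for (int i = 0; i < categories.length; i++) {
            // Categories with an identical score share one Set under the same key.
            result.computeIfAbsent(scores[i], key -> new HashSet<>()).add(categories[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] scores = {0.2, 0.6, 0.2};
        SortedMap<Double, Set<String>> sorted = sortedScoreMap(categories, scores);
        System.out.println(sorted.size());                 // prints "2"
        System.out.println(sorted.lastKey());              // prints "0.6"
        System.out.println(sorted.get(sorted.lastKey()));  // prints "[politics]"
    }
}
```

Because the keys are ascending, lastKey() gives the highest score and firstKey() the lowest.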