java.lang.Object
- opennlp.tools.doccat.DocumentCategorizerME

All Implemented Interfaces:

DocumentCategorizer
```
public class DocumentCategorizerME
extends Object
implements DocumentCategorizer
```
A Max-Ent based implementation of DocumentCategorizer.

Constructor Summary

Constructor	Description
`DocumentCategorizerME(DoccatModel model)`	Initializes a `DocumentCategorizerME` instance with a doccat model.

Method Summary

Modifier and Type	Method	Description
`double[]`	`categorize(String[] text)`	Categorizes the given `text`, provided in separate tokens.
`double[]`	`categorize(String[] text, Map<String,Object> extraInformation)`	Categorizes the given `text`, provided as tokens, along with the provided extra information.
`String`	`getAllResults(double[] results)`	Retrieves a `String` listing all categories together with the given probabilities.
`String`	`getBestCategory(double[] outcome)`	Retrieves the best category from previously generated `outcome` probabilities.
`String`	`getCategory(int index)`	Retrieves the category at a given `index`.
`int`	`getIndex(String category)`	Retrieves the index of a certain category.
`int`	`getNumberOfCategories()`	Retrieves the number of categories.
`Map<String,Double>`	`scoreMap(String[] text)`	Retrieves a `Map` in which the key is the category name and the value is the score.
`SortedMap<Double,Set<String>>`	`sortedScoreMap(String[] text)`	Retrieves a `SortedMap` of the scores sorted in ascending order, together with their associated categories.
`static DoccatModel`	`train(String lang, ObjectStream<DocumentSample> samples, TrainingParameters mlParams, DoccatFactory factory)`	Starts a training of a `DoccatModel` with the given parameters.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - DocumentCategorizerME
```
public DocumentCategorizerME(DoccatModel model)
```
    Initializes a DocumentCategorizerME instance with a doccat model. Default feature generation is used.
    
    Parameters:
    
    model - the DoccatModel to be used for categorization.
- Method Detail
  - categorize
```
public double[] categorize(String[] text,
                           Map<String,Object> extraInformation)
```
    Categorizes the given text, provided as tokens, along with the provided extra information.
    
    Specified by:
    
    categorize in interface DocumentCategorizer
    
    Parameters:
    
    text - The text tokens to categorize.
    
    extraInformation - Additional information for context to be used by the feature generator.
    
    Returns:
    
    The per-category probabilities.
  - categorize
```
public double[] categorize(String[] text)
```
    Description copied from interface: DocumentCategorizer
    
    Categorizes the given text, provided in separate tokens.
    
    Specified by:
    
    categorize in interface DocumentCategorizer
    
    Parameters:
    
    text - The tokens of text to categorize.
    
    Returns:
    
    The per-category probabilities.
  - scoreMap
```
public Map<String,Double> scoreMap(String[] text)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves a Map in which the key is the category name and the value is the score.
    
    Specified by:
    
    scoreMap in interface DocumentCategorizer
    
    Parameters:
    
    text - The tokenized input text to classify.
    
    Returns:
    
    A Map with the category name as key and the score as value.
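    
    The mapping described above can be illustrated with a small stand-alone sketch. It does not call OpenNLP; the class `ScoreMapSketch`, the category names, and the probabilities are hypothetical, and the only assumption is that each probability belongs to the category at the same index.
    
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone illustration; this is not the OpenNLP implementation.
final class ScoreMapSketch {

    // Pairs each category name (key) with the probability at the same index (value).
    static Map<String, Double> scoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("one probability per category expected");
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int i = 0; i < categories.length; i++) {
            scores.put(categories[i], probabilities[i]);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hypothetical categories and probabilities, chosen for illustration only.
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.2, 0.5, 0.3};
        System.out.println(scoreMap(categories, probabilities));
    }
}
```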
  - sortedScoreMap
```
public SortedMap<Double,Set<String>> sortedScoreMap(String[] text)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves a SortedMap of the scores sorted in ascending order, together with their associated categories.
    Many categories can have the same score, hence the Set as value.
    
    Specified by:
    
    sortedScoreMap in interface DocumentCategorizer
    
    Parameters:
    
    text - the input text to classify
    
    Returns:
    
    A SortedMap with the score as a key.
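    
    To show why the value type is a Set, here is a stand-alone sketch that groups categories by score in a `TreeMap`. It does not call OpenNLP; the class `SortedScoreMapSketch` and its inputs are hypothetical, and the grouping logic is only one plausible way to obtain the documented shape.
    
```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Stand-alone illustration; this is not the OpenNLP implementation.
final class SortedScoreMapSketch {

    // Groups categories by score; TreeMap keeps the scores in ascending order.
    static SortedMap<Double, Set<String>> sortedScoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("one probability per category expected");
        }
        SortedMap<Double, Set<String>> sorted = new TreeMap<>();
        for (int i = 0; i < categories.length; i++) {
            // Categories with an identical score end up in the same Set.
            sorted.computeIfAbsent(probabilities[i], k -> new TreeSet<>()).add(categories[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical input in which two categories tie.
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.25, 0.5, 0.25};
        SortedMap<Double, Set<String>> sorted = sortedScoreMap(categories, probabilities);
        System.out.println(sorted);
        // Because the order is ascending, the highest score is the last key.
        System.out.println(sorted.get(sorted.lastKey()));
    }
}
```
    
    With an ascending order, the top-scoring categories sit at `lastKey()`, not at the first entry.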
  - getBestCategory
```
public String getBestCategory(double[] outcome)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves the best category from previously generated outcome probabilities.
    
    Specified by:
    
    getBestCategory in interface DocumentCategorizer
    
    Parameters:
    
    outcome - An array of computed outcome probabilities.
    
    Returns:
    
    The best category represented as String.
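    
    As a rough illustration, selecting the best category can be sketched as an argmax over the probability array. This stand-alone sketch does not call OpenNLP. It assumes that "best" means the highest probability; breaking ties in favour of the lowest index is a choice made here, not documented behaviour. The class `BestCategorySketch` and its inputs are hypothetical.
    
```java
// Stand-alone illustration; this is not the OpenNLP implementation.
final class BestCategorySketch {

    // Returns the category whose probability is highest (first one wins on ties).
    static String bestCategory(String[] categories, double[] outcome) {
        if (outcome.length == 0 || categories.length != outcome.length) {
            throw new IllegalArgumentException("need one probability per category");
        }
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i;
            }
        }
        return categories[best];
    }

    public static void main(String[] args) {
        // Hypothetical categories and probabilities, chosen for illustration only.
        String[] categories = {"sports", "politics", "tech"};
        double[] outcome = {0.2, 0.5, 0.3};
        System.out.println(bestCategory(categories, outcome));
    }
}
```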
  - getIndex
```
public int getIndex(String category)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves the index of a certain category.
    
    Specified by:
    
    getIndex in interface DocumentCategorizer
    
    Parameters:
    
    category - The category for which the index is to be found.
    
    Returns:
    
    The index.
  - getCategory
```
public String getCategory(int index)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves the category at a given index.
    
    Specified by:
    
    getCategory in interface DocumentCategorizer
    
    Parameters:
    
    index - The index for which the category shall be found.
    
    Returns:
    
    The category represented as String.
  - getNumberOfCategories
```
public int getNumberOfCategories()
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves the number of categories.
    
    Specified by:
    
    getNumberOfCategories in interface DocumentCategorizer
    
    Returns:
    
    The number of categories.
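    
    The three lookup methods getIndex, getCategory, and getNumberOfCategories describe a mapping between category names and positions. The stand-alone sketch below models that relationship with a plain array. It does not call OpenNLP; the class `CategoryIndexSketch` is hypothetical, and returning -1 for an unknown category is an assumption of this sketch.
    
```java
// Stand-alone illustration; this is not the OpenNLP implementation.
final class CategoryIndexSketch {

    private final String[] categories;

    CategoryIndexSketch(String... categories) {
        this.categories = categories.clone();
    }

    // Position of the category, or -1 if it is unknown (assumption of this sketch).
    int getIndex(String category) {
        for (int i = 0; i < categories.length; i++) {
            if (categories[i].equals(category)) {
                return i;
            }
        }
        return -1;
    }

    // Category stored at the given position.
    String getCategory(int index) {
        return categories[index];
    }

    int getNumberOfCategories() {
        return categories.length;
    }

    public static void main(String[] args) {
        CategoryIndexSketch index = new CategoryIndexSketch("sports", "politics", "tech");
        // Walk all positions and print the round trip index -> category -> index.
        for (int i = 0; i < index.getNumberOfCategories(); i++) {
            String category = index.getCategory(i);
            System.out.println(i + " -> " + category + " -> " + index.getIndex(category));
        }
    }
}
```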
  - getAllResults
```
public String getAllResults(double[] results)
```
    Description copied from interface: DocumentCategorizer
    
    Retrieves a String listing all categories together with the given probabilities.
    
    Specified by:
    
    getAllResults in interface DocumentCategorizer
    
    Parameters:
    
    results - The probabilities of each category.
    
    Returns:
    
    A String representation of all categories and their probabilities.
  - train
```
public static DoccatModel train(String lang,
                                ObjectStream<DocumentSample> samples,
                                TrainingParameters mlParams,
                                DoccatFactory factory)
                         throws IOException
```
    Starts a training of a DoccatModel with the given parameters.
    
    Parameters:
    
    lang - The ISO-conformant language code.
    
    samples - The ObjectStream of DocumentSample used as input for training.
    
    mlParams - The TrainingParameters for the context of the training.
    
    factory - The DoccatFactory for creating related objects defined via mlParams.
    
    Returns:
    
    A valid, trained DoccatModel instance.
    
    Throws:
    
    IOException - Thrown if an I/O error occurs.
