Class DocumentCategorizerME

    • Constructor Detail

      • DocumentCategorizerME

        public DocumentCategorizerME(DoccatModel model)
        Initializes the current instance with a doccat model. Default feature generation is used.
        Parameters:
        model - the doccat model
    • Method Detail

      • categorize

        public double[] categorize(String[] text,
                                   Map<String,Object> extraInformation)
        Categorizes the given text, provided as tokens, along with the provided extra information.
        Specified by:
        categorize in interface DocumentCategorizer
        Parameters:
        text - text tokens to categorize
        extraInformation - additional information
        Returns:
        per-category probabilities
      • categorize

        public double[] categorize(String[] text)
        Categorizes the given text.
        Specified by:
        categorize in interface DocumentCategorizer
        Parameters:
        text - the text to categorize
        Returns:
        per-category probabilities
      • scoreMap

        public Map<String,Double> scoreMap(String[] text)
        Returns a map in which the key is the category name and the value is the score.
        Specified by:
        scoreMap in interface DocumentCategorizer
        Parameters:
        text - the input text to classify
        Returns:
        the score map
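        The shape of the returned map can be illustrated with a minimal, library-free sketch. It assumes the category names and the per-category probabilities are available as two parallel arrays; the class ScoreMapSketch and the method toScoreMap are hypothetical names used only for illustration and are not part of OpenNLP.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the OpenNLP implementation.
final class ScoreMapSketch {

    // Pairs each category name with the probability stored at the same index.
    // Assumption: both arrays are parallel, as a stand-in for the model's
    // category list and the array returned by categorize(...).
    static Map<String, Double> toScoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("categories and probabilities must have the same length");
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int i = 0; i < categories.length; i++) {
            scores.put(categories[i], probabilities[i]);
        }
        return scores;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.7, 0.2, 0.1};
        // Look up the score of a single category by name.
        System.out.println(toScoreMap(categories, probabilities).get("politics"));
    }
}
```

        A LinkedHashMap is used here only so that the sketch iterates in input order; the iteration order of the map returned by scoreMap is not stated on this page.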
      • sortedScoreMap

        public SortedMap<Double,Set<String>> sortedScoreMap(String[] text)
        Returns a map keyed by score, in ascending order. Each value is the Set of categories with that score; several categories can share a score, hence the Set.
        Specified by:
        sortedScoreMap in interface DocumentCategorizer
        Parameters:
        text - the input text to classify
        Returns:
        the sorted score map
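        The structure described above, a score-keyed map whose values group tied categories, can be sketched without the library. This is a minimal illustration that assumes parallel arrays of category names and probabilities; SortedScoreMapSketch and toSortedScoreMap are hypothetical names, and the sketch is not the OpenNLP implementation.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; not the OpenNLP implementation.
final class SortedScoreMapSketch {

    // Groups category names by score. TreeMap keeps its keys in ascending
    // natural order, so the lowest score comes first and the highest last.
    static SortedMap<Double, Set<String>> toSortedScoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("categories and probabilities must have the same length");
        }
        SortedMap<Double, Set<String>> sorted = new TreeMap<>();
        for (int i = 0; i < categories.length; i++) {
            // Categories with an identical score end up in the same Set.
            sorted.computeIfAbsent(probabilities[i], score -> new HashSet<>()).add(categories[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.4, 0.2, 0.4};
        SortedMap<Double, Set<String>> sorted = toSortedScoreMap(categories, probabilities);
        // Because keys ascend, lastKey() is the highest score.
        System.out.println(sorted.get(sorted.lastKey()));
    }
}
```

        Because the keys ascend, callers that want the top-scoring categories read from the end of the map, for example with lastKey().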
      • getBestCategory

        public String getBestCategory(double[] outcome)
        Description copied from interface: DocumentCategorizer
        Gets the best category from previously generated outcome probabilities.
        Specified by:
        getBestCategory in interface DocumentCategorizer
        Parameters:
        outcome - a vector of outcome probabilities
        Returns:
        the best category String
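        Selecting the best category from a probability vector amounts to finding the index of the largest value. The sketch below shows that idea under stated assumptions: the category names are passed explicitly as a parallel array (the real method takes only the outcome array), ties resolve to the first maximum, and BestCategorySketch and bestCategory are hypothetical names. How the library itself breaks ties is not stated on this page.

```java
// Hypothetical illustration only; not the OpenNLP implementation.
final class BestCategorySketch {

    // Returns the category whose probability is largest (an argmax).
    // Assumption: ties resolve to the first maximum encountered.
    static String bestCategory(String[] categories, double[] outcome) {
        if (outcome.length == 0 || categories.length != outcome.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i;
            }
        }
        return categories[best];
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] outcome = {0.1, 0.6, 0.3};
        System.out.println(bestCategory(categories, outcome));
    }
}
```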
      • getAllResults

        public String getAllResults(double[] results)
        Description copied from interface: DocumentCategorizer
        Gets the name of the category associated with the given probabilities.
        Specified by:
        getAllResults in interface DocumentCategorizer
        Parameters:
        results - the probabilities of each category
        Returns:
        the name of the outcome