Class LemmatizerME

    • Method Detail

      • lemmatize

        public String[] lemmatize​(String[] toks,
                                  String[] tags)
        Description copied from interface: Lemmatizer
        Generates a lemma for each token, given its POS tag.
        Specified by:
        lemmatize in interface Lemmatizer
        Parameters:
        toks - An array of the tokens.
        tags - An array of the POS tags.
        Returns:
        An array of lemmas, one for each token in the toks sequence.
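        The calling contract above (parallel toks and tags arrays in, one lemma per token out) can be sketched without a trained model. The class below is a hypothetical stand-in: ToyLemmatizer, its lookup table, and its length check are invented for illustration and are not part of this library; only the method shape mirrors lemmatize(String[], String[]).

```java
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT LemmatizerME.
final class ToyLemmatizer {
    // Invented lookup table keyed by "token/tag".
    private static final Map<String, String> LEMMAS = Map.of(
        "dogs/NNS", "dog",
        "were/VBD", "be",
        "barking/VBG", "bark");

    // Mirrors the shape of lemmatize(String[], String[]):
    // parallel arrays in, one lemma per token out.
    static String[] lemmatize(String[] toks, String[] tags) {
        // Assumption of this sketch: both arrays must have the same length.
        if (toks.length != tags.length) {
            throw new IllegalArgumentException("toks and tags must have the same length");
        }
        String[] lemmas = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            String key = toks[i].toLowerCase() + "/" + tags[i];
            // Fall back to the lower-cased token when the toy table has no entry.
            lemmas[i] = LEMMAS.getOrDefault(key, toks[i].toLowerCase());
        }
        return lemmas;
    }

    public static void main(String[] args) {
        String[] toks = {"Dogs", "were", "barking"};
        String[] tags = {"NNS", "VBD", "VBG"};
        System.out.println(String.join(" ", lemmatize(toks, tags))); // prints "dog be bark"
    }
}
```

        With a real model the same pattern applies: the i-th entry of the result corresponds to the i-th token and the i-th tag.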
      • lemmatize

        public List<List<String>> lemmatize​(List<String> toks,
                                            List<String> tags)
        Description copied from interface: Lemmatizer
        Generates every possible lemma for each token, given its POS tag.
        Specified by:
        lemmatize in interface Lemmatizer
        Parameters:
        toks - A list of the tokens.
        tags - A list of the POS tags.
        Returns:
        A list of every possible lemma for each token in the toks sequence.
      • predictSES

        public String[] predictSES​(String[] toks,
                                   String[] tags)
        Predicts the Short Edit Script (SES), an automatically induced lemma class, for each token.
        Parameters:
        toks - An array of tokens.
        tags - An array of postags.
        Returns:
        An array of possible lemma classes for each token in toks.
      • predictLemmas

        public String[][] predictLemmas​(int numLemmas,
                                        String[] toks,
                                        String[] tags)
        Predicts all possible lemmas, using numLemmas as the upper bound.
        Parameters:
        numLemmas - The upper bound on the number of lemmas to predict.
        toks - An array of tokens.
        tags - An array of postags.
        Returns:
        A 2-dimensional array containing all possible lemmas for each token and postag pair.
      • decodeLemmas

        public static String[] decodeLemmas​(String[] toks,
                                            String[] preds)
        Decodes the lemma from the word and the induced lemma class.
        Parameters:
        toks - An array of tokens.
        preds - An array of predicted lemma classes.
        Returns:
        The array of decoded lemmas.
      • encodeLemmas

        public static String[] encodeLemmas​(String[] toks,
                                            String[] lemmas)
        Encodes each token and its lemma as an induced lemma class.
        Parameters:
        toks - An array of tokens.
        lemmas - An array of lemmas.
        Returns:
        The array of lemma classes.
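        The relationship between decodeLemmas and encodeLemmas can be illustrated with a deliberately simplified edit script. The sketch below is an assumption-laden toy, not the encoding this class actually uses: ToyEditScript and its "strip:suffix" format (number of trailing characters to remove, then the suffix to append) are invented for illustration. It shows why a lemma class learned from one token can be applied to another.

```java
// Hypothetical toy scheme for illustration; NOT the library's real edit-script format.
final class ToyEditScript {

    // Encode a (token, lemma) pair as "<chars to strip>:<suffix to append>".
    static String encode(String tok, String lemma) {
        int p = 0;
        int max = Math.min(tok.length(), lemma.length());
        // Length of the longest common prefix of token and lemma.
        while (p < max && tok.charAt(p) == lemma.charAt(p)) {
            p++;
        }
        return (tok.length() - p) + ":" + lemma.substring(p);
    }

    // Apply a toy lemma class to a token to recover the lemma.
    static String decode(String tok, String lemmaClass) {
        int sep = lemmaClass.indexOf(':');
        int strip = Integer.parseInt(lemmaClass.substring(0, sep));
        return tok.substring(0, tok.length() - strip) + lemmaClass.substring(sep + 1);
    }

    // Array versions with the same shape as encodeLemmas / decodeLemmas.
    static String[] encodeLemmas(String[] toks, String[] lemmas) {
        String[] classes = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            classes[i] = encode(toks[i], lemmas[i]);
        }
        return classes;
    }

    static String[] decodeLemmas(String[] toks, String[] preds) {
        String[] lemmas = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            lemmas[i] = decode(toks[i], preds[i]);
        }
        return lemmas;
    }

    public static void main(String[] args) {
        String cls = encode("walking", "walk");     // "3:" in this toy format
        System.out.println(cls);
        // The same class generalizes to a token never seen with its lemma.
        System.out.println(decode("talking", cls)); // prints "talk"
    }
}
```

        In this toy format, encoding followed by decoding on the same tokens returns the original lemmas, which is the round trip the two static methods above are paired for.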
      • topKSequences

        public Sequence[] topKSequences​(String[] sentence,
                                        String[] tags)
        Parameters:
        sentence - An array of tokens.
        tags - An array of postags.
        Returns:
        The top-k sequences.
      • topKSequences

        public Sequence[] topKSequences​(String[] sentence,
                                        String[] tags,
                                        double minSequenceScore)
        Parameters:
        sentence - An array of tokens.
        tags - An array of postags.
        minSequenceScore - The minimum score a sequence must achieve to be returned.
        Returns:
        The top-k sequences.
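        The effect of a minimum score on a top-k selection can be sketched in isolation. Everything below is hypothetical: ToyTopK, its Scored type, the explicit k argument, and the inclusive comparison against the threshold are assumptions made for illustration, and the sketch does not reproduce how this class searches for sequences.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of "top-k above a minimum score"; not the library's search.
final class ToyTopK {

    // Invented candidate type: a label standing in for a sequence, plus its score.
    static final class Scored {
        final String label;
        final double score;

        Scored(String label, double score) {
            this.label = label;
            this.score = score;
        }
    }

    static List<Scored> topK(List<Scored> candidates, int k, double minScore) {
        List<Scored> kept = new ArrayList<>();
        for (Scored c : candidates) {
            // Assumption: the bound is inclusive in this sketch.
            if (c.score >= minScore) {
                kept.add(c);
            }
        }
        // Highest score first.
        kept.sort(Comparator.comparingDouble((Scored c) -> c.score).reversed());
        return new ArrayList<>(kept.subList(0, Math.min(k, kept.size())));
    }

    public static void main(String[] args) {
        List<Scored> candidates = List.of(
            new Scored("A", -0.2), new Scored("B", -3.0),
            new Scored("C", -1.1), new Scored("D", -0.7));
        for (Scored s : topK(candidates, 2, -2.0)) {
            System.out.println(s.label + " " + s.score);
        }
    }
}
```

        Raising the threshold can return fewer than k results, so callers of the three-argument overload should not assume a fixed array length.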
      • probs

        public void probs​(double[] probs)
        Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize(String[], String[]).

        The specified array should be at least as large as the number of tokens in the previous call to lemmatize(String[], String[]).

        Parameters:
        probs - An array used to hold the probabilities of the last decoded sequence.
      • probs

        public double[] probs()
        Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize(String[], String[]).
        Returns:
        An array containing one probability for each token passed to the most recent call to lemmatize(String[], String[]).
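        The two probs overloads differ only in who owns the array. The sketch below illustrates that contract with a hypothetical ToyDecoder: the class, its invented probability values, and the exception thrown for an undersized buffer are assumptions, not behavior documented for this class.

```java
// Hypothetical illustration of the two probs() overloads; NOT LemmatizerME.
final class ToyDecoder {
    private double[] lastProbs = new double[0];

    // Stand-in for a lemmatize call: records one invented probability per token.
    void decode(String[] toks) {
        lastProbs = new double[toks.length];
        for (int i = 0; i < toks.length; i++) {
            lastProbs[i] = 1.0 / (i + 2); // invented values: 0.5, 0.333..., 0.25, ...
        }
    }

    // Caller-supplied buffer: must hold at least one slot per token of the last call.
    void probs(double[] dest) {
        if (dest.length < lastProbs.length) {
            // Assumption of this sketch: an undersized buffer is rejected explicitly.
            throw new IllegalArgumentException("buffer smaller than last token count");
        }
        System.arraycopy(lastProbs, 0, dest, 0, lastProbs.length);
    }

    // Fresh array: exactly one probability per token of the last call.
    double[] probs() {
        return lastProbs.clone();
    }

    public static void main(String[] args) {
        ToyDecoder decoder = new ToyDecoder();
        decoder.decode(new String[] {"Dogs", "were", "barking"});
        double[] reusable = new double[8]; // larger than needed is fine
        decoder.probs(reusable);
        System.out.println(decoder.probs().length); // prints 3
    }
}
```

        Reusing one large buffer avoids an allocation per sentence, while the no-argument overload is simpler when the result is only read once.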
      • topKLemmaClasses

        public Sequence[] topKLemmaClasses​(String[] sentence,
                                           String[] tags)
        Parameters:
        sentence - An array of tokens.
        tags - An array of postags.
        Returns:
        The top-k lemma classes.
      • topKLemmaClasses

        public Sequence[] topKLemmaClasses​(String[] sentence,
                                           String[] tags,
                                           double minSequenceScore)
        Parameters:
        sentence - An array of tokens.
        tags - An array of postags.
        minSequenceScore - The minimum score a sequence must achieve to be returned.
        Returns:
        The top-k lemma classes.