Class NameFinderME

java.lang.Object
opennlp.tools.namefind.NameFinderME
All Implemented Interfaces:
TokenNameFinder

public class NameFinderME extends Object implements TokenNameFinder
A maximum-entropy-based name finder implementation.
  • Method Details

    • find

      public Span[] find(String[] tokens)
      Description copied from interface: TokenNameFinder
      Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
      Specified by:
      find in interface TokenNameFinder
      Parameters:
      tokens - An array of the tokens or words of the sequence, typically a sentence.
      Returns:
      An array of spans for each of the names identified.
    • find

      public Span[] find(String[] tokens, String[][] additionalContext)
      Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
      Parameters:
      tokens - An array of the tokens or words of a sequence, typically a sentence.
      additionalContext - Features which are based on context outside of the sentence but which should also be used.
      Returns:
      An array of token spans for each of the names identified.
    • clearAdaptiveData

      public void clearAdaptiveData()
      Description copied from interface: TokenNameFinder
      Forgets all adaptive data which was collected during previous calls to one of the find methods.

      Note: This method should typically be called at the end of the processing of a document.

      Specified by:
      clearAdaptiveData in interface TokenNameFinder
    • probs

      public void probs(double[] probs)
      Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]). The specified array should be at least as large as the number of tokens in the previous call to find(String[]).
      Parameters:
      probs - An array with the probabilities of the last decoded sequence.
    • probs

      public double[] probs()
      Retrieves the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]).
      Returns:
      An array with as many probabilities as there were tokens in the last call to find(String[]).
    • probs

      public double[] probs(Span[] spans)
      Retrieves one probability for each of the specified spans. Each value is the arithmetic mean of the probabilities of the outcomes which make up that span.
      Parameters:
      spans - The spans of the names for which probabilities are requested.
      Returns:
      An array of probabilities for each of the specified spans.
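      The averaging described for probs(Span[]) can be illustrated with a minimal, self-contained sketch. It does not use the library: TokenSpan and spanProbs are hypothetical stand-ins, and the sketch assumes that a span's start index is inclusive and its end index exclusive.

```java
import java.util.Arrays;

public class SpanProbabilitySketch {

    /** Hypothetical stand-in for a token span: start inclusive, end exclusive (assumption). */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns, for each span, the arithmetic mean of the per-token probabilities it covers. */
    static double[] spanProbs(TokenSpan[] spans, double[] tokenProbs) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            double sum = 0.0;
            for (int t = spans[i].start; t < spans[i].end; t++) {
                sum += tokenProbs[t];
            }
            result[i] = sum / (spans[i].end - spans[i].start);
        }
        return result;
    }

    public static void main(String[] args) {
        // One probability per token, as an array like the one probs() describes.
        double[] tokenProbs = {0.5, 1.0, 0.25, 0.75};
        TokenSpan[] spans = {new TokenSpan(0, 2), new TokenSpan(2, 4)};
        System.out.println(Arrays.toString(spanProbs(spans, tokenProbs)));
    }
}
```

      With the real API, the per-token probabilities would come from the last call to find(String[]) rather than from a hand-written array.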
    • train

      public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters params, TokenNameFinderFactory factory) throws IOException
      Starts a training of a TokenNameFinderModel with the given parameters.
      Parameters:
      languageCode - The ISO-conformant language code.
      type - The type to use.
      samples - The ObjectStream of NameSample used as input for training.
      params - The TrainingParameters for the context of the training.
      factory - The TokenNameFinderFactory for creating related objects defined via params.
      Returns:
      A valid, trained TokenNameFinderModel instance.
      Throws:
      IOException - Thrown if IO errors occurred during training.
    • dropOverlappingSpans

      public static Span[] dropOverlappingSpans(Span[] spans)
      Removes spans which intersect or cross in any way.

      The following rules are used to remove the spans:
      Identical spans: The first span in the array after sorting it remains.
      Intersecting spans: The first span after sorting remains.
      Contained spans: All spans which are contained by another are removed.

      Parameters:
      spans - The input spans.
      Returns:
      The resulting non-overlapping spans.
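      The three rules above can be sketched without the library as follows. This is an illustration under stated assumptions, not the library's implementation: TokenSpan and dropOverlaps are hypothetical names, spans are assumed to be start-inclusive and end-exclusive, and the sort order (start ascending, longer span first on equal starts) is an assumption chosen so that a containing span precedes the spans it contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DropOverlapsSketch {

    /** Hypothetical stand-in for a token span: start inclusive, end exclusive (assumption). */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** True if the two spans share at least one token position. */
        boolean intersects(TokenSpan other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ")";
        }
    }

    /** Keeps the first span of every overlapping group after sorting. */
    static List<TokenSpan> dropOverlaps(List<TokenSpan> spans) {
        List<TokenSpan> sorted = new ArrayList<>(spans);
        // Assumed ordering: by start ascending, then by end descending.
        sorted.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        List<TokenSpan> kept = new ArrayList<>();
        TokenSpan last = null;
        for (TokenSpan span : sorted) {
            // Identical, intersecting and contained spans all intersect the last kept span.
            if (last != null && last.intersects(span)) {
                continue;
            }
            kept.add(span);
            last = span;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TokenSpan> input = Arrays.asList(
                new TokenSpan(0, 2), new TokenSpan(0, 2),  // identical
                new TokenSpan(1, 3),                       // intersects [0..2)
                new TokenSpan(4, 8), new TokenSpan(5, 6),  // [5..6) is contained in [4..8)
                new TokenSpan(9, 10));
        System.out.println(dropOverlaps(input));
    }
}
```

      Comparing each span only against the last kept span is sufficient here because the kept spans are sorted by start and never overlap one another.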