java.lang.Object

opennlp.tools.namefind.NameFinderME

All Implemented Interfaces: TokenNameFinder

public class NameFinderME extends Object implements TokenNameFinder

A maximum-entropy-based name finder implementation.

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final String | CONTINUE | |
| static final int | DEFAULT_BEAM_SIZE | |
| static final String | OTHER | |
| static final String | START | |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| NameFinderME(TokenNameFinderModel model) | Initializes a NameFinderME with a TokenNameFinderModel. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | clearAdaptiveData() | Forgets all adaptive data which was collected during previous calls to one of the find methods. |
| static Span[] | dropOverlappingSpans(Span[] spans) | Removes spans which are intersecting or crossing in any way. |
| Span[] | find(String[] tokens) | Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names. |
| Span[] | find(String[] tokens, String[][] additionalContext) | Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names. |
| double[] | probs() | Retrieves the probabilities of the last decoded sequence. |
| void | probs(double[] probs) | Populates the specified array with the probabilities of the last decoded sequence. |
| double[] | probs(Span[] spans) | Retrieves a probability for each of the specified spans, computed as the arithmetic mean of the probabilities of the outcomes which make up the span. |
| static TokenNameFinderModel | train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters params, TokenNameFinderFactory factory) | Starts a training of a TokenNameFinderModel with the given parameters. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- DEFAULT_BEAM_SIZE
  
  public static final int DEFAULT_BEAM_SIZE
  See Also:
  
  Constant Field Values
- START
  
  public static final String START
  See Also:
  
  Constant Field Values
- CONTINUE
  
  public static final String CONTINUE
  See Also:
  
  Constant Field Values
- OTHER
  
  public static final String OTHER
  See Also:
  
  Constant Field Values
Constructor Details
- NameFinderME
  
  public NameFinderME(TokenNameFinderModel model)
  
  Initializes a NameFinderME with a TokenNameFinderModel.
  
  Parameters:
  
  model - The TokenNameFinderModel to initialize with.
Method Details
- find
  
  public Span[] find(String[] tokens)
  
  Description copied from interface: TokenNameFinder
  
  Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
  
  Specified by:
  
  find in interface TokenNameFinder
  
  Parameters:
  
  tokens - An array of the tokens or words of the sequence, typically a sentence.
  
  Returns:
  
  An array of spans for each of the names identified.
- find
  
  public Span[] find(String[] tokens, String[][] additionalContext)
  
  Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
  
  Parameters:
  
  tokens - An array of the tokens or words of a sequence, typically a sentence.
  
  additionalContext - Features which are based on context outside of the sentence but which should also be used.
  
  Returns:
  
  An array of token spans for each of the names identified.
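
As an illustration of how per-token name tags can be turned into token spans, the following is a minimal sketch and not the library's decoder. It assumes that outcomes are strings ending in the values of START, CONTINUE and OTHER, that those values are "start", "cont" and "other", and that typed outcomes look like "person-start"; none of this is stated on this page. The class `OutcomeDecodingSketch` and its `decode` method are hypothetical names, and spans are represented as `{start, end}` pairs with an exclusive end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OutcomeDecodingSketch {

    // Assumed label values; the real constant values are not shown on this page.
    static final String START = "start";
    static final String CONTINUE = "cont";
    static final String OTHER = "other";

    /** Converts one outcome per token into {start, end} pairs (end exclusive). */
    static List<int[]> decode(String[] outcomes) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // index of the first token of the open name, or -1 if none
        for (int i = 0; i < outcomes.length; i++) {
            String outcome = outcomes[i];
            if (outcome.endsWith(START)) {
                // A new name begins; close any name that is still open.
                if (start != -1) {
                    spans.add(new int[] {start, i});
                }
                start = i;
            } else if (outcome.endsWith(CONTINUE)) {
                // The open name extends over this token; a stray CONTINUE is ignored.
            } else {
                // Any other outcome (for example OTHER) closes the open name.
                if (start != -1) {
                    spans.add(new int[] {start, i});
                    start = -1;
                }
            }
        }
        if (start != -1) {
            spans.add(new int[] {start, outcomes.length});
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] outcomes = {"other", "person-start", "person-cont", "other", "location-start"};
        for (int[] span : decode(outcomes)) {
            System.out.println(Arrays.toString(span));
        }
    }
}
```

The exclusive end index follows the convention of Span, so the number of tokens in a name is `end - start`.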
- clearAdaptiveData
  
  public void clearAdaptiveData()
  
  Description copied from interface: TokenNameFinder
  
  Forgets all adaptive data which was collected during previous calls to one of the find methods.
  Note: This method should typically be called at the end of the processing of a document.
  
  Specified by:
  
  clearAdaptiveData in interface TokenNameFinder
- probs
  
  public void probs(double[] probs)
  
  Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]). The specified array should be at least as large as the number of tokens in the previous call to find(String[]).
  
  Parameters:
  
  probs - An array with the probabilities of the last decoded sequence.
- probs
  
  public double[] probs()
  
  Retrieves the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]).
  
  Returns:
  
  An array with one probability for each token passed to find(String[]) when it was last called.
- probs
  
  public double[] probs(Span[] spans)
  
  Retrieves a probability for each of the specified spans, computed as the arithmetic mean of the probabilities of the outcomes which make up the span.
  
  Parameters:
  
  spans - The spans of the names for which probabilities are requested.
  
  Returns:
  
  An array of probabilities for each of the specified spans.
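
The arithmetic mean described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `SpanProbabilitySketch`, `TokenSpan` and `spanProbs` are hypothetical names, `TokenSpan` is a stand-in for Span with an inclusive start and exclusive end, and `tokenProbs` plays the role of the per-token probabilities of the last decoded sequence.

```java
import java.util.Arrays;

class SpanProbabilitySketch {

    /** Hypothetical stand-in for a token span: start inclusive, end exclusive. */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns, for each span, the mean of the token probabilities it covers. */
    static double[] spanProbs(double[] tokenProbs, TokenSpan[] spans) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            double sum = 0.0;
            for (int t = spans[i].start; t < spans[i].end; t++) {
                sum += tokenProbs[t];
            }
            // Divide by the number of tokens in the span.
            result[i] = sum / (spans[i].end - spans[i].start);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] tokenProbs = {0.9, 0.8, 0.6, 0.99};
        TokenSpan[] spans = {new TokenSpan(0, 2), new TokenSpan(3, 4)};
        System.out.println(Arrays.toString(spanProbs(tokenProbs, spans)));
    }
}
```

A span covering a single token simply receives that token's probability, while a longer span is pulled down by any low-confidence token inside it.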
- train
  
  public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters params, TokenNameFinderFactory factory) throws IOException
  
  Starts a training of a TokenNameFinderModel with the given parameters.
  
  Parameters:
  
  languageCode - The ISO-conformant language code.
  
  type - The type to use.
  
  samples - The ObjectStream of NameSample used as input for training.
  
  params - The TrainingParameters for the context of the training.
  
  factory - The TokenNameFinderFactory for creating related objects defined via params.
  
  Returns:
  
  A valid, trained TokenNameFinderModel instance.
  
  Throws:
  
  IOException - Thrown if IO errors occurred during training.
- dropOverlappingSpans
  
  public static Span[] dropOverlappingSpans(Span[] spans)
  
  Removes spans which are intersecting or crossing in any way.
  The following rules are used to remove the spans:
  - Identical spans: The first span in the array after sorting it remains.
  - Intersecting spans: The first span after sorting remains.
  - Contained spans: All spans which are contained by another are removed.
  
  Parameters:
  
  spans - The input spans.
  
  Returns:
  
  The resulting non-overlapping spans.
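
The three rules above can be illustrated with a short sketch. It is not the library's implementation: `DropOverlapSketch`, `TokenSpan` and `dropOverlapping` are hypothetical names, and the sort order (earlier start first, then the longer span first for equal starts) is an assumption chosen so that a containing span sorts ahead of the spans it contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DropOverlapSketch {

    /** Hypothetical stand-in for a token span: start inclusive, end exclusive. */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ")";
        }
    }

    /** Keeps the first span after sorting and drops every span overlapping a kept one. */
    static List<TokenSpan> dropOverlapping(List<TokenSpan> spans) {
        List<TokenSpan> sorted = new ArrayList<>(spans);
        // Assumed ordering: by start ascending, then by end descending.
        sorted.sort(Comparator.comparingInt((TokenSpan s) -> s.start)
            .thenComparing(Comparator.comparingInt((TokenSpan s) -> s.end).reversed()));

        List<TokenSpan> kept = new ArrayList<>();
        TokenSpan last = null;
        for (TokenSpan span : sorted) {
            // A span survives only if it starts at or after the end of the last kept span.
            if (last == null || span.start >= last.end) {
                kept.add(span);
                last = span;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TokenSpan> input = Arrays.asList(
            new TokenSpan(4, 8), new TokenSpan(0, 2), new TokenSpan(1, 3),
            new TokenSpan(0, 2), new TokenSpan(5, 6), new TokenSpan(8, 9));
        System.out.println(dropOverlapping(input));
    }
}
```

Because the end index is exclusive, two spans that merely touch (one ends where the next starts) do not overlap and are both kept.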
