Package opennlp.tools.namefind
Class NameFinderME
- java.lang.Object
-
- opennlp.tools.namefind.NameFinderME
-
- All Implemented Interfaces:
TokenNameFinder
public class NameFinderME extends Object implements TokenNameFinder
A maximum-entropy-based name finder implementation.
-
-
Constructor Summary
NameFinderME(TokenNameFinderModel model)
Initializes a NameFinderME with a TokenNameFinderModel.
-
Method Summary
void clearAdaptiveData()
Forgets all adaptive data which was collected during previous calls to one of the find methods.
static Span[] dropOverlappingSpans(Span[] spans)
Removes spans which are intersecting or crossing in any way.
Span[] find(String[] tokens)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
Span[] find(String[] tokens, String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
double[] probs()
Retrieves the probabilities of the last decoded sequence.
void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence.
double[] probs(Span[] spans)
Retrieves a probability for each of the specified spans, computed as the arithmetic mean of the probabilities of the outcomes which make up the span.
static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters params, TokenNameFinderFactory factory)
Trains a TokenNameFinderModel with the given parameters.
-
-
-
Field Detail
-
DEFAULT_BEAM_SIZE
public static final int DEFAULT_BEAM_SIZE
- See Also:
- Constant Field Values
-
START
public static final String START
- See Also:
- Constant Field Values
-
CONTINUE
public static final String CONTINUE
- See Also:
- Constant Field Values
-
OTHER
public static final String OTHER
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
NameFinderME
public NameFinderME(TokenNameFinderModel model)
Initializes a NameFinderME with a TokenNameFinderModel.
- Parameters:
model - The TokenNameFinderModel to initialize with.
-
-
Method Detail
-
find
public Span[] find(String[] tokens)
Description copied from interface: TokenNameFinder
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
- Specified by:
find in interface TokenNameFinder
- Parameters:
tokens - An array of the tokens or words of the sequence, typically a sentence.
- Returns:
An array of spans for each of the names identified.
-
find
public Span[] find(String[] tokens, String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
- Parameters:
tokens - An array of the tokens or words of a sequence, typically a sentence.
additionalContext - Features which are based on context outside of the sentence but which should also be used.
- Returns:
An array of token spans for each of the names identified.
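To illustrate what a token span refers to, here is a minimal sketch that maps a span back to the tokens it covers. It does not use the OpenNLP classes: spans are modeled as hypothetical `int[]{start, end}` pairs of token indices, and the end-exclusive convention, the class name `TokenSpanDemo`, and the sample sentence are all assumptions of this example.

```java
import java.util.Arrays;

final class TokenSpanDemo {

    // Joins the tokens covered by the span [start, end) into one string.
    // Assumption of this sketch: start is inclusive, end is exclusive.
    static String coveredText(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        String[] tokens = {"Pierre", "Vinken", "joined", "the", "board", "."};
        // Hypothetical finder result: one name covering tokens 0 and 1.
        int[][] spans = {{0, 2}};
        for (int[] span : spans) {
            System.out.println(coveredText(tokens, span[0], span[1]));
        }
    }
}
```

The point of the sketch is that the returned spans index into the token array passed to find, not into the characters of the original sentence.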
-
clearAdaptiveData
public void clearAdaptiveData()
Description copied from interface: TokenNameFinder
Forgets all adaptive data which was collected during previous calls to one of the find methods.
Note: This method should typically be called at the end of the processing of a document.
- Specified by:
clearAdaptiveData in interface TokenNameFinder
-
probs
public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]). The specified array should be at least as large as the number of tokens in the previous call to find(String[]).
- Parameters:
probs - The array to be populated with the probabilities of the last decoded sequence.
-
probs
public double[] probs()
Retrieves the probabilities of the last decoded sequence. The sequence was determined based on the previous call to find(String[]).
- Returns:
An array with one probability for each token passed to find(String[]) when it was last called.
-
probs
public double[] probs(Span[] spans)
Retrieves a probability for each of the specified spans, computed as the arithmetic mean of the probabilities of the outcomes which make up the span.
- Parameters:
spans - The spans of the names for which probabilities are requested.
- Returns:
An array of probabilities for each of the specified spans.
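The averaging described above can be sketched in a few lines. This is an illustration of the arithmetic only, not the library's code: spans are modeled as hypothetical `int[]{start, end}` pairs with an assumed exclusive end index, and `SpanProbabilitySketch` and its method names are invented for the example.

```java
import java.util.Arrays;

final class SpanProbabilitySketch {

    // Arithmetic mean of the per-token probabilities covered by [start, end).
    static double meanProbability(double[] tokenProbs, int start, int end) {
        double sum = 0.0;
        for (int i = start; i < end; i++) {
            sum += tokenProbs[i];
        }
        return sum / (end - start);
    }

    // One mean probability per span, in the order the spans were given.
    static double[] spanProbabilities(double[] tokenProbs, int[][] spans) {
        double[] result = new double[spans.length];
        for (int s = 0; s < spans.length; s++) {
            result[s] = meanProbability(tokenProbs, spans[s][0], spans[s][1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical per-token probabilities of the last decoded sequence.
        double[] tokenProbs = {0.5, 0.25, 1.0, 0.75};
        int[][] spans = {{0, 2}, {3, 4}};
        System.out.println(Arrays.toString(spanProbabilities(tokenProbs, spans)));
    }
}
```

A consequence of using the mean is that one low-confidence token lowers the score of a long span less than it would lower the score of a short one.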
-
train
public static TokenNameFinderModel train(String languageCode, String type, ObjectStream<NameSample> samples, TrainingParameters params, TokenNameFinderFactory factory) throws IOException
Trains a TokenNameFinderModel with the given parameters.
- Parameters:
languageCode - The ISO-conformant language code.
type - The type to use.
samples - The ObjectStream of NameSample used as input for training.
params - The TrainingParameters for the context of the training.
factory - The TokenNameFinderFactory for creating related objects defined via params.
- Returns:
A valid, trained TokenNameFinderModel instance.
- Throws:
IOException - Thrown if IO errors occurred during training.
-
dropOverlappingSpans
public static Span[] dropOverlappingSpans(Span[] spans)
Removes spans which are intersecting or crossing in any way.
The following rules are used to remove the spans:
- Identical spans: The first span in the array after sorting it remains.
- Intersecting spans: The first span after sorting remains.
- Contained spans: All spans which are contained by another are removed.
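The rules above can be sketched as a sort followed by a single pass that keeps a span only if it does not intersect the last span kept. This is a rough illustration, not the library's implementation: spans are hypothetical `int[]{start, end}` pairs with an assumed exclusive end, and the sort order used here (start ascending, longer span first on ties) is an assumption chosen so that a containing span precedes the spans it contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DropOverlapSketch {

    static int[][] dropOverlapping(int[][] spans) {
        int[][] sorted = spans.clone();
        // Assumed order: start ascending; on equal start, the longer span first.
        Arrays.sort(sorted, (a, b) -> a[0] != b[0]
                ? Integer.compare(a[0], b[0])
                : Integer.compare(b[1], a[1]));

        List<int[]> kept = new ArrayList<>();
        int[] last = null;
        for (int[] span : sorted) {
            boolean overlaps = last != null
                    && span[0] < last[1] && last[0] < span[1];
            if (overlaps) {
                continue; // identical, intersecting, or contained: drop it
            }
            kept.add(span);
            last = span;
        }
        return kept.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        int[][] spans = {{1, 3}, {0, 2}, {0, 2}, {5, 6}, {4, 8}, {8, 9}};
        // prints [[0, 2], [4, 8], [8, 9]]
        System.out.println(Arrays.deepToString(dropOverlapping(spans)));
    }
}
```

In the sample input, the duplicate of (0, 2) and the intersecting (1, 3) are dropped, (5, 6) is dropped because (4, 8) contains it, and (8, 9) survives because it only touches the end of (4, 8).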
-
-