Package opennlp.tools.namefind
Class NameFinderEventStream
All Implemented Interfaces: AutoCloseable, ObjectStream<Event>
Class for creating an event stream out of data files for training a TokenNameFinder.
Constructor Summary
NameFinderEventStream(ObjectStream<NameSample> dataStream, String type, NameContextGenerator contextGenerator, SequenceCodec<String> codec)
Method Summary
static String[][] additionalContext(String[] tokens, Map<String, String> prevMap)
    Generates previous decision features for each token based on the contents of the specified prevMap.
static List<Event> generateEvents(String[] sentence, String[] outcomes, NameContextGenerator cg)
    Generates events for each token in a sentence with the specified outcomes using the specified NameContextGenerator.
static String[] generateOutcomes(Span[] names, String type, int length)
    Deprecated.
Methods inherited from class opennlp.tools.util.AbstractEventStream
close, read, reset
Constructor Details
NameFinderEventStream
public NameFinderEventStream(ObjectStream<NameSample> dataStream, String type, NameContextGenerator contextGenerator, SequenceCodec<String> codec)
Parameters:
dataStream - The data stream of events.
type - null, or a value that overrides the type parameter in the provided samples.
contextGenerator - The NameContextGenerator used to generate features for the event stream.
codec - The SequenceCodec to use.
Method Details
generateOutcomes
public static String[] generateOutcomes(Span[] names, String type, int length)
Deprecated. Use the BioCodec implementation of the SequenceValidator instead.
Generates the name tag outcomes (start, continue, other) for each token in a sentence with the specified length using the specified names.
Parameters:
names - Token spans for each of the names.
type - null, or a value that overrides the type parameter in the provided samples.
length - The length of the sentence.
Returns:
An array of start, continue, other outcomes based on the specified names and sentence length.
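The start/continue/other labeling described for generateOutcomes can be illustrated with a minimal, library-free sketch. Everything in it is an assumption made for illustration: OutcomeSketch, TokenSpan and label are hypothetical names, spans are taken to be start-inclusive and end-exclusive, the type parameter is left out, and the literal label strings are copied from the description above, so the strings the library actually emits may differ.

```java
import java.util.Arrays;

// Illustrative sketch only; this is not the OpenNLP implementation.
class OutcomeSketch {

    // Hypothetical stand-in for a token span: start inclusive, end exclusive.
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Labels every token "other", then marks the first token of each span
    // "start" and the remaining tokens of that span "continue".
    static String[] label(TokenSpan[] names, int length) {
        String[] outcomes = new String[length];
        Arrays.fill(outcomes, "other");
        for (TokenSpan name : names) {
            outcomes[name.start] = "start";
            for (int i = name.start + 1; i < name.end; i++) {
                outcomes[i] = "continue";
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        // Five tokens, e.g. "Pierre Vinken joined the board"; tokens 0-1 form one name.
        TokenSpan[] names = { new TokenSpan(0, 2) };
        // prints [start, continue, other, other, other]
        System.out.println(Arrays.toString(label(names, 5)));
    }
}
```

The array returned has one entry per token, which matches the "length" parameter described above.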
generateEvents
public static List<Event> generateEvents(String[] sentence, String[] outcomes, NameContextGenerator cg)
Generates events for each token in a sentence with the specified outcomes using the specified NameContextGenerator.
Parameters:
sentence - The tokens representing a sentence.
outcomes - An array of outcomes.
cg - The NameContextGenerator to use.
Returns:
A list of events generated.
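As a rough picture of what "one event per token" means, the sketch below pairs each token's outcome with the features a context generator produces for that token. It does not use OpenNLP types: EventSketch, SketchEvent and SketchContextGenerator are hypothetical stand-ins, and the two-argument contextFor method is a simplification assumed for this example rather than the NameContextGenerator contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; this is not the OpenNLP implementation.
class EventSketch {

    // Hypothetical stand-in for an event: an outcome plus its context features.
    static final class SketchEvent {
        final String outcome;
        final String[] context;

        SketchEvent(String outcome, String[] context) {
            this.outcome = outcome;
            this.context = context;
        }
    }

    // Hypothetical, simplified stand-in for a context generator.
    interface SketchContextGenerator {
        String[] contextFor(int index, String[] tokens);
    }

    // Builds one event per token: the token's outcome and its generated features.
    static List<SketchEvent> generate(String[] sentence, String[] outcomes,
                                      SketchContextGenerator cg) {
        if (sentence.length != outcomes.length) {
            throw new IllegalArgumentException("one outcome per token is required");
        }
        List<SketchEvent> events = new ArrayList<>(sentence.length);
        for (int i = 0; i < sentence.length; i++) {
            events.add(new SketchEvent(outcomes[i], cg.contextFor(i, sentence)));
        }
        return events;
    }

    public static void main(String[] args) {
        String[] sentence = { "Vinken", "runs" };
        String[] outcomes = { "start", "other" };
        // Toy features: the current word and the previous word.
        SketchContextGenerator cg = (i, toks) -> new String[] {
            "w=" + toks[i],
            "prev=" + (i > 0 ? toks[i - 1] : "<s>")
        };
        for (SketchEvent e : generate(sentence, outcomes, cg)) {
            System.out.println(e.outcome + " " + Arrays.toString(e.context));
        }
    }
}
```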
additionalContext
public static String[][] additionalContext(String[] tokens, Map<String, String> prevMap)
Generates previous decision features for each token based on the contents of the specified prevMap.
Parameters:
tokens - The tokens for which the context is generated.
prevMap - A mapping of tokens to their previous decisions.
Returns:
A 2-dimensional array with additional context with features for each token.
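The idea of a "previous decision" feature can be sketched without the library: each token is looked up in the map and the result becomes a single feature string for that token. The class and method names below are hypothetical, and both the "pd=" prefix and the one-feature-per-token shape are assumptions of this sketch, not a statement of what the library returns.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; this is not the OpenNLP implementation.
class PrevDecisionSketch {

    // Returns one row per token; each row holds a single feature that encodes
    // the outcome previously assigned to that token, if any.
    static String[][] previousDecisionFeatures(String[] tokens, Map<String, String> prevMap) {
        String[][] features = new String[tokens.length][1];
        for (int i = 0; i < tokens.length; i++) {
            // A token missing from the map yields "pd=null" in this sketch.
            features[i][0] = "pd=" + prevMap.get(tokens[i]);
        }
        return features;
    }

    public static void main(String[] args) {
        Map<String, String> prevMap = new HashMap<>();
        prevMap.put("Vinken", "start");
        String[][] features =
            previousDecisionFeatures(new String[] { "Mr.", "Vinken" }, prevMap);
        // prints [[pd=null], [pd=start]]
        System.out.println(Arrays.deepToString(features));
    }
}
```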