opennlp.tools.chunker
Class ChunkerME

java.lang.Object
  extended by opennlp.tools.chunker.ChunkerME
All Implemented Interfaces:
Chunker

public class ChunkerME
extends Object
implements Chunker

This class represents a maximum-entropy-based chunker. Such a chunker can be used to find flat structures, such as noun phrases or named entities, in sequence inputs.


Field Summary
static int DEFAULT_BEAM_SIZE
           
 
Constructor Summary
ChunkerME(ChunkerModel model)
          Initializes the current instance with the specified model.
ChunkerME(ChunkerModel model, int beamSize)
          Initializes the current instance with the specified model and the specified beam size.
ChunkerME(ChunkerModel model, int beamSize, SequenceValidator<String> sequenceValidator)
          Deprecated. Use ChunkerME(ChunkerModel, int) instead and use the ChunkerFactory to configure the SequenceValidator.
ChunkerME(ChunkerModel model, int beamSize, SequenceValidator<String> sequenceValidator, ChunkerContextGenerator contextGenerator)
          Deprecated. Use ChunkerME(ChunkerModel, int) instead and use the ChunkerFactory to configure the SequenceValidator and ChunkerContextGenerator.
ChunkerME(opennlp.model.MaxentModel mod)
          Deprecated. 
ChunkerME(opennlp.model.MaxentModel mod, ChunkerContextGenerator cg)
          Deprecated. 
ChunkerME(opennlp.model.MaxentModel mod, ChunkerContextGenerator cg, int beamSize)
          Deprecated. 
 
Method Summary
 List<String> chunk(List<String> toks, List<String> tags)
          Deprecated. 
 String[] chunk(String[] toks, String[] tags)
          Generates chunk tags for the given sequence returning the result in an array.
 Span[] chunkAsSpans(String[] toks, String[] tags)
          Generates tagged chunk spans for the given sequence returning the result in a span array.
 double[] probs()
          Returns an array with the probabilities of the last decoded sequence.
 void probs(double[] probs)
          Populates the specified array with the probabilities of the last decoded sequence.
 Sequence[] topKSequences(List<String> sentence, List<String> tags)
          Deprecated. 
 Sequence[] topKSequences(String[] sentence, String[] tags)
          Returns the top k chunk sequences for the specified sentence with the specified pos-tags.
 Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)
          Returns the top k chunk sequences for the specified sentence with the specified pos-tags.
static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, ChunkerContextGenerator contextGenerator, TrainingParameters mlParams)
          Deprecated. Use train(String, ObjectStream, TrainingParameters, ChunkerFactory) instead.
static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, int cutoff, int iterations)
          Deprecated. Use train(String, ObjectStream, ChunkerContextGenerator, TrainingParameters) instead and pass in a TrainingParameters object.
static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, int cutoff, int iterations, ChunkerContextGenerator contextGenerator)
          Deprecated. Use train(String, ObjectStream, ChunkerContextGenerator, TrainingParameters) instead and pass in a TrainingParameters object.
static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, TrainingParameters mlParams, ChunkerFactory factory)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_BEAM_SIZE

public static final int DEFAULT_BEAM_SIZE
See Also:
Constant Field Values
Constructor Detail

ChunkerME

public ChunkerME(ChunkerModel model,
                 int beamSize,
                 SequenceValidator<String> sequenceValidator,
                 ChunkerContextGenerator contextGenerator)
Deprecated. Use ChunkerME(ChunkerModel, int) instead and use the ChunkerFactory to configure the SequenceValidator and ChunkerContextGenerator.

Initializes the current instance with the specified model and the specified beam size.

Parameters:
model - The model for this chunker.
beamSize - The size of the beam that should be used when decoding sequences.
sequenceValidator - The SequenceValidator that determines whether an outcome is valid for the preceding sequence. This can be used to implement constraints on which sequences are valid.
contextGenerator - The context generator to be used by the specified model.

ChunkerME

public ChunkerME(ChunkerModel model,
                 int beamSize,
                 SequenceValidator<String> sequenceValidator)
Deprecated. Use ChunkerME(ChunkerModel, int) instead and use the ChunkerFactory to configure the SequenceValidator.

Initializes the current instance with the specified model and the specified beam size.

Parameters:
model - The model for this chunker.
beamSize - The size of the beam that should be used when decoding sequences.
sequenceValidator - The SequenceValidator that determines whether an outcome is valid for the preceding sequence. This can be used to implement constraints on which sequences are valid.
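
To illustrate the kind of constraint a sequence validator can express, here is a minimal sketch. It assumes chunk tags follow the common B-X / I-X / O scheme, and it declares its own TagSequenceValidator interface as a hypothetical stand-in so the example compiles without the library. The interface, class, and parameter names are illustrative and are not the OpenNLP types.

```java
/** Hypothetical interface mirroring the shape of a sequence validator; not the OpenNLP type. */
interface TagSequenceValidator {
    boolean validSequence(int i, String[] inputSequence, String[] outcomesSoFar, String outcome);
}

/** Rejects an "I-X" outcome unless the previous outcome opened or continued a chunk of type X. */
class BioValidatorSketch implements TagSequenceValidator {
    @Override
    public boolean validSequence(int i, String[] inputSequence, String[] outcomesSoFar, String outcome) {
        if (!outcome.startsWith("I-")) {
            return true; // "B-X" and "O" may follow any outcome
        }
        if (outcomesSoFar.length == 0) {
            return false; // nothing to continue at the start of the sequence
        }
        String previous = outcomesSoFar[outcomesSoFar.length - 1];
        boolean previousInChunk = previous.startsWith("B-") || previous.startsWith("I-");
        // The chunk type after the two-character prefix must match.
        return previousInChunk && previous.substring(2).equals(outcome.substring(2));
    }

    public static void main(String[] args) {
        TagSequenceValidator validator = new BioValidatorSketch();
        String[] tokens = {"The", "dog", "barks"};
        // "I-NP" may continue a chunk opened by "B-NP".
        System.out.println(validator.validSequence(1, tokens, new String[] {"B-NP"}, "I-NP")); // true
        // "I-NP" may not follow "O".
        System.out.println(validator.validSequence(1, tokens, new String[] {"O"}, "I-NP")); // false
    }
}
```

A decoder that consults such a validator before extending a candidate sequence never emits a chunk continuation without a matching opening tag.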

ChunkerME

public ChunkerME(ChunkerModel model,
                 int beamSize)
Initializes the current instance with the specified model and the specified beam size.

Parameters:
model - The model for this chunker.
beamSize - The size of the beam that should be used when decoding sequences.

ChunkerME

public ChunkerME(ChunkerModel model)
Initializes the current instance with the specified model. The default beam size is used.

Parameters:
model - The model for this chunker.

ChunkerME

@Deprecated
public ChunkerME(opennlp.model.MaxentModel mod)
Deprecated. 

Creates a chunker using the specified model.

Parameters:
mod - The maximum entropy model for this chunker.

ChunkerME

@Deprecated
public ChunkerME(opennlp.model.MaxentModel mod,
                            ChunkerContextGenerator cg)
Deprecated. 

Creates a chunker using the specified model and context generator.

Parameters:
mod - The maximum entropy model for this chunker.
cg - The context generator to be used by the specified model.

ChunkerME

@Deprecated
public ChunkerME(opennlp.model.MaxentModel mod,
                            ChunkerContextGenerator cg,
                            int beamSize)
Deprecated. 

Creates a chunker using the specified model and context generator and decodes the model using a beam search of the specified size.

Parameters:
mod - The maximum entropy model for this chunker.
cg - The context generator to be used by the specified model.
beamSize - The size of the beam that should be used when decoding sequences.
Method Detail

chunk

@Deprecated
public List<String> chunk(List<String> toks,
                                     List<String> tags)
Deprecated. 

Description copied from interface: Chunker
Generates chunk tags for the given sequence returning the result in a list.

Specified by:
chunk in interface Chunker
Parameters:
toks - a list of the tokens or words of the sequence.
tags - a list of the pos tags of the sequence.
Returns:
a list of chunk tags for each token in the sequence.

chunk

public String[] chunk(String[] toks,
                      String[] tags)
Description copied from interface: Chunker
Generates chunk tags for the given sequence returning the result in an array.

Specified by:
chunk in interface Chunker
Parameters:
toks - an array of the tokens or words of the sequence.
tags - an array of the pos tags of the sequence.
Returns:
an array of chunk tags for each token in the sequence.

chunkAsSpans

public Span[] chunkAsSpans(String[] toks,
                           String[] tags)
Description copied from interface: Chunker
Generates tagged chunk spans for the given sequence returning the result in a span array.

Specified by:
chunkAsSpans in interface Chunker
Parameters:
toks - an array of the tokens or words of the sequence.
tags - an array of the pos tags of the sequence.
Returns:
an array of spans with chunk tags for each chunk in the sequence.
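
The relationship between per-token chunk tags and chunk spans can be sketched as follows. This is an illustration only, assuming B-X / I-X / O chunk tags and half-open token index ranges. TypedSpan and toSpans are hypothetical names introduced for the example and do not reproduce the library's Span class or its conversion logic.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSpanSketch {
    /** Hypothetical span: token indices [start, end) plus the chunk type. */
    static final class TypedSpan {
        final int start;
        final int end;
        final String type;

        TypedSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /** Groups B-X / I-X runs into spans; a stray "I-X" is treated as opening a new chunk. */
    static List<TypedSpan> toSpans(String[] chunkTags) {
        List<TypedSpan> spans = new ArrayList<>();
        int start = -1;     // index where the open chunk began, or -1 if none is open
        String type = null; // type of the open chunk
        for (int i = 0; i < chunkTags.length; i++) {
            String tag = chunkTags[i];
            boolean continues = start >= 0 && tag.startsWith("I-") && tag.substring(2).equals(type);
            if (continues) {
                continue;
            }
            if (start >= 0) {
                spans.add(new TypedSpan(start, i, type)); // close the open chunk
                start = -1;
                type = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                start = i; // open a new chunk
                type = tag.substring(2);
            }
        }
        if (start >= 0) {
            spans.add(new TypedSpan(start, chunkTags.length, type));
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] chunkTags = {"B-NP", "I-NP", "B-VP", "O", "B-NP"};
        for (TypedSpan span : toSpans(chunkTags)) {
            System.out.println(span);
        }
    }
}
```

For the tags in main, the sketch yields three spans: tokens 0 to 1 as NP, token 2 as VP, and token 4 as NP. The token tagged O belongs to no span.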

topKSequences

@Deprecated
public Sequence[] topKSequences(List<String> sentence,
                                           List<String> tags)
Deprecated. 

Description copied from interface: Chunker
Returns the top k chunk sequences for the specified sentence with the specified pos-tags.

Specified by:
topKSequences in interface Chunker
Parameters:
sentence - The tokens of the sentence.
tags - The pos-tags for the specified sentence.
Returns:
the top k chunk sequences for the specified sentence.

topKSequences

public Sequence[] topKSequences(String[] sentence,
                                String[] tags)
Description copied from interface: Chunker
Returns the top k chunk sequences for the specified sentence with the specified pos-tags.

Specified by:
topKSequences in interface Chunker
Parameters:
sentence - The tokens of the sentence.
tags - The pos-tags for the specified sentence.
Returns:
the top k chunk sequences for the specified sentence.

topKSequences

public Sequence[] topKSequences(String[] sentence,
                                String[] tags,
                                double minSequenceScore)
Description copied from interface: Chunker
Returns the top k chunk sequences for the specified sentence with the specified pos-tags.

Specified by:
topKSequences in interface Chunker
Parameters:
sentence - The tokens of the sentence.
tags - The pos-tags for the specified sentence.
minSequenceScore - A lower bound on the score of a returned sequence.
Returns:
the top k chunk sequences for the specified sentence.
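
How a beam size and a minimum sequence score interact when collecting the top k sequences can be sketched as below. The sketch makes simplifying assumptions that do not come from this page: each position has fixed outcome probabilities that do not depend on earlier outcomes, a sequence is scored by the sum of the logarithms of its outcome probabilities, and no sequence validator is applied. BeamSearchSketch, Candidate, and topK are hypothetical names, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class BeamSearchSketch {
    /** Hypothetical candidate: the outcomes chosen so far and their summed log probability. */
    static final class Candidate {
        final List<String> outcomes;
        final double score;

        Candidate(List<String> outcomes, double score) {
            this.outcomes = outcomes;
            this.score = score;
        }
    }

    /**
     * Keeps at most beamSize candidates per position and drops any candidate whose
     * score falls below minSequenceScore. probs[i][k] is the probability of labels[k]
     * at position i. Results are ordered best first.
     */
    static List<Candidate> topK(String[] labels, double[][] probs, int beamSize, double minSequenceScore) {
        List<Candidate> beam = new ArrayList<>();
        beam.add(new Candidate(new ArrayList<>(), 0.0));
        for (double[] position : probs) {
            List<Candidate> next = new ArrayList<>();
            for (Candidate candidate : beam) {
                for (int k = 0; k < labels.length; k++) {
                    double score = candidate.score + Math.log(position[k]);
                    if (score >= minSequenceScore) {
                        List<String> extended = new ArrayList<>(candidate.outcomes);
                        extended.add(labels[k]);
                        next.add(new Candidate(extended, score));
                    }
                }
            }
            // Sort best first, then cut the list down to the beam size.
            next.sort(Comparator.comparingDouble((Candidate c) -> c.score).reversed());
            beam = new ArrayList<>(next.subList(0, Math.min(beamSize, next.size())));
        }
        return beam;
    }

    public static void main(String[] args) {
        String[] labels = {"B-NP", "I-NP", "O"};
        double[][] probs = {{0.7, 0.1, 0.2}, {0.3, 0.6, 0.1}};
        for (Candidate c : topK(labels, probs, 2, Double.NEGATIVE_INFINITY)) {
            System.out.println(c.outcomes + " " + Math.exp(c.score));
        }
    }
}
```

Because log probabilities are never positive, a candidate's score can only fall as it grows, so pruning partial candidates against the lower bound never discards a sequence that would have met the bound once complete.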

probs

public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk. The specified array should be at least as large as the number of tokens in the previous call to chunk.

Parameters:
probs - An array used to hold the probabilities of the last decoded sequence.

probs

public double[] probs()
Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk.

Returns:
An array with one probability for each token passed to chunk when it was last called.
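
The difference between the two probs variants, one filling a caller-supplied array and one returning a new array, can be sketched with a small stand-in class. LastSequenceProbsSketch and its decode method are hypothetical. In particular, throwing IllegalArgumentException for an array that is too small is a choice made for this example and is not documented behavior of ChunkerME.

```java
class LastSequenceProbsSketch {
    // Probabilities remembered from the most recent decoding call.
    private double[] lastProbs = new double[0];

    /** Stands in for a decoding call: remembers one probability per token. */
    void decode(double[] perTokenProbs) {
        lastProbs = perTokenProbs.clone();
    }

    /** Allocating variant: returns a new array with one probability per token. */
    double[] probs() {
        return lastProbs.clone();
    }

    /** Filling variant: copies into a caller-supplied array, which may be longer than needed. */
    void probs(double[] target) {
        if (target.length < lastProbs.length) {
            throw new IllegalArgumentException(
                "target holds " + target.length + " values, need " + lastProbs.length);
        }
        System.arraycopy(lastProbs, 0, target, 0, lastProbs.length);
    }

    public static void main(String[] args) {
        LastSequenceProbsSketch sketch = new LastSequenceProbsSketch();
        sketch.decode(new double[] {0.9, 0.8, 0.7});

        double[] reusable = new double[16]; // one buffer reused across sentences
        sketch.probs(reusable);
        System.out.println(reusable[0] + " " + reusable[1] + " " + reusable[2]);
        System.out.println(sketch.probs().length);
    }
}
```

The filling variant avoids allocating a new array for every sentence, at the cost of the caller tracking how many leading entries are meaningful.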

train

public static ChunkerModel train(String lang,
                                 ObjectStream<ChunkSample> in,
                                 TrainingParameters mlParams,
                                 ChunkerFactory factory)
                          throws IOException
Throws:
IOException

train

public static ChunkerModel train(String lang,
                                 ObjectStream<ChunkSample> in,
                                 ChunkerContextGenerator contextGenerator,
                                 TrainingParameters mlParams)
                          throws IOException
Deprecated. Use train(String, ObjectStream, TrainingParameters, ChunkerFactory) instead.

Throws:
IOException

train

public static ChunkerModel train(String lang,
                                 ObjectStream<ChunkSample> in,
                                 int cutoff,
                                 int iterations,
                                 ChunkerContextGenerator contextGenerator)
                          throws IOException
Deprecated. Use train(String, ObjectStream, ChunkerContextGenerator, TrainingParameters) instead and pass in a TrainingParameters object.

Throws:
IOException

train

@Deprecated
public static ChunkerModel train(String lang,
                                            ObjectStream<ChunkSample> in,
                                            int cutoff,
                                            int iterations)
                          throws IOException,
                                 ObjectStreamException
Deprecated. Use train(String, ObjectStream, ChunkerContextGenerator, TrainingParameters) instead and pass in a TrainingParameters object.

Trains a new model for the ChunkerME.

Parameters:
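in - The stream of chunk samples to train on.
cutoff - The minimum number of times a feature must occur in the training data to be included in the model.
iterations - The number of training iterations.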
Returns:
the new model
Throws:
IOException
ObjectStreamException


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.