Class ChunkerME

java.lang.Object
opennlp.tools.chunker.ChunkerME
All Implemented Interfaces:
Chunker

public class ChunkerME extends Object implements Chunker
A maximum-entropy-based Chunker. It finds flat structures, such as noun phrases or named entities, in sequence inputs.
  • Field Details

  • Constructor Details

    • ChunkerME

      public ChunkerME(String language) throws IOException
      Initializes the Chunker by downloading a default model.
      Parameters:
      language - The language of the model.
      Throws:
      IOException - Thrown if the model cannot be downloaded or saved.
    • ChunkerME

      public ChunkerME(ChunkerModel model)
      Initializes the current instance with the specified ChunkerModel. The DEFAULT_BEAM_SIZE is used.
      Parameters:
      model - A valid model instance.
  • Method Details

    • chunk

      public String[] chunk(String[] toks, String[] tags)
      Description copied from interface: Chunker
      Generates chunk tags for the given sequence returning the result in an array.
      Specified by:
      chunk in interface Chunker
      Parameters:
      toks - an array of the tokens or words of the sequence.
      tags - an array of the POS tags of the sequence.
      Returns:
      an array of chunk tags for each token in the sequence.
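      The parameter and return descriptions imply three parallel arrays: toks, tags, and the returned chunk tags each hold one entry per token. A minimal sketch of lining them up for inspection follows. It uses plain arrays in place of a real model call; the class and method names are hypothetical, and the B-NP / B-VP values are illustrative tags in the BIO style commonly used for chunking, not output confirmed by this page.

```java
final class ChunkColumns {

    // Joins three parallel arrays into one "token POS chunk" line per token.
    // Throws if the arrays are not aligned index-for-index.
    static String format(String[] toks, String[] tags, String[] chunkTags) {
        if (toks.length != tags.length || toks.length != chunkTags.length) {
            throw new IllegalArgumentException(
                "toks, tags and chunkTags must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < toks.length; i++) {
            sb.append(toks[i]).append(' ')
              .append(tags[i]).append(' ')
              .append(chunkTags[i]).append('\n');
        }
        return sb.toString();
    }
}
```

      In real use, the third array would be the value returned by chunk(toks, tags); checking the lengths first catches a tokenizer and tagger that have drifted out of step.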
    • chunkAsSpans

      public Span[] chunkAsSpans(String[] toks, String[] tags)
      Description copied from interface: Chunker
      Generates tagged chunk spans for the given sequence returning the result in a span array.
      Specified by:
      chunkAsSpans in interface Chunker
      Parameters:
      toks - an array of the tokens or words of the sequence.
      tags - an array of the POS tags of the sequence.
      Returns:
      an array of spans with chunk tags for each chunk in the sequence.
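      To illustrate how per-token chunk tags relate to chunk spans, the sketch below groups BIO-style tags into half-open token ranges. This is not the library's implementation: it assumes tags of the form B-X (begin), I-X (inside) and O (outside), and ChunkSpan is a hypothetical stand-in for the Span type returned by this method.

```java
import java.util.ArrayList;
import java.util.List;

final class BioSpans {

    // Hypothetical stand-in for a typed span covering tokens [start, end).
    static final class ChunkSpan {
        final int start;
        final int end;
        final String type;

        ChunkSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Groups BIO chunk tags into spans. An I- tag that does not continue
    // the currently open chunk closes it and is otherwise ignored here.
    static List<ChunkSpan> toSpans(String[] chunkTags) {
        List<ChunkSpan> spans = new ArrayList<>();
        int start = -1;      // index of the first token of the open chunk, or -1
        String type = null;  // type of the open chunk, or null
        for (int i = 0; i < chunkTags.length; i++) {
            String tag = chunkTags[i];
            boolean begins = tag.startsWith("B-");
            boolean continues = tag.startsWith("I-")
                && start >= 0
                && tag.substring(2).equals(type);
            if (!continues && start >= 0) {
                // The open chunk ends just before token i.
                spans.add(new ChunkSpan(start, i, type));
                start = -1;
                type = null;
            }
            if (begins) {
                start = i;
                type = tag.substring(2);
            }
        }
        if (start >= 0) {
            // Close a chunk that runs to the end of the sequence.
            spans.add(new ChunkSpan(start, chunkTags.length, type));
        }
        return spans;
    }
}
```

      For the tags B-NP, I-NP, B-VP, O, B-NP this yields three spans: NP over tokens [0, 2), VP over [2, 3), and NP over [4, 5).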
    • topKSequences

      public Sequence[] topKSequences(String[] sentence, String[] tags)
      Description copied from interface: Chunker
      Returns the top k chunk sequences for the specified sentence with the specified POS tags.
      Specified by:
      topKSequences in interface Chunker
      Parameters:
      sentence - The tokens of the sentence.
      tags - The POS tags for the specified sentence.
      Returns:
      the top k chunk sequences for the specified sentence.
    • topKSequences

      public Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)
      Description copied from interface: Chunker
      Returns the top k chunk sequences for the specified sentence with the specified POS tags.
      Specified by:
      topKSequences in interface Chunker
      Parameters:
      sentence - The tokens of the sentence.
      tags - The POS tags for the specified sentence.
      minSequenceScore - A lower bound on the score of a returned sequence.
      Returns:
      the top k chunk sequences for the specified sentence.
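      This page does not state the scale of minSequenceScore. The sketch below shows how a probability threshold could be turned into such a bound under one assumption that should be checked against the Sequence documentation: that a sequence's score is the sum of the natural logarithms of its per-token probabilities. The class and method names are hypothetical, and the strict comparison is likewise an assumption.

```java
final class SequenceScoreSketch {

    // Assumption: score = sum of ln(p) over the per-token probabilities.
    static double logScore(double[] tokenProbs) {
        double score = 0.0;
        for (double p : tokenProbs) {
            score += Math.log(p);
        }
        return score;
    }

    // True if the sequence would survive a lower bound of minSequenceScore,
    // treating the bound as strict (also an assumption).
    static boolean passes(double[] tokenProbs, double minSequenceScore) {
        return logScore(tokenProbs) > minSequenceScore;
    }
}
```

      Under that assumption, a caller who wants only sequences whose joint probability exceeds 0.5 would pass Math.log(0.5) as minSequenceScore rather than 0.5 itself.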
    • probs

      public void probs(double[] probs)
      Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk. The specified array should be at least as large as the number of tokens in the previous call to chunk.
      Parameters:
      probs - An array used to hold the probabilities of the last decoded sequence.
    • probs

      public double[] probs()
      Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk.
      Returns:
      An array with the same number of probabilities as tokens when chunk(String[], String[]) was last called.
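      One way a caller might use the returned array is to flag tokens whose chunk tag was assigned with low probability. The sketch below works on a plain double[] so it runs without a model; the class name, method name and threshold are hypothetical, and it relies on the array being aligned index-for-index with the tokens of the last chunk call, as described above.

```java
import java.util.ArrayList;
import java.util.List;

final class LowConfidenceTokens {

    // Returns the indices of tokens whose probability is below the threshold.
    static List<Integer> below(double[] probs, double threshold) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            if (probs[i] < threshold) {
                indices.add(i);
            }
        }
        return indices;
    }
}
```

      With probabilities 0.98, 0.42, 0.91, 0.30 and a threshold of 0.5, the flagged indices are 1 and 3. Because the values describe the last decoded sequence, they should be read before the next call to chunk.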
    • train

      public static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, TrainingParameters mlParams, ChunkerFactory factory) throws IOException
      Trains a ChunkerModel with the given parameters.
      Parameters:
      lang - The ISO-conformant language code.
      in - The ObjectStream of ChunkSample used as input for training.
      mlParams - The TrainingParameters for the context of the training.
      factory - The ChunkerFactory for creating related objects defined via mlParams.
      Returns:
      A valid, trained ChunkerModel instance.
      Throws:
      IOException - Thrown if an I/O error occurs.