Class ChunkerME

java.lang.Object
opennlp.tools.chunker.ChunkerME
All Implemented Interfaces:
opennlp.tools.chunker.Chunker, opennlp.tools.ml.Probabilistic

public class ChunkerME extends Object implements opennlp.tools.chunker.Chunker, opennlp.tools.ml.Probabilistic
This class implements a maximum-entropy-based Chunker. A chunker finds flat (non-nested) structures, such as noun phrases or named entities, in a sequence of input tokens.
See Also:
  • Chunker
  • Probabilistic
  • Field Details

  • Constructor Details

    • ChunkerME

      public ChunkerME(String language) throws IOException
      Initializes a Chunker by downloading a default model for the specified language.
      Parameters:
      language - The language of the model.
      Throws:
      IOException - Thrown if the model cannot be downloaded or saved.
    • ChunkerME

      public ChunkerME(ChunkerModel model)
      Initializes a Chunker with the specified ChunkerModel. The DEFAULT_BEAM_SIZE is used.
      Parameters:
      model - A valid model instance.
  • Method Details

    • chunk

      public String[] chunk(String[] toks, String[] tags)
      Specified by:
      chunk in interface opennlp.tools.chunker.Chunker
    • chunkAsSpans

      public opennlp.tools.util.Span[] chunkAsSpans(String[] toks, String[] tags)
      Specified by:
      chunkAsSpans in interface opennlp.tools.chunker.Chunker
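
To illustrate how the tag array returned by chunk relates to the spans returned by chunkAsSpans, here is a minimal, library-free sketch. It assumes the chunk tags follow the usual B-/I-/O convention (for example "B-NP", "I-NP", "O"). The class BioSpans, the nested ChunkSpan type, and the method toSpans are hypothetical names introduced for this illustration only; they are not part of OpenNLP, and the library's own conversion may treat malformed tag sequences differently.

```java
import java.util.ArrayList;
import java.util.List;

public class BioSpans {

    /** Hypothetical stand-in for a span: token range [start, end) plus a chunk type. */
    public static final class ChunkSpan {
        public final int start;
        public final int end;
        public final String type;

        ChunkSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /** Groups B-/I-/O chunk tags into spans. Assumption: "I-X" without an open X chunk is ignored. */
    public static List<ChunkSpan> toSpans(String[] chunkTags) {
        List<ChunkSpan> spans = new ArrayList<>();
        int start = -1;
        String type = null; // type of the chunk currently open, or null if none
        for (int i = 0; i < chunkTags.length; i++) {
            String tag = chunkTags[i];
            boolean begins = tag.startsWith("B-");
            boolean continues = tag.startsWith("I-") && type != null
                    && tag.substring(2).equals(type);
            // Any tag that does not continue the open chunk closes it.
            if (!continues && type != null) {
                spans.add(new ChunkSpan(start, i, type));
                type = null;
            }
            if (begins) {
                start = i;
                type = tag.substring(2);
            }
        }
        // Close a chunk that runs to the end of the sentence.
        if (type != null) {
            spans.add(new ChunkSpan(start, chunkTags.length, type));
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tags = {"B-NP", "I-NP", "B-VP", "B-NP", "O"};
        for (ChunkSpan span : toSpans(tags)) {
            System.out.println(span);
        }
    }
}
```

In this sketch the end index is exclusive, so a span covering tokens 0 and 1 is reported as the range 0 to 2.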
    • topKSequences

      public opennlp.tools.util.Sequence[] topKSequences(String[] sentence, String[] tags)
      Specified by:
      topKSequences in interface opennlp.tools.chunker.Chunker
    • topKSequences

      public opennlp.tools.util.Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)
      Specified by:
      topKSequences in interface opennlp.tools.chunker.Chunker
    • probs

      public void probs(double[] probs)
      Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk. The specified array should be at least as large as the number of tokens in the previous call to chunk.
      Parameters:
      probs - An array used to hold the probabilities of the last decoded sequence.
    • probs

      public double[] probs()
      Returns the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk(String[], String[]).
      Specified by:
      probs in interface opennlp.tools.ml.Probabilistic
      Returns:
      An array with the same number of probabilities as tokens when chunk(String[], String[]) was last called.
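
Because probs() yields one probability per token of the last chunked sentence, a caller can use it to flag uncertain chunk tags. The following library-free sketch shows one way to do that. The class LowConfidence, the method below, and the threshold value are hypothetical and chosen for illustration; the input array stands in for the result of probs().

```java
import java.util.ArrayList;
import java.util.List;

public class LowConfidence {

    /** Returns the indices of tokens whose tag probability is below the threshold. */
    public static List<Integer> below(double[] probs, double threshold) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            if (probs[i] < threshold) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        // Stand-in for the array returned by probs() after a call to chunk(...).
        double[] probs = {0.98, 0.42, 0.91};
        System.out.println(below(probs, 0.5));
    }
}
```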
    • train

      public static ChunkerModel train(String lang, opennlp.tools.util.ObjectStream<opennlp.tools.chunker.ChunkSample> in, opennlp.tools.util.TrainingParameters mlParams, ChunkerFactory factory) throws IOException
      Trains a ChunkerModel with the given parameters.
      Parameters:
      lang - The ISO-conformant language code.
      in - The ObjectStream of ChunkSample used as input for training.
      mlParams - The TrainingParameters for the context of the training.
      factory - The ChunkerFactory for creating related objects defined via mlParams.
      Returns:
      A valid, trained ChunkerModel instance.
      Throws:
      IOException - Thrown if an I/O error occurs during training.
      IllegalArgumentException - Thrown if the specified TrainerFactory.TrainerType is not supported.