Class ChunkerME

java.lang.Object
opennlp.tools.chunker.ChunkerME
All Implemented Interfaces:
Chunker

public class ChunkerME extends Object implements Chunker
This class implements a maximum-entropy-based Chunker. It can be used to find flat structures, such as noun phrases or named entities, in sequence inputs.
  • Field Details

    • DEFAULT_BEAM_SIZE

      public static final int DEFAULT_BEAM_SIZE
  • Constructor Details

  • Method Details

    • chunk

      public String[] chunk(String[] toks, String[] tags)
      Description copied from interface: Chunker
      Generates chunk tags for the given sequence and returns them in an array.
      Specified by:
      chunk in interface Chunker
      Parameters:
      toks - an array of the tokens or words of the sequence.
      tags - an array of the POS tags of the sequence.
      Returns:
      an array of chunk tags, one for each token in the sequence.
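
      The contract above implies three parallel arrays: tokens, POS tags, and the returned chunk tags. The following is a minimal sketch of lining them up for inspection. It does not call OpenNLP; the class name ChunkTagAlignment, the helper align, and the BIO-style values such as B-NP are illustrative assumptions standing in for the result of chunk(toks, tags).

```java
import java.util.StringJoiner;

public class ChunkTagAlignment {

    /** Joins three parallel arrays into one "token POS chunk" line per token. */
    public static String align(String[] toks, String[] tags, String[] chunks) {
        if (toks.length != tags.length || toks.length != chunks.length) {
            throw new IllegalArgumentException(
                "toks, tags and chunks must have the same length");
        }
        StringJoiner lines = new StringJoiner("\n");
        for (int i = 0; i < toks.length; i++) {
            lines.add(toks[i] + " " + tags[i] + " " + chunks[i]);
        }
        return lines.toString();
    }

    public static void main(String[] args) {
        String[] toks = {"He", "reckons", "the", "deficit"};
        String[] tags = {"PRP", "VBZ", "DT", "NN"};
        // Hand-written stand-in for what chunk(toks, tags) might return (assumed BIO tags).
        String[] chunks = {"B-NP", "B-VP", "B-NP", "I-NP"};
        System.out.println(align(toks, tags, chunks));
    }
}
```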
    • chunkAsSpans

      public Span[] chunkAsSpans(String[] toks, String[] tags)
      Description copied from interface: Chunker
      Generates tagged chunk spans for the given sequence and returns them in a span array.
      Specified by:
      chunkAsSpans in interface Chunker
      Parameters:
      toks - an array of the tokens or words of the sequence.
      tags - an array of the POS tags of the sequence.
      Returns:
      an array of spans with chunk tags, one for each chunk in the sequence.
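
      To illustrate how per-token chunk tags relate to chunk spans, here is a minimal sketch that groups tags into typed token ranges. It is not OpenNLP's implementation: it assumes BIO-style tags (B-X, I-X, O) and uses a hypothetical ChunkSpan class with an inclusive start and exclusive end index in place of the library's Span type.

```java
import java.util.ArrayList;
import java.util.List;

public class BioToSpans {

    /** Hypothetical stand-in for a span: token range [start, end) plus a chunk type. */
    public static final class ChunkSpan {
        public final int start;
        public final int end;
        public final String type;

        public ChunkSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /** Groups BIO chunk tags into typed spans over token indexes. */
    public static List<ChunkSpan> toSpans(String[] chunkTags) {
        List<ChunkSpan> spans = new ArrayList<>();
        int start = -1;
        String type = null; // type of the currently open chunk, or null if none
        for (int i = 0; i < chunkTags.length; i++) {
            String tag = chunkTags[i];
            if (type != null && tag.equals("I-" + type)) {
                continue; // still inside the open chunk
            }
            if (type != null) {
                spans.add(new ChunkSpan(start, i, type)); // close the open chunk
                type = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                // A stray I-X without a matching open chunk starts a new chunk here.
                start = i;
                type = tag.substring(2);
            }
        }
        if (type != null) {
            spans.add(new ChunkSpan(start, chunkTags.length, type));
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] chunks = {"B-NP", "B-VP", "B-NP", "I-NP", "O"};
        System.out.println(toSpans(chunks));
    }
}
```

      Working with spans rather than raw tags avoids re-parsing tag prefixes in calling code, which is the usual reason to prefer chunkAsSpans over chunk.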
    • topKSequences

      public Sequence[] topKSequences(String[] sentence, String[] tags)
      Description copied from interface: Chunker
      Computes the top k chunk sequences for the specified sentence with the specified POS tags.
      Specified by:
      topKSequences in interface Chunker
      Parameters:
      sentence - The tokens of the sentence.
      tags - The POS tags for the specified sentence.
      Returns:
      the top k chunk sequences for the specified sentence.
    • topKSequences

      public Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)
      Description copied from interface: Chunker
      Computes the top k chunk sequences for the specified sentence with the specified POS tags, discarding sequences that score below minSequenceScore.
      Specified by:
      topKSequences in interface Chunker
      Parameters:
      sentence - The tokens of the sentence.
      tags - The POS tags for the specified sentence.
      minSequenceScore - A lower bound on the score of a returned sequence.
      Returns:
      the top k chunk sequences for the specified sentence.
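
      The selection semantics of a top-k query with a lower score bound can be sketched as follows. This is an illustration only and does not reproduce how the library searches for sequences. The Candidate class, the topK helper, the log-probability score scale, and the choice of a strict comparison against the bound are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TopKSelection {

    /** Hypothetical scored candidate: one chunk tag per token plus a score. */
    public static final class Candidate {
        public final List<String> outcomes;
        public final double score;

        public Candidate(List<String> outcomes, double score) {
            this.outcomes = outcomes;
            this.score = score;
        }
    }

    /** Keeps candidates scoring above the bound and returns the k best, highest first. */
    public static List<Candidate> topK(List<Candidate> candidates, int k, double minSequenceScore) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (c.score > minSequenceScore) { // strict bound is an assumption
                kept.add(c);
            }
        }
        kept.sort(Comparator.comparingDouble((Candidate c) -> c.score).reversed());
        return new ArrayList<>(kept.subList(0, Math.min(k, kept.size())));
    }

    public static void main(String[] args) {
        // Hand-written scores, assumed to be log-probabilities (higher is better).
        List<Candidate> candidates = Arrays.asList(
            new Candidate(Arrays.asList("B-NP", "I-NP"), -0.5),
            new Candidate(Arrays.asList("B-NP", "B-NP"), -2.0),
            new Candidate(Arrays.asList("O", "B-NP"), -9.0));
        for (Candidate c : topK(candidates, 2, -3.0)) {
            System.out.println(c.score + " " + c.outcomes);
        }
    }
}
```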
    • probs

      public void probs(double[] probs)
      Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk. The specified array should be at least as large as the number of tokens in the previous call to chunk.
      Parameters:
      probs - An array used to hold the probabilities of the last decoded sequence.
    • probs

      public double[] probs()
      Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to chunk(String[], String[]).
      Returns:
      An array with the same number of probabilities as tokens when chunk(String[], String[]) was last called.
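
      One common use of per-token probabilities is flagging tokens whose chunk tag was assigned with low confidence. The sketch below works on a plain double array and does not call OpenNLP. The class name, the helper below, and the sample values are hypothetical stand-ins for an array filled by probs(double[]). The length check mirrors the requirement that the array be at least as large as the number of tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class LowConfidenceTokens {

    /**
     * Returns the indexes of the first tokenCount entries whose probability
     * is below the threshold. The array may be longer than tokenCount.
     */
    public static List<Integer> below(double[] probs, int tokenCount, double threshold) {
        if (probs.length < tokenCount) {
            throw new IllegalArgumentException(
                "probs must hold at least one entry per token");
        }
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < tokenCount; i++) {
            if (probs[i] < threshold) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        String[] toks = {"He", "reckons", "the", "deficit"};
        // Hand-written values; the buffer is deliberately larger than the sentence.
        double[] probs = {0.99, 0.42, 0.97, 0.88, 0.0, 0.0};
        for (int i : below(probs, toks.length, 0.5)) {
            System.out.println(toks[i] + " " + probs[i]);
        }
    }
}
```

      Passing a reusable, oversized buffer as in main avoids allocating a new array per sentence, which is the apparent motivation for the probs(double[]) overload.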
    • train

      public static ChunkerModel train(String lang, ObjectStream<ChunkSample> in, TrainingParameters mlParams, ChunkerFactory factory) throws IOException
      Trains a ChunkerModel with the given parameters.
      Parameters:
      lang - The ISO-conformant language code.
      in - The ObjectStream of ChunkSample used as input for training.
      mlParams - The TrainingParameters for the context of the training.
      factory - The ChunkerFactory for creating related objects defined via mlParams.
      Returns:
      A valid, trained ChunkerModel instance.
      Throws:
      IOException - Thrown if an I/O error occurs.
      IllegalArgumentException - Thrown if the specified TrainerFactory.TrainerType is not supported.