Class AbstractBottomUpParser

java.lang.Object
opennlp.tools.parser.AbstractBottomUpParser
All Implemented Interfaces:
Parser
Direct Known Subclasses:
Parser (opennlp.tools.parser.chunking), Parser (opennlp.tools.parser.treeinsert)

public abstract class AbstractBottomUpParser extends Object implements Parser
Abstract class which contains the code to tag and chunk parses for bottom-up parsing, and leaves the implementation of advancing and completing parses to extending classes.

Note:
The nodes within the returned parses are shared with other parses, so their parent node references will not be consistent with their child node references. setParents(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents(Parse) can invalidate the results of earlier calls.
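The effect described in this note can be illustrated with a minimal sketch. It assumes a hypothetical stand-in class, SharedNode, and is not the OpenNLP Parse implementation; it only shows why a node shared by two trees can hold a correct parent reference for one of them at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse node; not the OpenNLP Parse class.
class SharedNode {
    final String label;
    final List<SharedNode> children = new ArrayList<>();
    SharedNode parent; // only valid for the tree that last called setParents

    SharedNode(String label, SharedNode... kids) {
        this.label = label;
        for (SharedNode k : kids) {
            children.add(k);
        }
    }

    // Walks top-down and points every descendant's parent reference at this tree.
    static void setParents(SharedNode root) {
        for (SharedNode child : root.children) {
            child.parent = root;
            setParents(child);
        }
    }
}

class SharedNodeDemo {
    public static void main(String[] args) {
        SharedNode np = new SharedNode("NP");          // node shared by two parses
        SharedNode parseA = new SharedNode("S", np);
        SharedNode parseB = new SharedNode("FRAG", np);

        SharedNode.setParents(parseA);
        System.out.println(np.parent.label);           // prints "S"
        SharedNode.setParents(parseB);                 // invalidates the earlier call for np
        System.out.println(np.parent.label);           // prints "FRAG"
    }
}
```

In practice this suggests calling setParents on a parse immediately before walking its parent references, and not caching those references across calls.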

  • Field Details

    • defaultBeamSize

      public static final int defaultBeamSize
      The default beam size used if no beam size is given.
    • defaultAdvancePercentage

      public static final double defaultAdvancePercentage
      The default amount of probability mass required of advanced outcomes.
    • TOP_NODE

      public static final String TOP_NODE
      The label for the top node.
    • INC_NODE

      public static final String INC_NODE
      The label for the top node if the parse is incomplete.
    • TOK_NODE

      public static final String TOK_NODE
      The label for a token node.
    • START

      public static final String START
      Prefix for outcomes starting a constituent.
    • CONT

      public static final String CONT
      Prefix for outcomes continuing a constituent.
    • OTHER

      public static final String OTHER
      Outcome for a token which is not contained in a basal constituent.
    • COMPLETE

      public static final String COMPLETE
      Outcome used when a constituent is complete.
    • INCOMPLETE

      public static final String INCOMPLETE
      Outcome used when a constituent is incomplete.
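The START, CONT and OTHER outcomes describe how tokens are grouped into basal constituents. The following sketch shows one way such a sequence of outcomes could be decoded. The constant values "S-", "C-" and "O" are assumptions made for illustration, since this page does not list them, and the class ChunkOutcomeSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkOutcomeSketch {
    // Assumed values for illustration; check the constant field values of your OpenNLP version.
    static final String START = "S-";
    static final String CONT = "C-";
    static final String OTHER = "O";

    // Groups tokens into labelled chunks such as "NP[the dog]".
    // Tokens that do not start or continue a chunk are emitted as they are.
    static List<String> group(String[] tokens, String[] outcomes) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null;
        for (int i = 0; i < tokens.length; i++) {
            String outcome = outcomes[i];
            if (outcome.startsWith(CONT) && current != null) {
                current.append(' ').append(tokens[i]);   // extend the open chunk
                continue;
            }
            if (current != null) {                       // close the open chunk
                result.add(current.append(']').toString());
                current = null;
            }
            if (outcome.startsWith(START)) {             // open a new chunk
                String label = outcome.substring(START.length());
                current = new StringBuilder(label).append('[').append(tokens[i]);
            } else {
                result.add(tokens[i]);                   // OTHER, or a stray continuation
            }
        }
        if (current != null) {
            result.add(current.append(']').toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "dog", "barks", "."};
        String[] outcomes = {START + "NP", CONT + "NP", START + "VP", OTHER};
        System.out.println(group(tokens, outcomes));     // prints [NP[the dog], VP[barks], .]
    }
}
```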
  • Constructor Details

    • AbstractBottomUpParser

      public AbstractBottomUpParser(POSTagger tagger, Chunker chunker, HeadRules headRules, int beamSize, double advancePercentage)
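The beamSize and advancePercentage arguments bound how many alternatives are carried forward at each step. The sketch below is a simplified illustration of that idea, not the algorithm used by the library: it keeps the most probable outcomes until their cumulative probability reaches advancePercentage or beamSize outcomes have been kept. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BeamAdvanceSketch {
    // Returns the indices of the most probable outcomes, best first, stopping once
    // their cumulative probability reaches advancePercentage or beamSize outcomes are kept.
    static List<Integer> select(double[] probs, int beamSize, double advancePercentage) {
        Integer[] order = new Integer[probs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending probability.
        Arrays.sort(order, (a, b) -> Double.compare(probs[b], probs[a]));

        List<Integer> kept = new ArrayList<>();
        double mass = 0.0;
        for (int idx : order) {
            if (kept.size() >= beamSize || mass >= advancePercentage) {
                break;
            }
            kept.add(idx);
            mass += probs[idx];
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] probs = {0.1, 0.6, 0.25, 0.05};
        System.out.println(select(probs, 20, 0.8)); // prints [1, 2]
        System.out.println(select(probs, 1, 0.8));  // prints [1]
    }
}
```

A larger beam or a higher percentage explores more alternatives at the cost of more work per sentence.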
  • Method Details

    • setErrorReporting

      public void setErrorReporting(boolean errorReporting)
      Specifies whether the parser should report when it was unable to find a parse for a particular sentence.
      Parameters:
      errorReporting - true if unparsed sentences should be reported, false otherwise.
    • setParents

      public static void setParents(Parse p)
      Assigns parent references for the specified parse so that they are consistent with the children references.
      Parameters:
      p - The Parse whose parent references need to be assigned.
    • collapsePunctuation

      public static Parse[] collapsePunctuation(Parse[] chunks, Set<String> punctSet)
      Removes the punctuation from the specified set of chunks, adds it to the parses adjacent to the punctuation, and returns a new array of parses with the punctuation removed.
      Parameters:
      chunks - A set of parses.
      punctSet - The set of punctuation to be removed.
      Returns:
      Array of parses which is a subset of chunks with punctuation removed.
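As a rough sketch of the behaviour described for collapsePunctuation, the following code removes punctuation chunks from a sequence and records each removed mark on its neighbours. PunctChunk is a hypothetical stand-in for Parse, and attaching the mark to both the preceding and the following chunk is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a chunk-level parse; not the OpenNLP Parse class.
class PunctChunk {
    final String text;
    final List<String> previousPunct = new ArrayList<>();
    final List<String> nextPunct = new ArrayList<>();

    PunctChunk(String text) {
        this.text = text;
    }
}

class CollapsePunctuationSketch {
    // Returns the non-punctuation chunks in order. The kept chunks are the same
    // objects as in the input and are annotated in place with adjacent punctuation.
    static List<PunctChunk> collapse(List<PunctChunk> chunks, Set<String> punctSet) {
        List<PunctChunk> kept = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // punctuation seen since the last kept chunk
        for (PunctChunk c : chunks) {
            if (punctSet.contains(c.text)) {
                if (!kept.isEmpty()) {
                    kept.get(kept.size() - 1).nextPunct.add(c.text);
                }
                pending.add(c.text);
            } else {
                c.previousPunct.addAll(pending);
                pending.clear();
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<PunctChunk> chunks = Arrays.asList(
            new PunctChunk("Hello"), new PunctChunk(","),
            new PunctChunk("world"), new PunctChunk("!"));
        Set<String> punct = new HashSet<>(Arrays.asList(",", "!"));
        List<PunctChunk> kept = collapse(chunks, punct);
        System.out.println(kept.size());               // prints 2
        System.out.println(kept.get(1).previousPunct); // prints [,]
    }
}
```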
    • parse

      public Parse[] parse(Parse tokens, int numParses)
      Description copied from interface: Parser
      Returns the specified number of parses or fewer for the specified tokens.

      Note: The nodes within the returned parses are shared with other parses, so their parent node references will not be consistent with their child node references.

      setParents(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents(Parse) can invalidate the results of earlier calls.

      Specified by:
      parse in interface Parser
      Parameters:
      tokens - A Parse containing the tokens with a single parent node.
      numParses - The number of parses desired.
      Returns:
      The specified number of parses, or fewer, for the specified tokens.
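The "specified number of parses or fewer" contract can be pictured as taking the best entries from a ranked collection of completed candidates. The sketch below assumes a hypothetical Candidate type that carries a log probability; it illustrates only the contract and says nothing about how the parser actually produces or ranks its candidates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopParsesSketch {
    // Hypothetical candidate: a bracketed tree string with a log probability.
    static final class Candidate {
        final String tree;
        final double logProb;

        Candidate(String tree, double logProb) {
            this.tree = tree;
            this.logProb = logProb;
        }
    }

    // Returns at most numParses candidates, most probable first;
    // fewer are returned if fewer complete candidates exist.
    static List<Candidate> best(List<Candidate> complete, int numParses) {
        PriorityQueue<Candidate> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Candidate c) -> -c.logProb));
        heap.addAll(complete);
        List<Candidate> out = new ArrayList<>();
        while (out.size() < numParses && !heap.isEmpty()) {
            out.add(heap.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Candidate> found = Arrays.asList(
            new Candidate("(TOP (S ...))", -4.2),
            new Candidate("(TOP (FRAG ...))", -7.9));
        System.out.println(best(found, 3).size()); // prints 2
        System.out.println(best(found, 1).size()); // prints 1
    }
}
```

Callers should therefore check the length of the returned array and not assume it equals numParses.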
    • parse

      public Parse parse(Parse tokens)
      Description copied from interface: Parser
      Returns a Parse for the specified Parse of tokens.
      Specified by:
      parse in interface Parser
      Parameters:
      tokens - The root node of a flat parse containing only tokens.
      Returns:
      A full parse of the specified tokens or the flat chunks of the tokens if a full parse could not be found.
    • buildDictionary

      public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, TrainingParameters params) throws IOException
      Creates an n-gram Dictionary from the specified data stream using the specified head rules and the cut-off given in the training parameters.
      Parameters:
      data - The data stream of parses.
      rules - The HeadRules for the parses.
      params - The TrainingParameters which can contain a cutoff, the minimum number of entries required for the n-gram to be saved as part of the Dictionary.
      Returns:
      A Dictionary instance.
      Throws:
      IOException
    • buildDictionary

      public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, int cutoff) throws IOException
      Creates an n-gram Dictionary from the specified data stream using the specified HeadRules and cut-off.
      Parameters:
      data - The data stream of parses.
      rules - The HeadRules for the parses.
      cutoff - The minimum number of entries required for the n-gram to be saved as part of the dictionary.
      Returns:
      A Dictionary instance.
      Throws:
      IOException
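The role of the cutoff parameter can be shown with plain Java collections. The sketch below counts word n-grams in tokenized sentences and keeps those seen at least cutoff times. It is a simplification: the real method works on Parse objects and HeadRules, and the class name, the maxN parameter and the space-joined n-gram representation are assumptions of this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class NGramCutoffSketch {
    // Collects every n-gram of 1..maxN tokens that occurs at least cutoff times.
    static Set<String> build(List<String[]> sentences, int maxN, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] tokens : sentences) {
            for (int start = 0; start < tokens.length; start++) {
                StringBuilder gram = new StringBuilder();
                for (int n = 1; n <= maxN && start + n <= tokens.length; n++) {
                    if (n > 1) {
                        gram.append(' ');
                    }
                    gram.append(tokens[start + n - 1]);
                    counts.merge(gram.toString(), 1, Integer::sum);
                }
            }
        }
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= cutoff) {   // drop n-grams below the cut-off
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> data = Arrays.asList(
            new String[] {"the", "dog"},
            new String[] {"the", "cat"});
        System.out.println(build(data, 2, 2)); // prints [the]
    }
}
```

Raising the cutoff shrinks the dictionary to the n-grams that recur in the training data.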