Class AbstractBottomUpParser

  • All Implemented Interfaces:
    Parser
    Direct Known Subclasses:
    opennlp.tools.parser.chunking.Parser, opennlp.tools.parser.treeinsert.Parser

    public abstract class AbstractBottomUpParser
    extends Object
    implements Parser
    Abstract class which contains code to tag and chunk parses for bottom-up parsing and leaves the implementation of advancing and completing parses to extending classes.

    Note:
    The nodes within the returned parses are shared with other parses, so their parent node references will not be consistent with their child node references. setParents(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents(Parse) can invalidate the results of earlier calls.

    • Constructor Detail

      • AbstractBottomUpParser

        public AbstractBottomUpParser(POSTagger tagger,
                                      Chunker chunker,
                                      HeadRules headRules,
                                      int beamSize,
                                      double advancePercentage)
    • Method Detail

      • setErrorReporting

        public void setErrorReporting(boolean errorReporting)
        Specifies whether the parser should report when it was unable to find a parse for a particular sentence.
        Parameters:
        errorReporting - If true then un-parsed sentences are reported, false otherwise.
      • setParents

        public static void setParents(Parse p)
        Assigns parent references for the specified parse so that they are consistent with the children references.
        Parameters:
        p - The parse whose parent references need to be assigned.
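        The effect of node sharing on parent references can be illustrated with a minimal sketch. The Node class and the recursive setParents below are hypothetical stand-ins written for this illustration; they are not the OpenNLP Parse class or the library's implementation, and they only assume that a node keeps a list of children and a single parent reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse node; NOT the OpenNLP Parse class.
final class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node parent; // may be stale when the node is shared between trees

    Node(String label, Node... kids) {
        this.label = label;
        for (Node kid : kids) {
            children.add(kid);
        }
    }
}

final class SetParentsSketch {
    // Walks the tree top-down and points each child's parent reference
    // at the node that lists it as a child.
    static void setParents(Node root) {
        for (Node child : root.children) {
            child.parent = root;
            setParents(child);
        }
    }

    public static void main(String[] args) {
        Node shared = new Node("NP");        // one node object used by two trees
        Node first = new Node("S1", shared);
        Node second = new Node("S2", shared);

        setParents(first);
        System.out.println(shared.parent.label); // prints "S1"

        setParents(second);
        System.out.println(shared.parent.label); // prints "S2"; the first call's result is overwritten
    }
}
```

        Because a shared node holds only one parent reference, the parents are consistent only with the parse passed to the most recent call.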
      • collapsePunctuation

        public static Parse[] collapsePunctuation(Parse[] chunks,
                                                  Set<String> punctSet)
        Removes the punctuation named in the specified set from the specified chunks, attaches it to the parses adjacent to that punctuation, and returns a new array of parses with the punctuation removed.
        Parameters:
        chunks - A set of parses.
        punctSet - The set of punctuation which is to be removed.
        Returns:
        An array of parses which is a subset of chunks with punctuation removed.
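        The behaviour described above can be sketched roughly as follows. The Chunk class, its previousPunctuation and nextPunctuation lists, and the collapse method are hypothetical names chosen for this illustration, and matching punctuation by the chunk's type is an assumption; this is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a chunk-level parse; NOT the OpenNLP Parse class.
final class Chunk {
    final String type;
    final String text;
    final List<Chunk> previousPunctuation = new ArrayList<>();
    final List<Chunk> nextPunctuation = new ArrayList<>();

    Chunk(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

final class CollapsePunctuationSketch {
    // Returns the non-punctuation chunks in order. Each punctuation chunk is
    // recorded on the nearest kept chunk to its left (as "next" punctuation)
    // and on the nearest kept chunk to its right (as "previous" punctuation).
    static Chunk[] collapse(Chunk[] chunks, Set<String> punctSet) {
        List<Chunk> kept = new ArrayList<>();
        List<Chunk> pending = new ArrayList<>(); // punctuation seen since the last kept chunk
        Chunk lastKept = null;
        for (Chunk chunk : chunks) {
            if (punctSet.contains(chunk.type)) {
                if (lastKept != null) {
                    lastKept.nextPunctuation.add(chunk);
                }
                pending.add(chunk);
            } else {
                chunk.previousPunctuation.addAll(pending);
                pending.clear();
                kept.add(chunk);
                lastKept = chunk;
            }
        }
        return kept.toArray(new Chunk[0]);
    }

    public static void main(String[] args) {
        Set<String> punct = new HashSet<>(Arrays.asList(",", "."));
        Chunk[] chunks = {
            new Chunk("NP", "The dog"),
            new Chunk(",", ","),
            new Chunk("VP", "barked"),
            new Chunk(".", ".")
        };
        Chunk[] collapsed = collapse(chunks, punct);
        System.out.println(collapsed.length);                    // prints 2
        System.out.println(collapsed[1].nextPunctuation.size()); // prints 1
    }
}
```

        In this sketch the returned array contains the same chunk objects as the input, so the punctuation is dropped from the sequence but still reachable from its neighbours.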
      • parse

        public Parse[] parse(Parse tokens,
                             int numParses)
        Description copied from interface: Parser
        Returns the specified number of parses or fewer for the specified tokens.
        Note: The nodes within the returned parses are shared with other parses, so their parent node references will not be consistent with their child node references. setParents(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents(Parse) can invalidate the results of earlier calls.
        Specified by:
        parse in interface Parser
        Parameters:
        tokens - A parse containing the tokens with a single parent node.
        numParses - The number of parses desired.
        Returns:
        The specified number of parses, or fewer, for the specified tokens.
      • parse

        public Parse parse(Parse tokens)
        Description copied from interface: Parser
        Returns a parse for the specified parse of tokens.
        Specified by:
        parse in interface Parser
        Parameters:
        tokens - The root node of a flat parse containing only tokens.
        Returns:
        A full parse of the specified tokens, or the flat chunks of the tokens if a full parse could not be found.
      • buildDictionary

        public static Dictionary buildDictionary(ObjectStream<Parse> data,
                                                 HeadRules rules,
                                                 TrainingParameters params)
                                          throws IOException
        Creates an n-gram dictionary from the specified data stream using the specified head rules and the cut-off, if any, contained in the specified training parameters.
        Parameters:
        data - The data stream of parses.
        rules - The head rules for the parses.
        params - can contain a cutoff, the minimum number of entries required for the n-gram to be saved as part of the dictionary.
        Returns:
        A dictionary object.
        Throws:
        IOException
      • buildDictionary

        public static Dictionary buildDictionary(ObjectStream<Parse> data,
                                                 HeadRules rules,
                                                 int cutoff)
                                          throws IOException
        Creates an n-gram dictionary from the specified data stream using the specified head rules and the specified cut-off.
        Parameters:
        data - The data stream of parses.
        rules - The head rules for the parses.
        cutoff - The minimum number of entries required for the n-gram to be saved as part of the dictionary.
        Returns:
        A dictionary object.
        Throws:
        IOException
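        The role of the cut-off can be illustrated with a simplified sketch. It counts plain token n-grams and keeps those seen at least cutoff times; it ignores head rules entirely, and the class name, the build method, the maxLength parameter, and the "at least" reading of the cut-off are assumptions made for this illustration, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NGramCutoffSketch {
    // Counts every n-gram of length 1..maxLength in the sentences and keeps
    // only those that occur at least `cutoff` times.
    static Set<List<String>> build(List<List<String>> sentences, int maxLength, int cutoff) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (List<String> sentence : sentences) {
            for (int start = 0; start < sentence.size(); start++) {
                for (int len = 1; len <= maxLength && start + len <= sentence.size(); len++) {
                    // Copy the sublist so the key does not depend on the sentence list.
                    List<String> gram = new ArrayList<>(sentence.subList(start, start + len));
                    counts.merge(gram, 1, Integer::sum);
                }
            }
        }
        Set<List<String>> dictionary = new LinkedHashSet<>();
        for (Map.Entry<List<String>, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                dictionary.add(entry.getKey());
            }
        }
        return dictionary;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("the", "dog", "barked"),
            Arrays.asList("the", "dog", "slept"));
        // With a cut-off of 2, only n-grams present in both sentences survive.
        System.out.println(build(sentences, 2, 2)); // prints [[the], [the, dog], [dog]]
    }
}
```

        Raising the cut-off shrinks the dictionary to the more frequent n-grams; a cut-off of 1 in this sketch keeps every n-gram that occurs at all.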