Package opennlp.tools.parser
Class AbstractBottomUpParser
- java.lang.Object
  - opennlp.tools.parser.AbstractBottomUpParser
- All Implemented Interfaces: Parser
public abstract class AbstractBottomUpParser extends Object implements Parser
Abstract class which contains code to tag and chunk parses for bottom-up parsing and leaves the implementation of advancing parses and completing parses to extending classes.
Note: The nodes within the returned parses are shared with other parses and therefore their parent node references will not be consistent with their child node references. setParents can be used to make the parents consistent with a particular parse, but subsequent calls to setParents can invalidate the results of earlier calls.
-
-
Field Summary
Modifier and Type | Field | Description
static String | COMPLETE | Outcome used when a constituent is complete.
static String | CONT | Prefix for outcomes continuing a constituent.
static double | defaultAdvancePercentage | The default amount of probability mass required of advanced outcomes.
static int | defaultBeamSize | The default beam size used if no beam size is given.
static String | INC_NODE | The label for the top of an incomplete node.
static String | INCOMPLETE | Outcome used when a constituent is incomplete.
static String | OTHER | Outcome for a token which is not contained in a basal constituent.
static String | START | Prefix for outcomes starting a constituent.
static String | TOK_NODE | The label for a token node.
static String | TOP_NODE | The label for the top node.
-
Constructor Summary
AbstractBottomUpParser(POSTagger tagger, Chunker chunker, HeadRules headRules, int beamSize, double advancePercentage)
-
Method Summary
Modifier and Type | Method | Description
static Dictionary | buildDictionary(ObjectStream<Parse> data, HeadRules rules, int cutoff) | Creates an n-gram Dictionary from the specified data stream using HeadRules and the specified cut-off.
static Dictionary | buildDictionary(ObjectStream<Parse> data, HeadRules rules, TrainingParameters params) | Creates an n-gram Dictionary from the specified data stream using the specified head rules and cut-off.
static Parse[] | collapsePunctuation(Parse[] chunks, Set<String> punctSet) | Removes the punctuation from the specified set of chunks, adds it to the parses adjacent to the punctuation, and returns a new array of parses with the punctuation removed.
Parse | parse(Parse tokens) |
Parse[] | parse(Parse tokens, int numParses) | Returns the specified number of parses or fewer for the specified tokens.
void | setErrorReporting(boolean errorReporting) | Specifies whether the parser should report when it was unable to find a parse for a particular sentence.
static void | setParents(Parse p) | Assigns parent references for the specified parse so that they are consistent with the children references.
-
-
-
Field Detail
-
defaultBeamSize
public static final int defaultBeamSize
The default beam size used if no beam size is given.
- See Also: Constant Field Values
-
defaultAdvancePercentage
public static final double defaultAdvancePercentage
The default amount of probability mass required of advanced outcomes.
- See Also: Constant Field Values
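
The two defaults above bound how many candidates survive each step. The following is a minimal sketch of one plausible reading, not the OpenNLP implementation: outcomes are taken in order of decreasing probability until their cumulative mass reaches the advance percentage, and never more than the beam size are kept. `ScoredOutcome`, `AdvanceSketch` and `select` are hypothetical names, and the values 20 and 0.95 are illustrative assumptions; the real defaults are listed under Constant Field Values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in, not part of OpenNLP: a labelled outcome with a probability.
final class ScoredOutcome {
    final String label;
    final double prob;

    ScoredOutcome(String label, double prob) {
        this.label = label;
        this.prob = prob;
    }
}

final class AdvanceSketch {
    /**
     * Keeps the most probable outcomes until their cumulative probability
     * reaches advancePercentage, and never keeps more than beamSize of them.
     */
    static List<ScoredOutcome> select(List<ScoredOutcome> outcomes,
                                      int beamSize, double advancePercentage) {
        List<ScoredOutcome> sorted = new ArrayList<>(outcomes);
        sorted.sort(Comparator.comparingDouble((ScoredOutcome o) -> o.prob).reversed());

        List<ScoredOutcome> kept = new ArrayList<>();
        double mass = 0.0;
        for (ScoredOutcome o : sorted) {
            if (kept.size() >= beamSize || mass >= advancePercentage) {
                break; // beam is full, or enough probability mass is covered
            }
            kept.add(o);
            mass += o.prob;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<ScoredOutcome> outcomes = Arrays.asList(
            new ScoredOutcome("S-NP", 0.70),
            new ScoredOutcome("C-NP", 0.20),
            new ScoredOutcome("O", 0.06),
            new ScoredOutcome("S-VP", 0.04));
        // Assumed values: beam size 20, advance percentage 0.95.
        for (ScoredOutcome o : select(outcomes, 20, 0.95)) {
            System.out.println(o.label + " " + o.prob);
        }
    }
}
```

With these inputs the first three outcomes are kept, because the first two together cover less than 0.95 of the mass. Lowering the beam size to 2 would cut the list to two entries regardless of mass.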
-
TOP_NODE
public static final String TOP_NODE
The label for the top node.
- See Also: Constant Field Values
-
INC_NODE
public static final String INC_NODE
The label for the top of an incomplete node.
- See Also: Constant Field Values
-
TOK_NODE
public static final String TOK_NODE
The label for a token node.
- See Also: Constant Field Values
-
START
public static final String START
Prefix for outcomes starting a constituent.
- See Also: Constant Field Values
-
CONT
public static final String CONT
Prefix for outcomes continuing a constituent.
- See Also: Constant Field Values
-
OTHER
public static final String OTHER
Outcome for a token which is not contained in a basal constituent.
- See Also: Constant Field Values
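
To illustrate how the START, CONT and OTHER outcomes relate to basal constituents, here is a small sketch that groups tokens by their per-token outcomes. It does not use OpenNLP; `ChunkOutcomeSketch` and `group` are hypothetical names, and the literal values "S-", "C-" and "O" are assumptions made for the example (the actual values are listed under Constant Field Values).

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkOutcomeSketch {
    // Assumed literal values, for illustration only.
    static final String START = "S-";
    static final String CONT = "C-";
    static final String OTHER = "O";

    /** Groups tokens into "TYPE[tok tok ...]" strings; other tokens stand alone. */
    static List<String> group(String[] tokens, String[] outcomes) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // the constituent being built, if any
        for (int i = 0; i < tokens.length; i++) {
            String outcome = outcomes[i];
            if (outcome.startsWith(CONT) && current != null) {
                current.append(' ').append(tokens[i]); // extend the open constituent
                continue;
            }
            if (current != null) {
                result.add(current.append(']').toString()); // close the open constituent
                current = null;
            }
            if (outcome.startsWith(START)) {
                current = new StringBuilder(outcome.substring(START.length()))
                    .append('[').append(tokens[i]);
            } else {
                // OTHER, or a continuation with nothing to continue
                result.add(tokens[i]);
            }
        }
        if (current != null) {
            result.add(current.append(']').toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "dog", "barked", "."};
        String[] outcomes = {"S-NP", "C-NP", "S-VP", "O"};
        System.out.println(group(tokens, outcomes)); // prints [NP[The dog], VP[barked], .]
    }
}
```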
-
COMPLETE
public static final String COMPLETE
Outcome used when a constituent is complete.
- See Also: Constant Field Values
-
INCOMPLETE
public static final String INCOMPLETE
Outcome used when a constituent is incomplete.
- See Also: Constant Field Values
-
-
Method Detail
-
setErrorReporting
public void setErrorReporting(boolean errorReporting)
Specifies whether the parser should report when it was unable to find a parse for a particular sentence.
- Parameters:
errorReporting - true if un-parsed sentences should be reported, false otherwise.
-
setParents
public static void setParents(Parse p)
Assigns parent references for the specified parse so that they are consistent with the children references.
- Parameters:
p - The Parse whose parent references need to be assigned.
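
The effect of shared nodes on parent references can be sketched without OpenNLP. In the example below, `Node` is a hypothetical stand-in for a parse node (it is not the Parse class), and `SetParentsSketch.setParents` is an assumed recursive implementation written only to show why a later call can invalidate an earlier one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse node; not the OpenNLP Parse class.
final class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node parent; // may be stale when the node is shared between trees

    Node(String label, Node... kids) {
        this.label = label;
        for (Node k : kids) {
            children.add(k);
        }
    }
}

final class SetParentsSketch {
    /** Points every descendant's parent at the node that lists it as a child. */
    static void setParents(Node p) {
        for (Node child : p.children) {
            child.parent = p;
            setParents(child);
        }
    }

    public static void main(String[] args) {
        Node shared = new Node("NP");        // one node used by two trees
        Node first = new Node("S", shared);
        Node second = new Node("VP", shared);

        setParents(first);
        System.out.println(shared.parent.label); // prints S

        setParents(second);
        System.out.println(shared.parent.label); // prints VP
    }
}
```

After the second call the shared node points at the second tree, so the first tree's parent references are no longer consistent with its children.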
-
collapsePunctuation
public static Parse[] collapsePunctuation(Parse[] chunks, Set<String> punctSet)
Removes the punctuation from the specified set of chunks, adds it to the parses adjacent to the punctuation, and returns a new array of parses with the punctuation removed.
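
The behaviour described above can be approximated with a small stand-alone sketch. `Chunk` and `CollapsePunctuationSketch` are hypothetical stand-ins, not OpenNLP classes, and the choice to attach each punctuation chunk to the nearest non-punctuation chunk on both sides is an assumption about what "adjacent" means here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a chunk-level parse; not the OpenNLP Parse class.
final class Chunk {
    final String type;
    final String text;
    final List<Chunk> previousPunctuation = new ArrayList<>();
    final List<Chunk> nextPunctuation = new ArrayList<>();

    Chunk(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

final class CollapsePunctuationSketch {
    /** Returns the non-punctuation chunks, recording punctuation on their neighbours. */
    static Chunk[] collapse(Chunk[] chunks, Set<String> punctSet) {
        List<Chunk> kept = new ArrayList<>();
        int lastNonPunct = -1;
        for (int i = 0; i < chunks.length; i++) {
            if (!punctSet.contains(chunks[i].type)) {
                kept.add(chunks[i]);
                lastNonPunct = i;
                continue;
            }
            // Attach to the nearest non-punctuation chunk on the left, if any.
            if (lastNonPunct >= 0) {
                chunks[lastNonPunct].nextPunctuation.add(chunks[i]);
            }
            // Attach to the nearest non-punctuation chunk on the right, if any.
            int next = i + 1;
            while (next < chunks.length && punctSet.contains(chunks[next].type)) {
                next++;
            }
            if (next < chunks.length) {
                chunks[next].previousPunctuation.add(chunks[i]);
            }
        }
        return kept.toArray(new Chunk[0]);
    }

    public static void main(String[] args) {
        Chunk[] chunks = {
            new Chunk("NP", "the dog"), new Chunk(",", ","),
            new Chunk("VP", "barked"), new Chunk(".", ".")
        };
        Set<String> punct = new HashSet<>(Arrays.asList(",", "."));
        for (Chunk c : collapse(chunks, punct)) {
            System.out.println(c.type + " prev=" + c.previousPunctuation.size()
                + " next=" + c.nextPunctuation.size());
        }
    }
}
```

The input array is not resized; a new array holding only the non-punctuation chunks is returned, which matches the "new array of parses" wording above.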
-
parse
public Parse[] parse(Parse tokens, int numParses)
Description copied from interface: Parser
Returns the specified number of parses or fewer for the specified tokens.
Note: The nodes within the returned parses are shared with other parses and therefore their parent node references will not be consistent with their child node references. Parse.setParent(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents can invalidate the results of earlier calls.
-
buildDictionary
public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, TrainingParameters params) throws IOException
Creates an n-gram Dictionary from the specified data stream using the specified head rules and cut-off.
- Parameters:
data - The data stream of parses.
rules - The HeadRules for the parses.
params - The TrainingParameters, which can contain a cutoff, the minimum number of entries required for the n-gram to be saved as part of the Dictionary.
- Returns:
A Dictionary instance.
- Throws:
IOException
-
buildDictionary
public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, int cutoff) throws IOException
Creates an n-gram Dictionary from the specified data stream using HeadRules and the specified cut-off.
- Parameters:
data - The data stream of parses.
rules - The HeadRules for the parses.
cutoff - The minimum number of entries required for the n-gram to be saved as part of the dictionary.
- Returns:
A Dictionary instance.
- Throws:
IOException
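
The cutoff parameter can be illustrated in isolation. The sketch below counts word n-grams over plain token arrays and keeps those seen at least `cutoff` times. It is not how buildDictionary derives its n-grams (that method works on parses and head rules); `CutoffSketch`, `build` and `maxN` are hypothetical names introduced only for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CutoffSketch {
    /** Counts word n-grams of length 1..maxN and keeps those seen at least cutoff times. */
    static Set<String> build(List<String[]> sentences, int maxN, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] words : sentences) {
            for (int start = 0; start < words.length; start++) {
                StringBuilder gram = new StringBuilder();
                for (int n = 1; n <= maxN && start + n <= words.length; n++) {
                    if (n > 1) {
                        gram.append(' ');
                    }
                    gram.append(words[start + n - 1]);
                    counts.merge(gram.toString(), 1, Integer::sum);
                }
            }
        }
        Set<String> kept = new TreeSet<>(); // sorted, for stable output
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= cutoff) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> sentences = Arrays.asList(
            new String[] {"the", "dog", "barked"},
            new String[] {"the", "dog", "slept"});
        System.out.println(build(sentences, 2, 2)); // prints [dog, the, the dog]
    }
}
```

Raising the cutoff shrinks the dictionary: with a cutoff of 3 nothing in this tiny sample would be saved.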
-
-