public abstract class AbstractBottomUpParser extends Object implements Parser
Note:
The nodes within the returned parses are shared with other parses, and therefore their parent node references will not be consistent with their child node references. setParents can be used to make the parents consistent with a particular parse, but subsequent calls to setParents can invalidate the results of earlier calls.
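
The sharing described in the note can be illustrated with a minimal, self-contained sketch. `Node` below is a hypothetical stand-in and not the OpenNLP `Parse` class, and its `setParents` is only an assumption about what such a fix-up pass might look like: it walks one tree top-down and points each child back at the parent it was reached from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse node; not the OpenNLP Parse class.
class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node parent; // may be stale when the node is shared between trees

    Node(String label, Node... kids) {
        this.label = label;
        for (Node k : kids) {
            children.add(k);
        }
    }

    // Walks the tree top-down and points every child back at its actual parent.
    static void setParents(Node root) {
        for (Node child : root.children) {
            child.parent = root;
            setParents(child);
        }
    }
}

class SharedNodesDemo {
    public static void main(String[] args) {
        Node np = new Node("NP");           // shared by both candidate parses
        Node first = new Node("S", np);
        Node second = new Node("FRAG", np);

        Node.setParents(first);
        System.out.println(np.parent.label); // S
        Node.setParents(second);
        System.out.println(np.parent.label); // FRAG: the fix-up done for `first` no longer holds
    }
}
```

In this sketch, a caller that needs consistent parent references for one parse would run the fix-up immediately before using that parse, and again after touching any other parse that shares nodes with it.
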
Modifier and Type | Field and Description |
---|---|
static String | `COMPLETE` Outcome used when a constituent is complete. |
static String | `CONT` Prefix for outcomes continuing a constituent. |
static double | `defaultAdvancePercentage` The default amount of probability mass required of advanced outcomes. |
static int | `defaultBeamSize` The default beam size used if no beam size is given. |
static String | `INC_NODE` The label for the top of an incomplete node. |
static String | `INCOMPLETE` Outcome used when a constituent is incomplete. |
static String | `OTHER` Outcome for a token which is not contained in a basal constituent. |
static String | `START` Prefix for outcomes starting a constituent. |
static String | `TOK_NODE` The label for a token node. |
static String | `TOP_NODE` The label for the top node. |
static Integer | `ZERO` The integer 0. |

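
To make the roles of the `START`, `CONT` and `OTHER` outcomes concrete, the following sketch groups a sequence of per-token outcomes into constituent spans. The constant values (`"S-"`, `"C-"`, `"O"`), the `OutcomeSketch` class and the `spans` helper are assumptions made for illustration; this page does not list the actual values, and the parser's own decoding may differ.

```java
import java.util.ArrayList;
import java.util.List;

class OutcomeSketch {
    // Assumed values for illustration only; this page does not state the real ones.
    static final String START = "S-";
    static final String CONT = "C-";
    static final String OTHER = "O";

    // Groups per-token outcomes into "TYPE[begin,end)" spans.
    static List<String> spans(List<String> outcomes) {
        List<String> result = new ArrayList<>();
        String type = null; // type of the constituent currently open, if any
        int begin = 0;
        for (int i = 0; i <= outcomes.size(); i++) {
            // A trailing OTHER acts as a sentinel that closes any open constituent.
            String o = i < outcomes.size() ? outcomes.get(i) : OTHER;
            boolean continues = type != null && o.equals(CONT + type);
            if (!continues) {
                if (type != null) {
                    result.add(type + "[" + begin + "," + i + ")");
                }
                type = o.startsWith(START) ? o.substring(START.length()) : null;
                begin = i;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(spans(List.of("S-NP", "C-NP", "O", "S-VP")));
        // prints [NP[0,2), VP[3,4)]
    }
}
```
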
Constructor and Description |
---|
`AbstractBottomUpParser(POSTagger tagger, Chunker chunker, HeadRules headRules, int beamSize, double advancePercentage)` |

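
The `beamSize` and `advancePercentage` constructor arguments bound how many alternatives are carried forward. One plausible reading, sketched below under that assumption, is that the most probable outcomes are advanced until their cumulative probability reaches `advancePercentage`, with `beamSize` as a hard cap. The `AdvanceSketch` class, the `advance` helper, the outcome labels and the values 20 and 0.95 are all illustrative and are not taken from this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class AdvanceSketch {
    // Keeps the most probable outcomes until their cumulative probability
    // reaches advancePercentage, never keeping more than beamSize of them.
    static List<String> advance(Map<String, Double> outcomes, int beamSize, double advancePercentage) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(outcomes.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        List<String> kept = new ArrayList<>();
        double mass = 0.0;
        for (Map.Entry<String, Double> e : sorted) {
            if (kept.size() >= beamSize || mass >= advancePercentage) {
                break;
            }
            kept.add(e.getKey());
            mass += e.getValue();
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical outcome distribution for a single decision.
        Map<String, Double> outcomes = Map.of("S-NP", 0.70, "S-VP", 0.20, "O", 0.08, "S-PP", 0.02);
        System.out.println(advance(outcomes, 20, 0.95)); // [S-NP, S-VP, O]
        System.out.println(advance(outcomes, 2, 0.95));  // [S-NP, S-VP]
    }
}
```

A lower `advancePercentage` or a smaller `beamSize` prunes more aggressively, trading accuracy for speed.
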
Modifier and Type | Method and Description |
---|---|
static Dictionary | `buildDictionary(ObjectStream<Parse> data, HeadRules rules, int cutoff)` Creates an n-gram dictionary from the specified data stream using the specified head rules and cut-off. |
static Dictionary | `buildDictionary(ObjectStream<Parse> data, HeadRules rules, TrainingParameters params)` Creates an n-gram dictionary from the specified data stream using the specified head rules and the cut-off taken from the specified training parameters. |
static Parse[] | `collapsePunctuation(Parse[] chunks, Set<String> punctSet)` Removes the punctuation from the specified set of chunks, adds it to the parses adjacent to the punctuation, and returns a new array of parses with the punctuation removed. |
Parse | `parse(Parse tokens)` Returns a parse for the specified parse of tokens. |
Parse[] | `parse(Parse tokens, int numParses)` Returns the specified number of parses or fewer for the specified tokens. |
void | `setErrorReporting(boolean errorReporting)` Specifies whether the parser should report when it was unable to find a parse for a particular sentence. |
static void | `setParents(Parse p)` Assigns parent references for the specified parse so that they are consistent with the children references. |

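
The behaviour described for `collapsePunctuation` can be sketched without the library as follows. `Chunk`, its two punctuation lists and the `collapse` helper are hypothetical stand-ins, not the OpenNLP types; the sketch assumes that each punctuation chunk is recorded on the nearest non-punctuation chunk on either side and is then dropped from the returned sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a chunk-level parse; not the OpenNLP Parse class.
class Chunk {
    final String type;
    final List<String> previousPunctuation = new ArrayList<>();
    final List<String> nextPunctuation = new ArrayList<>();

    Chunk(String type) {
        this.type = type;
    }
}

class CollapseSketch {
    // Returns the non-punctuation chunks, with adjacent punctuation recorded on them.
    static List<Chunk> collapse(List<Chunk> chunks, Set<String> punctSet) {
        List<Chunk> kept = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // punctuation seen since the last kept chunk
        for (Chunk c : chunks) {
            if (punctSet.contains(c.type)) {
                if (!kept.isEmpty()) {
                    kept.get(kept.size() - 1).nextPunctuation.add(c.type);
                }
                pending.add(c.type);
            } else {
                c.previousPunctuation.addAll(pending);
                pending.clear();
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(new Chunk("NP"), new Chunk(","), new Chunk("VP"), new Chunk("."));
        for (Chunk c : collapse(chunks, Set.of(",", "."))) {
            System.out.println(c.previousPunctuation + " " + c.type + " " + c.nextPunctuation);
        }
        // prints:
        // [] NP [,]
        // [,] VP [.]
    }
}
```
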
public static final int defaultBeamSize
public static final double defaultAdvancePercentage
public static final String TOP_NODE
public static final String INC_NODE
public static final String TOK_NODE
public static final Integer ZERO
public static final String START
public static final String CONT
public static final String OTHER
public static final String COMPLETE
public static final String INCOMPLETE
public void setErrorReporting(boolean errorReporting)
errorReporting
- If true then un-parsed sentences are reported, false otherwise.
public static void setParents(Parse p)
p
- The parse whose parent references need to be assigned.
public static Parse[] collapsePunctuation(Parse[] chunks, Set<String> punctSet)
chunks
- A set of parses.
punctSet
- The set of punctuation which is to be removed.
public Parse[] parse(Parse tokens, int numParses)
Specified by: parse in interface Parser
Parse.setParent(Parse) can be used to make the parents consistent with a particular parse, but subsequent calls to setParents can invalidate the results of earlier calls.
public Parse parse(Parse tokens)
Specified by: parse in interface Parser
public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, TrainingParameters params) throws IOException
data
- The data stream of parses.
rules
- The head rules for the parses.
params
- Can contain a cutoff, the minimum number of entries required for the n-gram to be saved as part of the dictionary.
Throws: IOException
public static Dictionary buildDictionary(ObjectStream<Parse> data, HeadRules rules, int cutoff) throws IOException
data
- The data stream of parses.
rules
- The head rules for the parses.
cutoff
- The minimum number of entries required for the n-gram to be saved as part of the dictionary.
Throws: IOException
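
The effect of the `cutoff` parameter can be shown with a simplified, library-free sketch. It counts plain token n-grams and keeps those seen at least `cutoff` times; the `CutoffSketch` class, the `build` helper and the `maxN` parameter are hypothetical. The real `buildDictionary` works on parses and head rules, so which n-grams it extracts is not modelled here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class CutoffSketch {
    // Counts every n-gram of length 1..maxN and keeps those seen at least `cutoff` times.
    static Set<String> build(List<List<String>> sentences, int maxN, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> tokens : sentences) {
            for (int n = 1; n <= maxN; n++) {
                for (int i = 0; i + n <= tokens.size(); i++) {
                    String ngram = String.join(" ", tokens.subList(i, i + n));
                    counts.merge(ngram, 1, Integer::sum);
                }
            }
        }
        Set<String> kept = new TreeSet<>(); // sorted for stable output
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= cutoff) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = List.of(
            List.of("the", "dog", "barks"),
            List.of("the", "dog", "sleeps"));
        System.out.println(build(sentences, 2, 2)); // [dog, the, the dog]
    }
}
```

Raising the cutoff shrinks the dictionary by discarding rare n-grams, which in this sketch are the ones that occur in only one sentence.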
Copyright © 2018 The Apache Software Foundation. All rights reserved.