Package | Description |
---|---|
opennlp.tools.doccat | Package for classifying a document into a category. |
opennlp.tools.formats.brat | |
opennlp.tools.formats.muc | |
opennlp.tools.tokenize | Contains classes related to finding tokens or words in a string. |
opennlp.tools.util.featuregen | This package contains classes for generating sequence features. |
Modifier and Type | Method and Description |
---|---|
Tokenizer | DoccatFactory.getTokenizer() |
Modifier and Type | Method and Description |
---|---|
static DoccatFactory | DoccatFactory.create(String subclassName, Tokenizer tokenizer, FeatureGenerator[] featureGenerators) |
protected void | DoccatFactory.init(Tokenizer tokenizer, FeatureGenerator[] featureGenerators) |
void | DoccatFactory.setTokenizer(Tokenizer tokenizer) |
Constructor and Description |
---|
DoccatFactory(Tokenizer tokenizer, FeatureGenerator[] featureGenerators): Creates a DoccatFactory. |
Constructor and Description |
---|
BratNameSampleStream(SentenceDetector sentDetector, Tokenizer tokenizer, ObjectStream<BratDocument> samples) |
Constructor and Description |
---|
MucNameContentHandler(Tokenizer tokenizer, List<NameSample> storedSamples) |
MucNameSampleStream(Tokenizer tokenizer, ObjectStream<String> samples) |
Modifier and Type | Class and Description |
---|---|
class | SimpleTokenizer: Performs tokenization using character classes. |
class | TokenizerME: A Tokenizer for converting raw text into separated tokens. |
class | WhitespaceTokenizer: This tokenizer uses white spaces to tokenize the input text. |
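
The two rule-based strategies described above (splitting on character classes versus splitting on white space) can be contrasted with a small, self-contained sketch. It uses only the Java standard library and is not OpenNLP's implementation: the class `TokenizerSketch`, the `CharClass` enum, and both method names are hypothetical, and the exact splitting rules of `SimpleTokenizer` and `WhitespaceTokenizer` may differ from what is shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not part of the OpenNLP API.
public class TokenizerSketch {

    // Character classes assumed for this sketch.
    enum CharClass { WHITESPACE, LETTER, DIGIT, OTHER }

    static CharClass classify(char c) {
        if (Character.isWhitespace(c)) return CharClass.WHITESPACE;
        if (Character.isLetter(c)) return CharClass.LETTER;
        if (Character.isDigit(c)) return CharClass.DIGIT;
        return CharClass.OTHER;
    }

    // Starts a new token whenever the character class changes;
    // whitespace separates tokens and is never emitted.
    static List<String> charClassTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current token, -1 between tokens
        CharClass current = CharClass.WHITESPACE;
        for (int i = 0; i < text.length(); i++) {
            CharClass cc = classify(text.charAt(i));
            if (cc != current) {
                if (start >= 0) {
                    tokens.add(text.substring(start, i));
                }
                start = (cc == CharClass.WHITESPACE) ? -1 : i;
                current = cc;
            }
        }
        if (start >= 0) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    // Splits on runs of whitespace only; punctuation stays attached to words.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hi, 42!";
        System.out.println(charClassTokenize(text));
        System.out.println(whitespaceTokenize(text));
    }
}
```

For the input `Hi, 42!`, the character-class version of this sketch produces four tokens (`Hi`, `,`, `42`, `!`), while the whitespace version produces two (`Hi,` and `42!`). Character-class splitting needs no trained model but separates punctuation aggressively; whitespace splitting suits text that is already tokenized.
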
Constructor and Description |
---|
TokenizerEvaluator(Tokenizer tokenizer, TokenizerEvaluationMonitor... listeners): Initializes the current instance with the given Tokenizer. |
TokenizerStream(Tokenizer tokenizer, ObjectStream<String> input) |
Constructor and Description |
---|
TokenPatternFeatureGenerator(Tokenizer supportTokenizer): Initializes a new instance. |
Copyright © 2015 The Apache Software Foundation. All rights reserved.