Package | Description |
---|---|
opennlp.tools.formats | Experimental package related to converting various corpora to OpenNLP format. |
opennlp.tools.formats.brat | |
opennlp.tools.formats.muc | |
opennlp.tools.tokenize | Contains classes related to finding tokens or words in a string. |
opennlp.tools.util.featuregen | This package contains classes for generating sequence features. |
Constructor | Description |
---|---|
LeipzigDoccatSampleStream(String language, int sentencesPerDocument, Tokenizer tokenizer, InputStreamFactory in) | Creates a new LeipzigDoccatSampleStream with the specified parameters. |
Constructor | Description |
---|---|
BratNameSampleStream(SentenceDetector sentDetector, Tokenizer tokenizer, ObjectStream<BratDocument> samples) | |
Constructor | Description |
---|---|
MucNameContentHandler(Tokenizer tokenizer, List<NameSample> storedSamples) | |
MucNameSampleStream(Tokenizer tokenizer, ObjectStream<String> samples) | |
Modifier and Type | Class | Description |
---|---|---|
class | SimpleTokenizer | Performs tokenization using character classes. |
class | TokenizerME | A Tokenizer for converting raw text into separated tokens. |
class | WhitespaceTokenizer | This tokenizer uses white spaces to tokenize the input text. |
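
The classes listed above differ mainly in where they place token boundaries. The sketch below contrasts whitespace-based splitting with character-class-based splitting. It is a self-contained stand-in, not OpenNLP code: `TokenizerSketch`, its nested `Tokenizer` interface, `WhitespaceSplit`, `CharClassSplit`, and `countTokens` are hypothetical names introduced for illustration, and the splitting rules are simplified assumptions that may not match the library's exact behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; these are NOT the OpenNLP classes.
final class TokenizerSketch {

    /** Local stand-in interface with a single tokenize method (assumed shape). */
    interface Tokenizer {
        String[] tokenize(String text);
    }

    /** Splits on runs of whitespace, in the spirit of a whitespace tokenizer. */
    static final class WhitespaceSplit implements Tokenizer {
        @Override
        public String[] tokenize(String text) {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }
    }

    /** Starts a new token whenever the character class changes; whitespace is dropped. */
    static final class CharClassSplit implements Tokenizer {
        // 0 = whitespace, 1 = letter, 2 = digit, 3 = anything else (simplified classes)
        private static int classOf(char c) {
            if (Character.isWhitespace(c)) return 0;
            if (Character.isLetter(c)) return 1;
            if (Character.isDigit(c)) return 2;
            return 3;
        }

        @Override
        public String[] tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            int start = 0;
            int prev = 0; // class of the previous character; starts as whitespace
            for (int i = 0; i < text.length(); i++) {
                int cls = classOf(text.charAt(i));
                if (cls != prev) {
                    if (prev != 0) {
                        tokens.add(text.substring(start, i)); // close the open token
                    }
                    start = i;
                    prev = cls;
                }
            }
            if (prev != 0) {
                tokens.add(text.substring(start)); // close the final token
            }
            return tokens.toArray(new String[0]);
        }
    }

    /** A consumer that accepts any Tokenizer, like the constructors on this page. */
    static int countTokens(Tokenizer tokenizer, List<String> lines) {
        int total = 0;
        for (String line : lines) {
            total += tokenizer.tokenize(line).length;
        }
        return total;
    }

    public static void main(String[] args) {
        String text = "Hello, world 42!";
        System.out.println(Arrays.toString(new WhitespaceSplit().tokenize(text)));
        System.out.println(Arrays.toString(new CharClassSplit().tokenize(text)));
        System.out.println(countTokens(new CharClassSplit(), Arrays.asList(text, "a b")));
    }
}
```

Because `countTokens` depends only on the interface, either implementation can be swapped in without changing the caller. The constructors on this page take a `Tokenizer` parameter in the same way.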
Constructor | Description |
---|---|
TokenizerEvaluator(Tokenizer tokenizer, TokenizerEvaluationMonitor... listeners) | Initializes the current instance with the given Tokenizer. |
TokenizerStream(Tokenizer tokenizer, ObjectStream<String> input) | |
Constructor | Description |
---|---|
TokenPatternFeatureGenerator(Tokenizer supportTokenizer) | Initializes a new instance. |
Copyright © 2017 The Apache Software Foundation. All rights reserved.