| Interface | Description |
|---|---|
| Detokenizer | A Detokenizer merges tokens back to their untokenized representation. |
| TokenContextGenerator | Interface for TokenizerME context generators. |
| Tokenizer | The interface for tokenizers, which segment a string into its tokens. |
| TokenizerEvaluationMonitor | |
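
All tokenizer implementations in this package share the Tokenizer interface, which offers tokenize() for the token strings and tokenizePos() for their character-offset Spans. A minimal sketch using the two rule-based singletons (the sample text is illustrative):

```java
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.tokenize.Tokenizer;
import opennlp.tools.tokenize.WhitespaceTokenizer;
import opennlp.tools.util.Span;

public class TokenizerInterfaceDemo {

    public static void main(String[] args) {
        String text = "Hello, world!";

        // SimpleTokenizer splits on character class changes:
        // ["Hello", ",", "world", "!"]
        Tokenizer simple = SimpleTokenizer.INSTANCE;
        System.out.println(String.join(" ", simple.tokenize(text)));

        // WhitespaceTokenizer splits on white space only; tokenizePos()
        // returns character-offset Spans instead of the token strings.
        Tokenizer whitespace = WhitespaceTokenizer.INSTANCE;
        Span[] spans = whitespace.tokenizePos(text); // [0..6), [7..13)
        for (Span span : spans) {
            System.out.println(span.getCoveredText(text));
        }
    }
}
```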
| Class | Description |
|---|---|
| DefaultTokenContextGenerator | Generates events for maxent decisions for tokenization. |
| DetokenizationDictionary | |
| DetokenizerEvaluator | The DetokenizerEvaluator measures the performance of the given Detokenizer with the provided reference TokenSamples. |
| DictionaryDetokenizer | A rule-based detokenizer. |
| SimpleTokenizer | Performs tokenization using character classes. |
| TokenizerCrossValidator | |
| TokenizerEvaluator | The TokenizerEvaluator measures the performance of the given Tokenizer with the provided reference TokenSamples. |
| TokenizerFactory | The factory that provides Tokenizer default implementations and resources. |
| TokenizerME | A Tokenizer for converting raw text into separated tokens. |
| TokenizerModel | The TokenizerModel is the model used by a learnable Tokenizer. |
| TokenizerStream | The TokenizerStream uses a tokenizer to tokenize the input string and output TokenSamples. |
| TokenSample | A TokenSample is text with token spans. |
| TokenSampleStream | This class is a stream filter which reads in string-encoded samples and creates TokenSamples out of them. |
| TokSpanEventStream | This class reads TokenSamples from the given Iterator and converts them into Events which can be used by the maxent library for training. |
| WhitespaceTokenizer | This tokenizer uses white spaces to tokenize the input text. |
| WhitespaceTokenStream | This stream formats TokenSamples into whitespace-separated token strings. |
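
For the learnable tokenizer, a TokenizerModel is typically deserialized from a file and wrapped in a TokenizerME. A sketch of that usage; the path "en-token.bin" is an assumed pre-trained model file, not something this page provides:

```java
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

public class TokenizerMEDemo {

    public static void main(String[] args) throws Exception {
        // Load an assumed pre-trained model file; substitute your own.
        try (InputStream modelIn = new FileInputStream("en-token.bin")) {
            TokenizerModel model = new TokenizerModel(modelIn);
            TokenizerME tokenizer = new TokenizerME(model);

            String[] tokens = tokenizer.tokenize("An input sample sentence.");

            // The maxent tokenizer also exposes the probability of each
            // tokenization decision made for the last input.
            double[] probs = tokenizer.getTokenProbabilities();

            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + " " + probs[i]);
            }
        }
    }
}
```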
| Enum | Description |
|---|---|
| DetokenizationDictionary.Operation | |
| Detokenizer.DetokenizationOperation | This enum contains an operation for every token to merge the tokens together to their detokenized form. |
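
These two enums drive rule-based detokenization: DetokenizationDictionary.Operation is the per-token rule stored in the dictionary, while Detokenizer.DetokenizationOperation is the per-token decision a Detokenizer emits. A small sketch with a hand-built dictionary; the token rules shown are illustrative, not a shipped resource:

```java
import opennlp.tools.tokenize.DetokenizationDictionary;
import opennlp.tools.tokenize.DetokenizationDictionary.Operation;
import opennlp.tools.tokenize.Detokenizer;
import opennlp.tools.tokenize.DictionaryDetokenizer;

public class DetokenizerDemo {

    public static void main(String[] args) {
        // A tiny hand-built rule dictionary: attach '.' and ',' to the
        // token on their left; quotes alternate right/left so the opening
        // quote attaches forward and the closing quote attaches backward.
        DetokenizationDictionary dict = new DetokenizationDictionary(
                new String[] {".", ",", "\""},
                new Operation[] {Operation.MOVE_LEFT, Operation.MOVE_LEFT,
                        Operation.RIGHT_LEFT_MATCHING});

        Detokenizer detokenizer = new DictionaryDetokenizer(dict);

        String[] tokens = {"He", "said", ",", "\"", "Hello", "\"", "."};

        // Passing null as the split marker yields the plain detokenized
        // text; a non-null marker would be inserted wherever tokens merge.
        String text = detokenizer.detokenize(tokens, null);
        System.out.println(text); // He said, "Hello".
    }
}
```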
All tokenizers implement the Tokenizer interface. The current implementations are the learnable TokenizerME, the WhitespaceTokenizer, and the SimpleTokenizer, which is a character class tokenizer.
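
Training ties several of the classes above together: TokenSampleStream parses string-encoded samples into TokenSamples, and TokenizerME.train() builds a TokenizerModel from them via a TokenizerFactory. A sketch under assumptions: "en-token.train" is a hypothetical training file with one sentence per line and <SPLIT> markers at token boundaries that lack whitespace, and "en" is the assumed language code:

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import opennlp.tools.tokenize.TokenSample;
import opennlp.tools.tokenize.TokenSampleStream;
import opennlp.tools.tokenize.TokenizerFactory;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public class TokenizerTrainingDemo {

    public static void main(String[] args) throws IOException {
        // Assumed training data file; one string-encoded sample per line.
        ObjectStream<String> lines = new PlainTextByLineStream(
                new MarkableFileInputStreamFactory(new File("en-token.train")),
                StandardCharsets.UTF_8);

        try (ObjectStream<TokenSample> samples = new TokenSampleStream(lines)) {
            // No abbreviation dictionary and no custom alphanumeric pattern;
            // alphanumeric optimization is enabled.
            TokenizerFactory factory = new TokenizerFactory("en", null, true, null);

            TokenizerModel model =
                    TokenizerME.train(samples, factory, TrainingParameters.defaultParams());
            // The resulting model can be serialized and later loaded by TokenizerME.
        }
    }
}
```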