public class LeipzigDoccatSampleStream extends FilterObjectStream<String,DocumentSample>
The input text is tokenized with the SimpleTokenizer. Input text classified
by the language model must also be tokenized with the SimpleTokenizer so that
tokenization is exactly the same during training and testing.
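The sketch below illustrates why the tokenizer has to match. It does not use OpenNLP itself: `CharClassTokenizerSketch` and its `tokenize` method are hypothetical names, and the character-class splitting is only a simplified approximation of what SimpleTokenizer does. It shows that a character-class tokenizer and a plain whitespace split produce different tokens for the same sentence, so a model trained on one would see unfamiliar features from the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a character-class tokenizer; NOT the OpenNLP implementation.
public class CharClassTokenizerSketch {

    // Assumed character classes: 0 = whitespace, 1 = letter, 2 = digit, 3 = other.
    private static int charClass(char c) {
        if (Character.isWhitespace(c)) return 0;
        if (Character.isLetter(c)) return 1;
        if (Character.isDigit(c)) return 2;
        return 3;
    }

    // Starts a new token whenever the character class changes; whitespace is dropped.
    public static String[] tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1;      // start index of the current token, -1 if none is open
        int prevClass = 0;   // class of the previous character
        for (int i = 0; i < text.length(); i++) {
            int cls = charClass(text.charAt(i));
            if (cls == 0) {
                // Whitespace closes any open token.
                if (start != -1) {
                    tokens.add(text.substring(start, i));
                    start = -1;
                }
            } else if (start == -1) {
                start = i;
            } else if (cls != prevClass) {
                // Class changed inside a run of non-whitespace: split here.
                tokens.add(text.substring(start, i));
                start = i;
            }
            prevClass = cls;
        }
        if (start != -1) {
            tokens.add(text.substring(start));
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String sentence = "It's 9am, isn't it?";
        // Same sentence, two tokenizations: the feature sets differ.
        System.out.println(String.join(" | ", tokenize(sentence)));
        System.out.println(String.join(" | ", sentence.split("\\s+")));
    }
}
```

With OpenNLP on the classpath, the equivalent step would be to pass the text through the same SimpleTokenizer before classification rather than a custom split.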
Modifier and Type | Method and Description |
---|---|
DocumentSample | read() Returns the next object. |
Methods inherited from class FilterObjectStream: close, reset
public DocumentSample read() throws IOException
Specified by: read in interface ObjectStream<DocumentSample>
Throws: IOException - if there is an error during reading

Copyright © 2015 The Apache Software Foundation. All rights reserved.