Package opennlp.tools.formats.ad
Class ADChunkSampleStream
java.lang.Object
opennlp.tools.formats.ad.ADChunkSampleStream
- All Implemented Interfaces:
- AutoCloseable, ObjectStream<ChunkSample>
Parser for the Floresta Sintá(c)tica Árvores Deitadas corpus; its output is used for
 training the Portuguese Chunker.
 
 The heuristic used to extract chunks is based on the paper 'A Machine Learning
 Approach to Portuguese Clause Identification' (Eraldo Fernandes, Cicero
 Santos and Ruy Milidiú).
 
Data can be found on this web site.
 Information about the format:
 Susana Afonso.
 
   "Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica".
 
 12 de Fevereiro de 2006.
 
Detailed info about the NER tagset.
Note: Do not use this class, internal use only!
- 
Field Summary
- Fields: OTHER
- 
Constructor Summary
- ADChunkSampleStream(InputStreamFactory in, String charsetName): Instantiates an ADChunkSampleStream stream from an InputStreamFactory.
- ADChunkSampleStream(ObjectStream<String> lineStream): Instantiates an ADChunkSampleStream stream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
- 
Method Summary
- void close(): Closes the ObjectStream and releases all allocated resources.
- static String convertFuncTag(String t, boolean useCGTags)
- ChunkSample read(): Returns the next ObjectStream object.
- void reset(): Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly.
- void setEnd(int aEnd)
- void setStart(int aStart)
- 
Field Details
- 
OTHER
 
 
- 
- 
Constructor Details
- 
ADChunkSampleStream
Instantiates an ADChunkSampleStream stream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
- Parameters:
- lineStream - An ObjectStream<String> as input.
 
- 
ADChunkSampleStream
Instantiates an ADChunkSampleStream stream from an InputStreamFactory.
- Parameters:
- in - The InputStreamFactory for the corpus.
- charsetName - The charset to use for reading the corpus.
- Throws:
- IOException
 
 
- 
- 
Method Details
- 
read
Description copied from interface: ObjectStream
Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
- Specified by:
- read in interface ObjectStream<ChunkSample>
- Returns:
- The next object or null to signal that the stream is exhausted.
- Throws:
- IOException - Thrown if there is an error during reading.
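
The read-until-null contract described above can be sketched in isolation. The block below is a minimal illustration, not OpenNLP code: `ReadOnlyStream`, `ReadLoopSketch`, `fromList` and `drain` are hypothetical names introduced here, and the checked IOException of the real `read()` is left out so that the sketch runs without OpenNLP on the classpath.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the read() part of ObjectStream<T>; not the OpenNLP type.
// The real read() declares IOException, which is omitted here for brevity.
interface ReadOnlyStream<T> {
    T read(); // returns null once the stream is exhausted
}

class ReadLoopSketch {
    // Wraps a list so that each element is returned exactly once, then null.
    static <T> ReadOnlyStream<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Drains a stream by calling read() until it returns null.
    static <T> List<T> drain(ReadOnlyStream<T> stream) {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tags = drain(fromList(List.of("NP", "VP", "PP")));
        System.out.println(tags); // prints [NP, VP, PP]
    }
}
```

With the real class, the same loop shape would presumably apply to `ChunkSample` objects, with IOException handled by the caller.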
 
- 
convertFuncTag
- 
setStart
public void setStart(int aStart)
- 
setEnd
public void setEnd(int aEnd)
- 
reset
Description copied from interface: ObjectStream
Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly. This method can be used to re-read the stream if multiple passes over the objects are required.
The implementation of this method is optional.
- Specified by:
- reset in interface ObjectStream<ChunkSample>
- Throws:
- IOException - Thrown if there is an error during resetting the stream.
- UnsupportedOperationException - Thrown if reset() is not supported. By default, this is the case.
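
As a rough illustration of the reset semantics above (a second pass repeats the first, and reset is unsupported unless an implementation opts in), here is a small self-contained sketch. `ResettableStream`, `ListBackedStream` and `ResetSketch` are hypothetical stand-ins written for this example, not OpenNLP types, and checked IOExceptions are again omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ObjectStream<T>; checked IOExceptions omitted for brevity.
interface ResettableStream<T> {
    T read();

    // Mirrors the documented default: reset is unsupported unless overridden.
    default void reset() {
        throw new UnsupportedOperationException("reset is not supported on this stream");
    }
}

// List-backed stream that supports reset() by rewinding an index.
class ListBackedStream<T> implements ResettableStream<T> {
    private final List<T> items;
    private int next = 0;

    ListBackedStream(List<T> items) {
        this.items = List.copyOf(items);
    }

    @Override
    public T read() {
        return next < items.size() ? items.get(next++) : null;
    }

    @Override
    public void reset() {
        next = 0;
    }
}

class ResetSketch {
    // Reads until null and collects everything that was returned.
    static <T> List<T> drain(ResettableStream<T> stream) {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            out.add(item);
        }
        return out;
    }

    // Runs two passes separated by reset() and reports whether they match.
    static <T> boolean twoPassesMatch(ResettableStream<T> stream) {
        List<T> first = drain(stream);
        stream.reset();
        List<T> second = drain(stream);
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(twoPassesMatch(new ListBackedStream<>(List.of("B-NP", "I-NP", "O")))); // prints true
    }
}
```

Callers that need multiple passes may want to catch UnsupportedOperationException, since a given stream is not guaranteed to implement reset.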
 
- 
close
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, calling ObjectStream.read() or ObjectStream.reset() is not allowed.
- Specified by:
- close in interface AutoCloseable
- Specified by:
- close in interface ObjectStream<ChunkSample>
- Throws:
- IOException - Thrown if there is an error during closing the stream.
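
Because the stream is AutoCloseable, a try-with-resources block is one natural way to make sure close is called. The sketch below uses a hypothetical `ClosableListStream` stand-in rather than the OpenNLP class. Throwing IllegalStateException on a read after close is an assumption of this sketch; the documentation above only states that such calls are not allowed.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a closable object stream; not the OpenNLP type.
class ClosableListStream implements AutoCloseable {
    private final Iterator<String> it;
    private boolean closed = false;

    ClosableListStream(List<String> items) {
        this.it = List.copyOf(items).iterator();
    }

    // Assumption of this sketch: reading after close fails fast.
    String read() {
        if (closed) {
            throw new IllegalStateException("stream is closed");
        }
        return it.hasNext() ? it.next() : null;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

class CloseSketch {
    // Counts the remaining objects; try-with-resources closes the stream afterwards.
    static int countAndClose(ClosableListStream stream) {
        int count = 0;
        try (ClosableListStream s = stream) {
            while (s.read() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ClosableListStream stream = new ClosableListStream(List.of("a", "b"));
        System.out.println(countAndClose(stream)); // prints 2
        System.out.println(stream.isClosed());     // prints true
    }
}
```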
 
 
-