Package opennlp.tools.formats.ad
Class ADChunkSampleStream
java.lang.Object
opennlp.tools.formats.ad.ADChunkSampleStream
- All Implemented Interfaces:
AutoCloseable, ObjectStream<ChunkSample>
Parser for the Floresta Sintá(c)tica Árvores Deitadas corpus; its output is used
for training the Portuguese chunker.
The heuristic used to extract chunks is based on the paper 'A Machine Learning
Approach to Portuguese Clause Identification' (Eraldo Fernandes, Cicero
Santos and Ruy Milidiú).
Data can be found on this web site.
Information about the format:
Susana Afonso.
"Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica".
12 de Fevereiro de 2006.
Detailed info about the NER tagset.
Note: Do not use this class, internal use only!
-
Field Summary
Fields -
Constructor Summary
Constructors
Constructor / Description
ADChunkSampleStream(InputStreamFactory in, String charsetName)
    Instantiates an ADChunkSampleStream from an InputStreamFactory.
ADChunkSampleStream(ObjectStream<String> lineStream)
    Instantiates an ADChunkSampleStream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
-
Method Summary
Modifier and Type / Method / Description
void close()
    Closes the ObjectStream and releases all allocated resources.
static String convertFuncTag(String t, boolean useCGTags)
ChunkSample read()
    Returns the next ObjectStream object.
void reset()
    Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly.
void setEnd(int aEnd)
void setStart(int aStart)
-
Field Details
-
OTHER
-
-
Constructor Details
-
ADChunkSampleStream
Instantiates an ADChunkSampleStream from an ObjectStream<String>, which could be a PlainTextByLineStream object.
- Parameters:
lineStream - An ObjectStream<String> as input.
-
ADChunkSampleStream
Instantiates an ADChunkSampleStream from an InputStreamFactory.
- Parameters:
in - The InputStreamFactory for the corpus.
charsetName - The charset to use for reading the corpus.
- Throws:
IOException
-
-
Method Details
-
read
Description copied from interface: ObjectStream
Returns the next ObjectStream object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
- Specified by:
read in interface ObjectStream<ChunkSample>
- Returns:
The next object, or null to signal that the stream is exhausted.
- Throws:
IOException - Thrown if there is an error during reading.
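The read-until-null contract described above can be sketched as a simple drain loop. This is only an illustration: SampleSource, fromList and drain are hypothetical names, and SampleSource is a stand-in for ObjectStream so that the sketch runs without OpenNLP on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ReadLoopSketch {

    /** Hypothetical stand-in for ObjectStream<T>; only read() is modelled. */
    interface SampleSource<T> {
        T read() throws IOException;
    }

    /** Wraps a list so that read() yields each element once, then null. */
    static <T> SampleSource<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drains the source by calling read() until it returns null. */
    static <T> List<T> drain(SampleSource<T> source) throws IOException {
        List<T> result = new ArrayList<>();
        T item;
        while ((item = source.read()) != null) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        SampleSource<String> source = fromList(List.of("NP", "VP", "PP"));
        System.out.println(drain(source));   // every element, in order
        System.out.println(source.read());   // exhausted: read() keeps returning null
    }
}
```

Because null is the end-of-stream signal, a source following this contract cannot carry null as a regular element.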
-
convertFuncTag
-
setStart
public void setStart(int aStart) -
setEnd
public void setEnd(int aEnd) -
reset
Description copied from interface: ObjectStream
Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly. This method can be used to re-read the stream if multiple passes over the objects are required.
The implementation of this method is optional.
- Specified by:
reset in interface ObjectStream<ChunkSample>
- Throws:
IOException - Thrown if there is an error during resetting the stream.
UnsupportedOperationException - Thrown if reset() is not supported. By default, this is the case.
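A minimal sketch of multi-pass reading with an optional reset(), under the contract described above. ResettableSource, ListSource and passes are hypothetical names introduced for illustration; the stand-in interface mirrors the documented default (reset() throwing UnsupportedOperationException) and is not the OpenNLP type itself.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ResetSketch {

    /** Hypothetical stand-in for ObjectStream<T>; reset() is optional. */
    interface ResettableSource<T> {
        T read() throws IOException;

        /** Unsupported unless an implementation overrides it. */
        default void reset() throws IOException {
            throw new UnsupportedOperationException("reset() is not supported");
        }
    }

    /** List-backed source that supports reset() by rewinding an index. */
    static final class ListSource<T> implements ResettableSource<T> {
        private final List<T> items;
        private int next = 0;

        ListSource(List<T> items) {
            this.items = List.copyOf(items);
        }

        @Override
        public T read() {
            return next < items.size() ? items.get(next++) : null;
        }

        @Override
        public void reset() {
            next = 0;
        }
    }

    /** Reads the whole source 'count' times, resetting between passes. */
    static <T> List<List<T>> passes(ResettableSource<T> source, int count) throws IOException {
        List<List<T>> all = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                source.reset();
            }
            List<T> pass = new ArrayList<>();
            for (T item = source.read(); item != null; item = source.read()) {
                pass.add(item);
            }
            all.add(pass);
        }
        return all;
    }

    public static void main(String[] args) throws IOException {
        // Both passes see the same sequence because reset() rewinds the source.
        System.out.println(passes(new ListSource<>(List.of("a", "b")), 2));
    }
}
```

Callers that need several passes over a source of unknown type may want to catch UnsupportedOperationException and fall back to re-creating the stream.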
-
close
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close() has been called, calling ObjectStream.read() or ObjectStream.reset() is not allowed.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface ObjectStream<ChunkSample>
- Throws:
IOException - Thrown if there is an error during closing the stream.
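Since the stream is AutoCloseable, a try-with-resources block is one way to guarantee that close() runs. The sketch below uses hypothetical stand-ins (ClosableSource, CountingSource, countAndClose) rather than the OpenNLP types. Throwing IllegalStateException on a read after close is an assumption of this sketch; the text above only states that such calls are not allowed.

```java
import java.io.IOException;

final class CloseSketch {

    /** Hypothetical stand-in for ObjectStream<T>, reduced to read() and close(). */
    interface ClosableSource<T> extends AutoCloseable {
        T read() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** Toy source yielding 0..limit-1; records whether close() ran. */
    static final class CountingSource implements ClosableSource<Integer> {
        private final int limit;
        private int next = 0;
        private boolean closed = false;

        CountingSource(int limit) {
            this.limit = limit;
        }

        @Override
        public Integer read() {
            // Rejecting reads after close() is a choice made for this sketch only.
            if (closed) {
                throw new IllegalStateException("stream is closed");
            }
            return next < limit ? Integer.valueOf(next++) : null;
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Counts the remaining samples; the source is closed even if read() fails. */
    static int countAndClose(ClosableSource<?> source) throws IOException {
        int n = 0;
        try (ClosableSource<?> s = source) {
            while (s.read() != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        CountingSource source = new CountingSource(3);
        System.out.println(countAndClose(source) + " samples, closed=" + source.isClosed());
    }
}
```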
-