public class ADChunkSampleStream extends Object implements ObjectStream<ChunkSample>
The heuristics used to extract chunks are based on the paper 'A Machine Learning
Approach to Portuguese Clause Identification' (Eraldo Fernandes, Cicero
Santos and Ruy Milidiú).
Data can be found on this web site:
http://www.linguateca.pt/floresta/corpus.html
Information about the format:
Susana Afonso. "Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica". 12 February 2006.
http://www.linguateca.pt/documentos/Afonso2006ArvoresDeitadas.pdf
Detailed info about the NER tagset: http://beta.visl.sdu.dk/visl/pt/info/portsymbol.html#semtags_names
Note: Do not use this class, internal use only!
Constructor | Description |
---|---|
ADChunkSampleStream(InputStreamFactory in, String charsetName) | |
ADChunkSampleStream(InputStream in, String charsetName) | Deprecated. |
ADChunkSampleStream(ObjectStream<String> lineStream) | Creates a new ChunkSample stream from a line stream, i.e. ObjectStream<String>, that could be a PlainTextByLineStream object. |
Modifier and Type | Method and Description |
---|---|
void | close() Closes the ObjectStream and releases all allocated resources. |
static String | convertFuncTag(String t, boolean useCGTags) |
ChunkSample | read() Returns the next object. |
void | reset() Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly. |
void | setEnd(int aEnd) |
void | setStart(int aStart) |
public static final String OTHER
public ADChunkSampleStream(ObjectStream<String> lineStream)
Creates a new ChunkSample stream from a line stream, i.e. ObjectStream<String>, that could be a PlainTextByLineStream object.
Parameters:
lineStream - a stream of lines as String
public ADChunkSampleStream(InputStreamFactory in, String charsetName) throws IOException
Throws:
IOException
@Deprecated public ADChunkSampleStream(InputStream in, String charsetName)
Deprecated.
Creates a new ChunkSample stream from an InputStream.
Parameters:
in - the Corpus InputStream
charsetName - the charset of the Arvores Deitadas Corpus
public ChunkSample read() throws IOException
Description copied from interface: ObjectStream
Returns the next object.
Specified by:
read in interface ObjectStream<ChunkSample>
Throws:
IOException - if there is an error during reading
public void setStart(int aStart)
public void setEnd(int aEnd)
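public void reset() throws IOException, UnsupportedOperationException
Description copied from interface: ObjectStream
Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly.
Specified by:
reset in interface ObjectStream<ChunkSample>
Throws:
IOException - if there is an error while resetting the stream
UnsupportedOperationException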
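The read, reset and close behaviour documented for this class can be illustrated with a small self-contained sketch. It does not use OpenNLP itself: SimpleObjectStream and ListStream are hypothetical stand-ins that mirror only the three method signatures listed on this page. The sketch assumes, following the usual ObjectStream convention, that read() returns null once the source is exhausted. Throwing an IOException when read or reset is called after close is likewise an assumption, since the page only says such calls are not allowed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ObjectStreamContractSketch {

    /** Hypothetical stand-in mirroring the read/reset/close signatures documented above. */
    interface SimpleObjectStream<T> extends AutoCloseable {
        T read() throws IOException;

        void reset() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** Hypothetical in-memory stream, used here only for illustration. */
    static final class ListStream<T> implements SimpleObjectStream<T> {
        private final List<T> items;
        private int position = 0;
        private boolean closed = false;

        ListStream(List<T> items) {
            this.items = new ArrayList<>(items);
        }

        @Override
        public T read() throws IOException {
            // Assumption: calls after close() fail with an IOException.
            if (closed) {
                throw new IOException("stream is closed");
            }
            // Assumption: null signals that no objects are left.
            return position < items.size() ? items.get(position++) : null;
        }

        @Override
        public void reset() throws IOException {
            if (closed) {
                throw new IOException("stream is closed");
            }
            // Reposition at the beginning so the same sequence is repeated.
            position = 0;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Reads every remaining object from the stream into a list. */
    static <T> List<T> drain(SimpleObjectStream<T> stream) throws IOException {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources calls close() automatically (AutoCloseable).
        try (SimpleObjectStream<String> stream =
                 new ListStream<>(Arrays.asList("first", "second"))) {
            List<String> firstPass = drain(stream);
            stream.reset();
            List<String> secondPass = drain(stream);
            System.out.println(firstPass.equals(secondPass));
        }
    }
}
```

With a real ObjectStream<ChunkSample> the same loop shape applies: read until null, call reset only if a second pass is needed, and close the stream when done.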
public void close() throws IOException
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, it is not allowed to call read or reset.
Specified by:
close in interface AutoCloseable
Specified by:
close in interface ObjectStream<ChunkSample>
Throws:
IOException - if there is an error while closing the stream
Copyright © 2015 The Apache Software Foundation. All rights reserved.