public class ADNameSampleStream extends Object implements ObjectStream<NameSample>
The data contains ten named entity types: Person, Organization, Group,
Place, Event, ArtProd, Abstract, Thing, Time and Numeric.
Data can be found on this web site:
http://www.linguateca.pt/floresta/corpus.html
Information about the format:
Susana Afonso. "Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica". 12 February 2006.
http://www.linguateca.pt/documentos/Afonso2006ArvoresDeitadas.pdf
Detailed info about the NER tagset: http://beta.visl.sdu.dk/visl/pt/info/portsymbol.html#semtags_names
Note: Do not use this class, internal use only!
Constructor and Description |
---|
ADNameSampleStream(InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens) Deprecated. |
ADNameSampleStream(InputStream in, String charsetName, boolean splitHyphenatedTokens) Deprecated. |
ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens) Creates a new NameSample stream from a line stream, i.e. ObjectStream<String>, that could be a PlainTextByLineStream object. |
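
All three constructors accept a splitHyphenatedTokens flag; when it is true, hyphenated tokens are separated, so that "carros-monstro" becomes "carros" "-" "monstro". The sketch below only illustrates that documented effect on a single token. It is not the class's implementation: the HyphenSplitSketch class and its split method are hypothetical names, and the handling of edge cases (leading, trailing or repeated hyphens) is an assumption, since the documentation does not describe them.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenSplitSketch {

    /**
     * Hypothetical helper: splits a token at each hyphen and keeps every
     * hyphen as a token of its own. When the flag is false the token is
     * returned unchanged.
     */
    public static List<String> split(String token, boolean splitHyphenatedTokens) {
        List<String> parts = new ArrayList<>();
        if (!splitHyphenatedTokens || !token.contains("-")) {
            parts.add(token);
            return parts;
        }
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            if (c == '-') {
                // Flush the text collected so far, then emit the hyphen itself.
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                parts.add("-");
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("carros-monstro", true));  // [carros, -, monstro]
        System.out.println(split("carros-monstro", false)); // [carros-monstro]
    }
}
```

Passing false keeps the hyphenated form as a single token, which may be preferable when the downstream tokenizer does not split on hyphens either.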
Modifier and Type | Method and Description |
---|---|
void | close() Closes the ObjectStream and releases all allocated resources. |
NameSample | read() Returns the next object. |
void | reset() Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly. |
public ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens)
Creates a new NameSample stream from a line stream, i.e. ObjectStream<String>, that could be a PlainTextByLineStream object.
Parameters:
lineStream - a stream of lines as String
splitHyphenatedTokens - if true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro"

@Deprecated
public ADNameSampleStream(InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens) throws IOException
Creates a new NameSample stream from an InputStream.
Parameters:
in - the Corpus InputStream
charsetName - the charset of the Arvores Deitadas Corpus
splitHyphenatedTokens - if true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro"
Throws:
IOException
@Deprecated
public ADNameSampleStream(InputStream in, String charsetName, boolean splitHyphenatedTokens)
Creates a new NameSample stream from an InputStream.
Parameters:
in - the Corpus InputStream
charsetName - the charset of the Arvores Deitadas Corpus
splitHyphenatedTokens - if true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro"

public NameSample read() throws IOException
Description copied from interface: ObjectStream
Returns the next object.
Specified by:
read in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error during reading

public void reset() throws IOException, UnsupportedOperationException
Description copied from interface: ObjectStream
Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly.
Specified by:
reset in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error while resetting the stream
UnsupportedOperationException
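
The read and reset contract above can be illustrated with a small in-memory stand-in. This is a sketch only: StreamContractSketch, InMemoryStream and drain are hypothetical names and do not belong to the library. It assumes the ObjectStream convention that read returns null once the sequence is exhausted, and it assumes that a call after close fails with an IllegalStateException; the documentation only says such calls are not allowed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StreamContractSketch {

    /** Hypothetical stand-in mirroring the read/reset/close methods. */
    public static class InMemoryStream<T> implements AutoCloseable {
        private final List<T> items;
        private int position = 0;
        private boolean closed = false;

        public InMemoryStream(List<T> items) {
            this.items = new ArrayList<>(items);
        }

        /** Returns the next object, or null when nothing is left (assumed convention). */
        public T read() {
            ensureOpen();
            return position < items.size() ? items.get(position++) : null;
        }

        /** Repositions at the beginning so the same sequence is repeated. */
        public void reset() {
            ensureOpen();
            position = 0;
        }

        @Override
        public void close() {
            closed = true;
        }

        // Assumption: using the stream after close is reported as an error.
        private void ensureOpen() {
            if (closed) {
                throw new IllegalStateException("stream is closed");
            }
        }
    }

    /** Reads until null and collects everything that was returned. */
    public static <T> List<T> drain(InMemoryStream<T> stream) {
        List<T> seen = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) {
        try (InMemoryStream<String> stream =
                 new InMemoryStream<>(Arrays.asList("a", "b", "c"))) {
            List<String> first = drain(stream);
            stream.reset();
            List<String> second = drain(stream);
            System.out.println(first.equals(second)); // true
        }
    }
}
```

The try-with-resources block works here because the stand-in implements AutoCloseable, as ObjectStream does.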
public void close() throws IOException
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, it is not allowed to call read or reset.
Specified by:
close in interface AutoCloseable
close in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error while closing the stream

Copyright © 2015 The Apache Software Foundation. All rights reserved.