Class ADNameSampleStream
java.lang.Object
opennlp.tools.formats.ad.ADNameSampleStream
- All Implemented Interfaces:
AutoCloseable, opennlp.tools.util.ObjectStream<opennlp.tools.namefind.NameSample>
@Internal
public class ADNameSampleStream
extends Object
implements opennlp.tools.util.ObjectStream<opennlp.tools.namefind.NameSample>
Parser for the Floresta Sintá(c)tica Árvores Deitadas corpus, producing output for
Portuguese NER training.
The data contains ten named entity types: Person, Organization, Group,
Place, Event, ArtProd, Abstract, Thing, Time and Numeric.
Data can be found on this web site.
Information about the format:
Susana Afonso.
"Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica".
12 de Fevereiro de 2006.
Detailed info about the NER tagset.
Note: Do not use this class, internal use only!
-
Constructor Summary
Constructors
Constructor / Description
ADNameSampleStream(opennlp.tools.util.InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens)
Deprecated, for removal: This API element is subject to removal in a future version.
ADNameSampleStream(opennlp.tools.util.ObjectStream<String> lineStream, boolean splitHyphenatedTokens)
Initializes a new ADNameSampleStream from an opennlp.tools.util.ObjectStream<String>, which could be a PlainTextByLineStream object.
-
Method Summary
-
Constructor Details
-
ADNameSampleStream
public ADNameSampleStream(opennlp.tools.util.ObjectStream<String> lineStream, boolean splitHyphenatedTokens)
Initializes a new ADNameSampleStream from an opennlp.tools.util.ObjectStream<String>, which could be a PlainTextByLineStream object.
- Parameters:
lineStream - An opennlp.tools.util.ObjectStream<String> as input.
splitHyphenatedTokens - If true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro".
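The effect of splitHyphenatedTokens described above can be illustrated with a minimal, standalone sketch. The class HyphenSplitSketch and the method splitHyphenated are hypothetical names introduced only for this illustration; they are not part of OpenNLP, and the library's actual splitting logic may treat edge cases differently.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenSplitSketch {

    // Hypothetical helper (not an OpenNLP API): splits a token at each hyphen
    // and keeps every hyphen as a separate token.
    static List<String> splitHyphenated(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            if (c == '-') {
                // Flush the text collected before the hyphen, if any.
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                parts.add("-");
            } else {
                current.append(c);
            }
        }
        // Flush whatever follows the last hyphen.
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [carros, -, monstro]
        System.out.println(splitHyphenated("carros-monstro"));
    }
}
```

With splitting disabled, the same input would presumably stay a single token, "carros-monstro".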
-
ADNameSampleStream
@Deprecated(forRemoval=true)
public ADNameSampleStream(opennlp.tools.util.InputStreamFactory in, String charsetName, boolean splitHyphenatedTokens) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
Initializes a new ADNameSampleStream from an InputStreamFactory.
- Parameters:
in - The corpus InputStreamFactory.
charsetName - The charset to use for reading the corpus.
splitHyphenatedTokens - If true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro".
- Throws:
IOException
-
-
Method Details
-
read
- Specified by:
read in interface opennlp.tools.util.ObjectStream<opennlp.tools.namefind.NameSample>
- Throws:
IOException
-
reset
- Specified by:
reset in interface opennlp.tools.util.ObjectStream<opennlp.tools.namefind.NameSample>
- Throws:
IOException
UnsupportedOperationException
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface opennlp.tools.util.ObjectStream<opennlp.tools.namefind.NameSample>
- Throws:
IOException
-