public class EvalitaNameSampleStream extends Object implements ObjectStream<NameSample>
The data does not contain article boundaries, so adaptive data is cleared for every sentence.
Named Entities are annotated in the IOB2 format (as used in the CoNLL 2002 shared task).
The Named Entity tag consists of two parts: 1. The IOB2 tag: 'B' (for 'begin') denotes the first token of a Named Entity, 'I' (for 'inside') is used for all other tokens in a Named Entity, and 'O' (for 'outside') is used for all other words; 2. The Entity type tag: PER (for Person), ORG (for Organization), GPE (for Geo-Political Entity), or LOC (for Location).
Each file consists of four columns separated by a blank, containing respectively the token, the Elsnet PoS-tag, the Adige news story to which the token belongs, and the Named Entity tag.
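The column layout and the IOB2 scheme described above can be sketched as a small decoder. This is a minimal illustration and not the parsing code of this class: it assumes the two tag parts are joined by a hyphen (for example `B-PER`), and the PoS tags and story identifiers in the sample lines are invented placeholders. The names `Iob2ColumnSketch`, `Entity` and `decode` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class Iob2ColumnSketch {

    /** A decoded entity: token range [start, end) plus its type, e.g. "PER". */
    static final class Entity {
        final int start;
        final int end;
        final String type;

        Entity(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    /** Decodes the four-column lines of one sentence into entity spans. */
    static List<Entity> decode(List<String> lines) {
        List<Entity> entities = new ArrayList<>();
        int start = -1;      // index of the first token of the open entity, or -1
        String type = null;  // type of the open entity

        for (int i = 0; i < lines.size(); i++) {
            // Columns: token, PoS tag, news story, Named Entity tag.
            String[] cols = lines.get(i).split(" ");
            if (cols.length != 4) {
                throw new IllegalArgumentException("Expected 4 columns: " + lines.get(i));
            }
            String tag = cols[3];

            // Any tag other than 'I' ends the entity that is currently open.
            if (!tag.startsWith("I-") && start != -1) {
                entities.add(new Entity(start, i, type));
                start = -1;
            }
            // A 'B' tag opens a new entity at this token.
            if (tag.startsWith("B-")) {
                start = i;
                type = tag.substring(2);
            }
        }
        // Close an entity that runs to the end of the sentence.
        if (start != -1) {
            entities.add(new Entity(start, lines.size(), type));
        }
        return entities;
    }

    public static void main(String[] args) {
        // Placeholder PoS tags ("POS") and story ids ("story1"); not real corpus values.
        List<String> sentence = List.of(
            "Mario POS story1 B-PER",
            "Rossi POS story1 I-PER",
            "lavora POS story1 O",
            "a POS story1 O",
            "Trento POS story1 B-GPE",
            ". POS story1 O");

        for (Entity e : decode(sentence)) {
            System.out.println(e);  // prints PER[0,2) then GPE[4,5)
        }
    }
}
```

The sketch ignores malformed sequences, such as an 'I' tag with no preceding 'B' tag; a real parser would have to decide how to handle them.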
Data can be found on this web site:
http://www.evalita.it
Note: Do not use this class, internal use only!
Modifier and Type | Class and Description |
---|---|
static class | EvalitaNameSampleStream.LANGUAGE |
Modifier and Type | Field and Description |
---|---|
static String | DOCSTART |
static int | GENERATE_GPE_ENTITIES |
static int | GENERATE_LOCATION_ENTITIES |
static int | GENERATE_ORGANIZATION_ENTITIES |
static int | GENERATE_PERSON_ENTITIES |
Constructor and Description |
---|
EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStreamFactory in, int types) |
EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStream in, int types) Deprecated. |
EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, ObjectStream<String> lineStream, int types) |
Modifier and Type | Method and Description |
---|---|
void | close() Closes the ObjectStream and releases all allocated resources. |
NameSample | read() Returns the next object. |
void | reset() Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly. |
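The read, reset and close contract summarized above can be illustrated with a small, self-contained sketch. It does not use OpenNLP: `SampleStream` and `ListStream` are hypothetical stand-ins that mirror the three methods, and the sketch assumes the `ObjectStream` convention that `read()` returns `null` once the stream is exhausted.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class StreamLifecycleSketch {

    /** Hypothetical stand-in mirroring the three methods listed above. */
    interface SampleStream<T> extends AutoCloseable {
        T read() throws IOException;

        void reset() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** List-backed implementation, used only to make the sketch runnable. */
    static class ListStream<T> implements SampleStream<T> {
        private final List<T> items;
        private int position = 0;
        private boolean closed = false;

        ListStream(List<T> items) {
            this.items = items;
        }

        @Override
        public T read() throws IOException {
            ensureOpen();
            // null signals that no further objects are available.
            return position < items.size() ? items.get(position++) : null;
        }

        @Override
        public void reset() throws IOException {
            ensureOpen();
            position = 0;
        }

        @Override
        public void close() {
            closed = true;
        }

        private void ensureOpen() throws IOException {
            if (closed) {
                throw new IOException("stream is closed");
            }
        }
    }

    /** Reads until read() returns null and collects everything seen. */
    static <T> List<T> drain(SampleStream<T> stream) throws IOException {
        List<T> seen = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources calls close() when the block exits.
        try (SampleStream<String> stream = new ListStream<>(List.of("a", "b", "c"))) {
            List<String> firstPass = drain(stream);
            stream.reset();  // reposition at the beginning
            List<String> secondPass = drain(stream);
            System.out.println(firstPass.equals(secondPass));  // prints true
        }
    }
}
```

Because the stream is an `AutoCloseable`, try-with-resources is a natural way to guarantee that `close()` runs; neither `read` nor `reset` should be called afterwards.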
public static final int GENERATE_PERSON_ENTITIES
public static final int GENERATE_ORGANIZATION_ENTITIES
public static final int GENERATE_LOCATION_ENTITIES
public static final int GENERATE_GPE_ENTITIES
public static final String DOCSTART
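This page does not list the values of the four `GENERATE_*` constants. Because they are `int` fields and the constructors take a single `int types` argument, a plausible reading is that they are single-bit flags combined with a bitwise OR. The sketch below illustrates that pattern with locally defined constants; the values and the `isSelected` helper are assumptions for illustration, not taken from the class.

```java
public class EntityTypeMaskSketch {

    // Assumed values: distinct single-bit flags. The real constants may differ.
    static final int GENERATE_PERSON_ENTITIES = 0x01;
    static final int GENERATE_ORGANIZATION_ENTITIES = 0x01 << 1;
    static final int GENERATE_LOCATION_ENTITIES = 0x01 << 2;
    static final int GENERATE_GPE_ENTITIES = 0x01 << 3;

    /** Hypothetical helper: true if the given flag is set in the types mask. */
    static boolean isSelected(int types, int flag) {
        return (types & flag) != 0;
    }

    public static void main(String[] args) {
        // Select person and location entities only.
        int types = GENERATE_PERSON_ENTITIES | GENERATE_LOCATION_ENTITIES;

        System.out.println(isSelected(types, GENERATE_PERSON_ENTITIES));        // prints true
        System.out.println(isSelected(types, GENERATE_ORGANIZATION_ENTITIES));  // prints false
        System.out.println(isSelected(types, GENERATE_LOCATION_ENTITIES));      // prints true
        System.out.println(isSelected(types, GENERATE_GPE_ENTITIES));           // prints false
    }
}
```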
public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, ObjectStream<String> lineStream, int types)
public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStreamFactory in, int types) throws IOException
Throws:
IOException
@Deprecated public EvalitaNameSampleStream(EvalitaNameSampleStream.LANGUAGE lang, InputStream in, int types)
Parameters:
lang - the language of the Evalita data file
in - an Input Stream to read data.
types - the types of the entities which are included in the Name Sample stream
public NameSample read() throws IOException
Description copied from interface: ObjectStream
Returns the next object.
Specified by:
read in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error during reading
public void reset() throws IOException, UnsupportedOperationException
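Description copied from interface: ObjectStream
Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly.
Specified by:
reset in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error while resetting the stream
UnsupportedOperationException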
public void close() throws IOException
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, it is not allowed to call read or reset.
Specified by:
close in interface AutoCloseable
Specified by:
close in interface ObjectStream<NameSample>
Throws:
IOException - if there is an error while closing the stream
Copyright © 2015 The Apache Software Foundation. All rights reserved.