Class Conll02NameSampleStream

  • All Implemented Interfaces:
    AutoCloseable, ObjectStream<NameSample>

    public class Conll02NameSampleStream
    extends Object
    implements ObjectStream<NameSample>
    Parser for the Dutch and Spanish NER training files of the CoNLL 2002 shared task.

    The Dutch data uses a -DOCSTART- tag to mark article boundaries; adaptive data in the feature generators is cleared before every article.
    The Spanish data does not contain article boundaries, so adaptive data is cleared before every sentence.
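
    The clearing rule above can be illustrated with a minimal, self-contained sketch. It is not the code of this class: the class name Conll02Sketch, the nested Sentence type, the parse method and the hasDocStart flag are all hypothetical names chosen for illustration. The sketch assumes one token per line, whitespace-separated columns with the entity tag in the last column, and a blank line after each sentence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Conll02Sketch {

    /** Hypothetical holder: one sentence plus the "clear adaptive data" decision. */
    static final class Sentence {
        final List<String> tokens;
        final List<String> tags;
        final boolean clearAdaptiveData;

        Sentence(List<String> tokens, List<String> tags, boolean clearAdaptiveData) {
            this.tokens = tokens;
            this.tags = tags;
            this.clearAdaptiveData = clearAdaptiveData;
        }
    }

    /**
     * Splits lines into sentences. With hasDocStart (Dutch-style data) only the
     * first sentence after a -DOCSTART- line is flagged; without it
     * (Spanish-style data) every sentence is flagged.
     */
    static List<Sentence> parse(List<String> lines, boolean hasDocStart) {
        List<Sentence> sentences = new ArrayList<>();
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        boolean clear = true; // the very first sentence always starts fresh

        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("-DOCSTART-")) {
                clear = true; // article boundary: next sentence clears adaptive data
            } else if (trimmed.isEmpty()) {
                if (!tokens.isEmpty()) {
                    sentences.add(new Sentence(new ArrayList<>(tokens), new ArrayList<>(tags), clear));
                    tokens.clear();
                    tags.clear();
                    clear = !hasDocStart; // no article markers: clear before every sentence
                }
            } else {
                String[] cols = trimmed.split("\\s+");
                tokens.add(cols[0]);             // assumption: token is the first column
                tags.add(cols[cols.length - 1]); // assumption: entity tag is the last column
            }
        }
        if (!tokens.isEmpty()) { // input did not end with a blank line
            sentences.add(new Sentence(tokens, tags, clear));
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> dutch = Arrays.asList(
            "-DOCSTART- -DOCSTART- O", "",
            "De Art O", "Jan N B-PER", "",
            "Hij Pron O", "",
            "-DOCSTART- -DOCSTART- O",
            "Brussel N B-LOC", "");
        for (Sentence s : parse(dutch, true)) {
            System.out.println(s.tokens + " clear=" + s.clearAdaptiveData);
        }
    }
}
```

    In this sketch the flag is carried on each sentence rather than acted on immediately, which keeps the parsing step separate from whatever component owns the adaptive data.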

    The data contains four named entity types: Person, Organization, Location and Misc.
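
    As an illustration of how the four types appear in the files, the sketch below groups per-token tags into typed spans. It assumes the tags use the abbreviations PER, ORG, LOC and MISC with B- and I- prefixes and O for tokens outside any entity; the class BioSpanSketch and its decode method are hypothetical and are not part of this class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BioSpanSketch {

    /**
     * Groups B-/I- tags into spans rendered as "TYPE[start,end)" with an
     * exclusive end index. An I- tag with no open span of the same type is
     * treated leniently as the start of a new span.
     */
    static List<String> decode(List<String> tags) {
        List<String> spans = new ArrayList<>();
        int start = -1;
        String type = null; // type of the currently open span, or null

        // One extra iteration with a virtual "O" closes a span that ends the sentence.
        for (int i = 0; i <= tags.size(); i++) {
            String tag = i < tags.size() ? tags.get(i) : "O";
            boolean continues = type != null && tag.equals("I-" + type);

            if (type != null && !continues) {
                spans.add(type + "[" + start + "," + i + ")");
                type = null;
            }
            if (tag.startsWith("B-") || (tag.startsWith("I-") && type == null)) {
                start = i;
                type = tag.substring(2);
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("B-PER", "I-PER", "O", "B-LOC", "B-ORG", "I-ORG");
        System.out.println(decode(tags));
    }
}
```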

    Data can be found on this web site:
    http://www.cnts.ua.ac.be/conll2002/ner/

    Note: Do not use this class; it is for internal use only!