Class BioNLP2004NameSampleStream

java.lang.Object
opennlp.tools.formats.BioNLP2004NameSampleStream
All Implemented Interfaces:
AutoCloseable, ObjectStream<NameSample>

@Internal public class BioNLP2004NameSampleStream extends Object implements ObjectStream<NameSample>
A sample stream for the training files of the BioNLP/NLPBA 2004 shared task.

The data contains five named entity types:

  • DNA
  • RNA
  • protein
  • cell_type
  • cell_line
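To illustrate how these entity types might appear in the data, here is a minimal, self-contained sketch. It assumes the training files hold one token per line followed by a tab and an IOB tag such as `B-DNA`, `I-DNA`, or `O`, with blank lines between sentences. That layout is not described on this page and should be checked against the actual files. The class `BioNlpEntitySketch` and the method `extractEntities` are hypothetical names used only for illustration. They are not part of OpenNLP and do not reflect how this class is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of OpenNLP.
public class BioNlpEntitySketch {

    /**
     * Collects "TYPE: entity text" strings from lines assumed to be
     * "token<TAB>tag", where the tag is B-TYPE, I-TYPE, or O.
     * Lines that do not have exactly two fields (e.g. blank lines)
     * are treated as outside any entity.
     */
    static List<String> extractEntities(List<String> lines) {
        List<String> entities = new ArrayList<>();
        StringBuilder current = null; // text of the entity being built
        String currentType = null;    // its type, e.g. "DNA"

        for (String line : lines) {
            String[] fields = line.split("\t");
            String token = fields.length == 2 ? fields[0] : "";
            String tag = fields.length == 2 ? fields[1] : "O";

            if (tag.startsWith("B-")) {
                // A new entity begins; flush the previous one, if any.
                if (current != null) {
                    entities.add(currentType + ": " + current);
                }
                currentType = tag.substring(2);
                current = new StringBuilder(token);
            } else if (tag.startsWith("I-") && current != null
                    && tag.substring(2).equals(currentType)) {
                // Continuation of the entity that is currently open.
                current.append(' ').append(token);
            } else {
                // Any other tag closes the open entity.
                if (current != null) {
                    entities.add(currentType + ": " + current);
                }
                current = null;
                currentType = null;
            }
        }
        if (current != null) {
            entities.add(currentType + ": " + current);
        }
        return entities;
    }

    public static void main(String[] args) {
        // Made-up sample sentence in the assumed format.
        List<String> sentence = Arrays.asList(
                "IL-2\tB-DNA",
                "gene\tI-DNA",
                "expression\tO",
                "requires\tO",
                "NF-kappa\tB-protein",
                "B\tI-protein");
        System.out.println(extractEntities(sentence));
        // prints [DNA: IL-2 gene, protein: NF-kappa B]
    }
}
```

The sample sentence is invented for the example. Real files may contain additional header lines or other details that this sketch does not handle.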

Data can be found on this website, or in this repository.

The BioNLP/NLPBA 2004 data were originally published at:

http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/ERtask/report.html

However, this page was no longer available when last checked in December 2022.

This repo appears to contain a copy of the data from the original page. The BioNLP 2004 task appears to be related to http://www.geniaproject.org/shared-tasks/bionlp-jnlpba-shared-task-2004

Note: Do not use this class; it is for internal use only!