Class BioCodec

java.lang.Object
opennlp.tools.namefind.BioCodec
All Implemented Interfaces:
opennlp.tools.util.SequenceCodec<String>

public class BioCodec extends Object implements opennlp.tools.util.SequenceCodec<String>
The default SequenceCodec implementation according to the BIO scheme:
  • B: 'beginning', the word is the first word of a named entity (NE)
  • I: 'inside', the word is inside a NE
  • O: 'outside', the word is a regular word outside a NE
See also the paper by Ratinov L. and Roth D.: Design Challenges and Misconceptions in Named Entity Recognition.
  • Field Details

  • Constructor Details

    • BioCodec

      public BioCodec()
  • Method Details

    • decode

      public opennlp.tools.util.Span[] decode(List<String> c)
      Specified by:
      decode in interface opennlp.tools.util.SequenceCodec<String>
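
To illustrate what decoding a BIO outcome sequence involves, here is a minimal, dependency-free sketch. It is not the library's implementation: the `Chunk` class is a hypothetical stand-in for `opennlp.tools.util.Span`, and the outcome strings (`<type>-start`, `<type>-cont`, `other`) are assumed for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class BioDecodeSketch {

    /** Hypothetical stand-in for a span: tokens [start, end) plus an entity type. */
    static final class Chunk {
        final int start;
        final int end;
        final String type;

        Chunk(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + ".." + end + ")";
        }
    }

    /** Turns one outcome per token into chunks. Label suffixes are assumptions. */
    static List<Chunk> decode(List<String> outcomes) {
        List<Chunk> chunks = new ArrayList<>();
        int start = -1;      // index of the open chunk's first token, or -1 if none is open
        String type = null;  // entity type of the open chunk

        for (int i = 0; i < outcomes.size(); i++) {
            String outcome = outcomes.get(i);
            boolean begins = outcome.endsWith("-start");
            boolean continues = outcome.endsWith("-cont");

            // Anything other than a continuation closes the chunk that is currently open.
            if (start != -1 && !continues) {
                chunks.add(new Chunk(start, i, type));
                start = -1;
            }
            if (begins) {
                start = i;
                type = outcome.substring(0, outcome.length() - "-start".length());
            }
        }
        // Close a chunk that runs to the end of the sequence.
        if (start != -1) {
            chunks.add(new Chunk(start, outcomes.size(), type));
        }
        return chunks;
    }
}
```

For the outcomes `other, person-start, person-cont, other, location-start`, this sketch yields one `person` chunk covering tokens 1 and 2 and one `location` chunk covering token 4. As a simplification, it does not check that a continuation has the same type as the chunk it extends.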
    • encode

      public String[] encode(opennlp.tools.util.Span[] names, int length)
      Specified by:
      encode in interface opennlp.tools.util.SequenceCodec<String>
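
The reverse direction can be sketched in the same spirit. The example below is an assumption-laden illustration rather than the class's actual code: spans are passed as hypothetical `{start, end}` pairs with an exclusive end, types are passed in a parallel array, and the outcome strings are assumed.

```java
import java.util.Arrays;

final class BioEncodeSketch {

    /**
     * Produces one outcome per token for a sentence of the given length.
     * Assumes spans are {start, end} pairs with an exclusive end, do not
     * overlap, and lie within [0, length). Label strings are assumptions.
     */
    static String[] encode(int[][] spans, String[] types, int length) {
        String[] outcomes = new String[length];
        Arrays.fill(outcomes, "other"); // every token starts out as 'outside'

        for (int s = 0; s < spans.length; s++) {
            int start = spans[s][0];
            int end = spans[s][1];
            outcomes[start] = types[s] + "-start";    // 'beginning' token
            for (int i = start + 1; i < end; i++) {
                outcomes[i] = types[s] + "-cont";     // 'inside' tokens
            }
        }
        return outcomes;
    }
}
```

A single `person` span over tokens 1 and 2 of a four-token sentence would be encoded as `other, person-start, person-cont, other`.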
    • createSequenceValidator

      public NameFinderSequenceValidator createSequenceValidator()
      Specified by:
      createSequenceValidator in interface opennlp.tools.util.SequenceCodec<String>
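
A sequence validator for a BIO scheme typically rejects an 'inside' outcome that does not follow a 'beginning' or 'inside' outcome of the same type. The sketch below shows that rule only; the method name `isValidNext`, its signature, and the outcome strings are hypothetical and are not taken from `NameFinderSequenceValidator`.

```java
final class BioValidatorSketch {

    /** Returns the type prefix of an outcome such as "person-start". */
    static String typeOf(String outcome) {
        return outcome.substring(0, outcome.lastIndexOf('-'));
    }

    /**
     * Decides whether 'outcome' may follow the outcomes chosen so far.
     * Label suffixes "-start" and "-cont" are assumptions for illustration.
     */
    static boolean isValidNext(String[] previous, String outcome) {
        if (!outcome.endsWith("-cont")) {
            return true; // 'beginning' and 'outside' outcomes are always allowed
        }
        if (previous.length == 0) {
            return false; // a sequence cannot open with a continuation
        }
        String last = previous[previous.length - 1];
        if (!last.endsWith("-start") && !last.endsWith("-cont")) {
            return false; // a continuation cannot follow an 'outside' token
        }
        return typeOf(last).equals(typeOf(outcome)); // types must match
    }
}
```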
    • areOutcomesCompatible

      public boolean areOutcomesCompatible(String[] outcomes)
      Specified by:
      areOutcomesCompatible in interface opennlp.tools.util.SequenceCodec<String>
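
One plausible reading of outcome compatibility for a BIO codec is that every outcome is a 'beginning', 'inside', or 'outside' label, and every type with an 'inside' label also has a 'beginning' label. The sketch below implements that reading under assumed label strings; the actual check in this class may differ.

```java
import java.util.HashSet;
import java.util.Set;

final class BioOutcomeCheckSketch {

    /** Checks a model's outcome set against the assumed BIO label scheme. */
    static boolean compatible(String[] outcomes) {
        Set<String> startTypes = new HashSet<>();
        Set<String> contTypes = new HashSet<>();

        for (String outcome : outcomes) {
            if (outcome.endsWith("-start")) {
                startTypes.add(outcome.substring(0, outcome.length() - "-start".length()));
            } else if (outcome.endsWith("-cont")) {
                contTypes.add(outcome.substring(0, outcome.length() - "-cont".length()));
            } else if (!outcome.equals("other")) {
                return false; // label does not belong to the assumed scheme
            }
        }
        // Every type that can be continued must also be startable.
        return startTypes.containsAll(contTypes);
    }
}
```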