Class BilouCodec

java.lang.Object
opennlp.tools.namefind.BilouCodec
All Implemented Interfaces:
opennlp.tools.util.SequenceCodec<String>

public class BilouCodec extends Object implements opennlp.tools.util.SequenceCodec<String>
The default SequenceCodec implementation according to the BILOU scheme.
  • B: 'beginning', the first word of a multi-word named entity (NE)
  • I: 'inside', the word is inside a NE, between its first and last word
  • L: 'last', the last word of a multi-word NE
  • O: 'outside', the word is a regular word outside any NE
  • U: 'unit', a standalone token that forms a complete NE by itself
See the paper by Ratinov L. and Roth D. (2009): Design Challenges and Misconceptions in Named Entity Recognition.

  • Constructor Details

    • BilouCodec

      public BilouCodec()
  • Method Details

    • decode

      public opennlp.tools.util.Span[] decode(List<String> c)
      Specified by:
      decode in interface opennlp.tools.util.SequenceCodec<String>
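The decoding direction can be illustrated with a minimal sketch. It is not the OpenNLP implementation: the class `BilouDecodeSketch`, the bare one-letter tags, and the `{start, end}` pairs with an exclusive end are all assumptions made for illustration, and the lenient handling of malformed tag sequences is a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of OpenNLP.
final class BilouDecodeSketch {

    // Turns a BILOU tag sequence into {start, end} pairs, end exclusive.
    static List<int[]> decode(List<String> tags) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // index of the currently open "B", or -1 if none
        for (int i = 0; i < tags.size(); i++) {
            switch (tags.get(i)) {
                case "U": // single-token entity
                    spans.add(new int[] {i, i + 1});
                    start = -1;
                    break;
                case "B": // opens a multi-token entity
                    start = i;
                    break;
                case "I": // stays inside the open entity
                    break;
                case "L": // closes the open entity, if there is one
                    if (start >= 0) {
                        spans.add(new int[] {start, i + 1});
                    }
                    start = -1;
                    break;
                default: // "O" or anything unknown discards an unclosed entity
                    start = -1;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        for (int[] s : decode(Arrays.asList("B", "I", "L", "O", "U", "O"))) {
            System.out.println(s[0] + ".." + s[1]);
        }
    }
}
```

In this sketch an entity is only emitted once its L tag is seen, so a B that is never closed produces no span.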
    • encode

      public String[] encode(opennlp.tools.util.Span[] names, int length)
      Specified by:
      encode in interface opennlp.tools.util.SequenceCodec<String>
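As a rough illustration of the encoding direction, the sketch below maps entity spans onto one tag per token. It assumes spans given as `{start, end}` pairs with an exclusive end and bare one-letter tags; the class `BilouEncodeSketch` is hypothetical, and the outcome strings the library actually emits (for example, with an entity type attached) may differ.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of OpenNLP.
final class BilouEncodeSketch {

    // spans: {start, end} pairs, end exclusive; length: number of tokens.
    static String[] encode(int[][] spans, int length) {
        String[] tags = new String[length];
        Arrays.fill(tags, "O"); // every token is outside until a span covers it
        for (int[] span : spans) {
            int start = span[0];
            int end = span[1];
            if (start < 0 || end > length || end <= start) {
                throw new IllegalArgumentException("invalid span " + start + ".." + end);
            }
            if (end - start == 1) {
                tags[start] = "U"; // single-token entity
            } else {
                tags[start] = "B";
                for (int i = start + 1; i < end - 1; i++) {
                    tags[i] = "I";
                }
                tags[end - 1] = "L";
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // Six tokens with entities at 0..3 and 4..5.
        String[] tags = encode(new int[][] {{0, 3}, {4, 5}}, 6);
        System.out.println(String.join(" ", tags)); // prints "B I L O U O"
    }
}
```

A two-token span yields B followed directly by L, with no I in between.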
    • createSequenceValidator

      public opennlp.tools.util.SequenceValidator<String> createSequenceValidator()
      Specified by:
      createSequenceValidator in interface opennlp.tools.util.SequenceCodec<String>
    • areOutcomesCompatible

      public boolean areOutcomesCompatible(String[] outcomes)
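      Checks that the model outcomes form a consistent BILOU tag set. Here C denotes the 'continue' tag, listed as I in the class description. B requires L, optionally together with C; C requires both B and L; L requires B; O requires at least one valid B/L combination or a U; U has no requirements.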
      Specified by:
      areOutcomesCompatible in interface opennlp.tools.util.SequenceCodec<String>
      Parameters:
      outcomes - All potential model outcomes to check.
      Returns:
      true, if model outcomes are compatible, false otherwise.
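One way to read the compatibility rule is sketched below. This is an interpretation of the rule sentence, not the library's code. The outcome label format (`type-start`, `type-cont`, `type-last`, `type-unit`, and `other`) is an assumption of the sketch, as are the class name `BilouOutcomeCheckSketch` and the decision to reject unrecognized labels.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class, not part of OpenNLP.
final class BilouOutcomeCheckSketch {

    static boolean areOutcomesCompatible(String[] outcomes) {
        Set<String> start = new HashSet<>();
        Set<String> cont = new HashSet<>();
        Set<String> last = new HashSet<>();
        Set<String> unit = new HashSet<>();
        boolean sawOther = false;

        for (String outcome : outcomes) {
            if (outcome.equals("other")) { // assumed label for O
                sawOther = true;
                continue;
            }
            int dash = outcome.lastIndexOf('-');
            if (dash < 0) {
                return false; // not of the assumed form type-suffix
            }
            String type = outcome.substring(0, dash);
            switch (outcome.substring(dash + 1)) {
                case "start": start.add(type); break;
                case "cont":  cont.add(type);  break;
                case "last":  last.add(type);  break;
                case "unit":  unit.add(type);  break;
                default: return false; // unknown suffix, rejected by this sketch
            }
        }

        // B requires L and L requires B, checked per entity type.
        if (!start.equals(last)) {
            return false;
        }
        // C requires B and L; start equals last here, so one check suffices.
        if (!start.containsAll(cont)) {
            return false;
        }
        // O requires at least one valid B/L combination or a U.
        if (sawOther && start.isEmpty() && unit.isEmpty()) {
            return false;
        }
        return true; // U on its own has no requirements
    }

    public static void main(String[] args) {
        String[] full = {"person-start", "person-cont", "person-last", "person-unit", "other"};
        System.out.println(areOutcomesCompatible(full)); // prints "true"
        System.out.println(areOutcomesCompatible(new String[] {"person-start", "other"})); // prints "false"
    }
}
```

The second call fails because a start outcome without a matching last outcome can never close an entity.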