Interface EntityLinker<T extends Span>

  • Type Parameters:
    T - A type that extends Span. LinkedSpan and BaseLink are provided so that EntityLinker<LinkedSpan<BaseLink>> can serve as a default signature.

    public interface EntityLinker<T extends Span>
    EntityLinkers establish connections to external data to enrich extracted entities. For instance, for Location entities a linker can be developed to look up each found location in a GeoNames gazetteer. Another example is to find people's names and look them up in a database or active directory. The interface is intended to return the n best matches for any given search, but it can also be implemented deterministically.
    • Method Summary

      Modifier and Type Method Description
      List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence)
      Links an entire document of named entities to an external source.
      List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence, int sentenceIndex)
      Links the names that correspond to the token spans of the sentence at sentenceIndex.
      void init(EntityLinkerProperties initializationData)
      Allows properties to be passed through the EntityLinkerFactory into all implementations dynamically.
    • Method Detail

      • init

        void init(EntityLinkerProperties initializationData)
           throws IOException
        Allows properties to be passed through the EntityLinkerFactory into all implementations dynamically. EntityLinker implementations should initialize their reusable objects in this method. If they do, any errors will be captured and thrown by the EntityLinkerFactory.
        Parameters:
        initializationData - the EntityLinkerProperties object that contains the properties needed by the implementation, as well as any other objects it requires
        Throws:
        IOException
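The advice to build reusable objects inside init can be illustrated with a minimal sketch. It does not use the OpenNLP classes: java.util.Properties stands in for EntityLinkerProperties, and the GazetteerLinker class, the create helper and the "gazetteer.entries" key are all hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class InitSketch {

    /** Hypothetical linker-like class that builds its lookup table once, in init. */
    static final class GazetteerLinker {
        // name -> external identifier; built in init and reused by every later lookup
        private Map<String, String> gazetteer;

        // java.util.Properties is a stand-in for EntityLinkerProperties here.
        void init(Properties initializationData) throws IOException {
            // "gazetteer.entries" is a made-up property key for this example.
            String entries = initializationData.getProperty("gazetteer.entries");
            if (entries == null) {
                // Failing here surfaces configuration problems before any find call.
                throw new IOException("missing property: gazetteer.entries");
            }
            Map<String, String> table = new HashMap<>();
            for (String entry : entries.split(";")) {
                String[] pair = entry.split("=", 2);
                if (pair.length != 2) {
                    throw new IOException("malformed entry: " + entry);
                }
                table.put(pair[0].trim(), pair[1].trim());
            }
            gazetteer = table;
        }

        String lookup(String name) {
            return gazetteer.get(name);
        }
    }

    /** Plays the role of a factory: parses the properties text, then calls init. */
    static GazetteerLinker create(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        GazetteerLinker linker = new GazetteerLinker();
        linker.init(props);
        return linker;
    }
}
```

Calling InitSketch.create("gazetteer.entries=Paris=G1;Berlin=G2") yields a linker whose lookup("Berlin") returns "G2", while a properties text without that key makes create throw IOException, so the caller sees the failure at construction time.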
      • find

        List<T> find(String doctext,
                     Span[] sentences,
                     Span[][] tokensBySentence,
                     Span[][] namesBySentence)
        Links an entire document of named entities to an external source.
        Parameters:
        doctext - the full text of the document
        sentences - the sentence spans of the document
        tokensBySentence - the token spans for each sentence. The outer array is indexed by sentence; the inner array holds the tokens of that sentence. Similar in nature to a Map<SentenceIndex, List<Token Spans>>.
        namesBySentence - the name spans for each sentence. The outer array is indexed by sentence; the inner array holds the name spans found among that sentence's tokens. Similar in nature to a Map<SentenceIndex, List<Name Spans For This Sentence's Tokens>>.
        Returns:
        a list of linked spans of type T
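The comparison between the Span[][] layout and a map keyed by sentence index can be made concrete with a small sketch. The nested Span class is a hypothetical stand-in for opennlp.tools.util.Span, and the sketch assumes that token offsets are relative to the start of their sentence; if your pipeline produces document-relative offsets, the substring calls need adjusting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpanLayoutSketch {

    /** Hypothetical stand-in for opennlp.tools.util.Span: a half-open [start, end) range. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Rewrites the Span[][] layout as a map from sentence index to token texts. */
    static Map<Integer, List<String>> tokensAsMap(String doctext, Span[] sentences,
                                                  Span[][] tokensBySentence) {
        Map<Integer, List<String>> bySentence = new LinkedHashMap<>();
        for (int s = 0; s < sentences.length; s++) {
            // Index s selects the same sentence in both arrays.
            String sentence = doctext.substring(sentences[s].start, sentences[s].end);
            List<String> tokens = new ArrayList<>();
            for (Span token : tokensBySentence[s]) {
                // Assumption: token offsets are relative to the start of their sentence.
                tokens.add(sentence.substring(token.start, token.end));
            }
            bySentence.put(s, tokens);
        }
        return bySentence;
    }

    /** Runs the conversion on a two-sentence sample document. */
    static Map<Integer, List<String>> demo() {
        String doctext = "Anna flew to New York. Ben stayed.";
        Span[] sentences = { new Span(0, 22), new Span(23, 34) };
        Span[][] tokensBySentence = {
            { new Span(0, 4), new Span(5, 9), new Span(10, 12),
              new Span(13, 16), new Span(17, 21), new Span(21, 22) },
            { new Span(0, 3), new Span(4, 10), new Span(10, 11) }
        };
        return tokensAsMap(doctext, sentences, tokensBySentence);
    }
}
```

For the sample document, key 0 maps to the six tokens of the first sentence and key 1 to the three tokens of the second, which mirrors the outer and inner arrays of tokensBySentence.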
      • find

        List<T> find(String doctext,
                     Span[] sentences,
                     Span[][] tokensBySentence,
                     Span[][] namesBySentence,
                     int sentenceIndex)
        Links the names that correspond to the token spans of a single sentence. The sentenceIndex can be used to recover the sentence text and its tokens from the document text via the sentence and token spans. The full text is available for additional context.
        Parameters:
        doctext - the full text of the document
        sentences - the sentence spans of the document
        tokensBySentence - the token spans for each sentence. The outer array is indexed by sentence; the inner array holds the tokens of that sentence. Similar in nature to a Map<SentenceIndex, List<Token Spans>>.
        namesBySentence - the name spans for each sentence. The outer array is indexed by sentence; the inner array holds the name spans found among that sentence's tokens. Similar in nature to a Map<SentenceIndex, List<Name Spans For This Sentence's Tokens>>.
        sentenceIndex - the index of the sentence span to which the token spans being linked correspond
        Returns:
        a list of linked spans of type T
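How sentenceIndex selects the sentence, its tokens and its names can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the nested Span class is a hypothetical stand-in for opennlp.tools.util.Span, token offsets are assumed to be relative to their sentence, name spans are assumed to count tokens instead of characters, and the SOURCE map is a made-up external source. Plain strings replace the LinkedSpan results to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SentenceLinkSketch {

    /** Hypothetical stand-in for opennlp.tools.util.Span: a half-open [start, end) range. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Made-up external source: surface name -> identifier.
    static final Map<String, String> SOURCE = new HashMap<>();
    static {
        SOURCE.put("New York", "G1");
        SOURCE.put("Ben", "P7");
    }

    // Sample document with sentence, token and name spans.
    static final String DOC = "Anna flew to New York. Ben stayed.";
    static final Span[] SENTENCES = { new Span(0, 22), new Span(23, 34) };
    static final Span[][] TOKENS = {
        { new Span(0, 4), new Span(5, 9), new Span(10, 12),
          new Span(13, 16), new Span(17, 21), new Span(21, 22) },
        { new Span(0, 3), new Span(4, 10), new Span(10, 11) }
    };
    static final Span[][] NAMES = {
        { new Span(0, 1), new Span(3, 5) },  // "Anna", "New York" as token ranges
        { new Span(0, 1) }                   // "Ben"
    };

    /** Links the names of one sentence; names missing from SOURCE are skipped. */
    static List<String> find(String doctext, Span[] sentences, Span[][] tokensBySentence,
                             Span[][] namesBySentence, int sentenceIndex) {
        Span sentenceSpan = sentences[sentenceIndex];
        String sentence = doctext.substring(sentenceSpan.start, sentenceSpan.end);
        Span[] tokens = tokensBySentence[sentenceIndex];
        List<String> links = new ArrayList<>();
        for (Span name : namesBySentence[sentenceIndex]) {
            // Assumption: name spans count tokens, so this name covers
            // tokens[name.start] through tokens[name.end - 1].
            String text = sentence.substring(tokens[name.start].start,
                                             tokens[name.end - 1].end);
            String id = SOURCE.get(text);
            if (id != null) {
                links.add(text + " -> " + id);
            }
        }
        return links;
    }

    /** Whole-document variant, written here as a loop over the per-sentence variant. */
    static List<String> find(String doctext, Span[] sentences, Span[][] tokensBySentence,
                             Span[][] namesBySentence) {
        List<String> links = new ArrayList<>();
        for (int s = 0; s < sentences.length; s++) {
            links.addAll(find(doctext, sentences, tokensBySentence, namesBySentence, s));
        }
        return links;
    }
}
```

With the sample data, the per-sentence call for index 0 links only "New York" because "Anna" is absent from SOURCE, and the whole-document call returns the links of both sentences in order. Delegating the document-level method to the per-sentence one is a design choice of this sketch; an implementation that needs cross-sentence context may prefer to handle the document in one pass.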