Package opennlp.tools.entitylinker
Interface EntityLinker<T extends Span>
- Type Parameters:
T - A type that extends Span. LinkedSpan and BaseLink are available to provide this signature. Use EntityLinker<LinkedSpan<BaseLink>> as a default.
public interface EntityLinker<T extends Span>
EntityLinkers establish connections with external data to enrich extracted
entities.
For instance, for Location entities a linker can be developed to look up each found location in a GeoNames gazetteer. Another example is finding people's names and looking them up in a database or Active Directory. A linker is intended to return the n best matches for any given search, but it can also be implemented deterministically.
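The "n best matches" idea can be sketched without the OpenNLP jar. The example below is only an illustration under stated assumptions: ToyGazetteer, Match, and the two-level scoring (1.0 for an exact match, 0.5 for a partial match) are hypothetical and not part of OpenNLP. A real linker would query a gazetteer, database, or directory service instead of an in-memory array.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for an external source such as a gazetteer. */
final class ToyGazetteer {

    /** One candidate link: an entry id, its display name, and a match score. */
    static final class Match {
        final String id;
        final String name;
        final double score;

        Match(String id, String name, double score) {
            this.id = id;
            this.name = name;
            this.score = score;
        }
    }

    // Each row is {id, name}; a real linker would query a database or index instead.
    private final String[][] entries;

    ToyGazetteer(String[][] entries) {
        this.entries = entries;
    }

    /** Returns at most n candidates for the given name, best score first. */
    List<Match> nBest(String name, int n) {
        String query = name.toLowerCase(Locale.ROOT);
        List<Match> matches = new ArrayList<>();
        for (String[] entry : entries) {
            String candidate = entry[1].toLowerCase(Locale.ROOT);
            if (candidate.equals(query)) {
                matches.add(new Match(entry[0], entry[1], 1.0)); // exact match
            } else if (candidate.contains(query)) {
                matches.add(new Match(entry[0], entry[1], 0.5)); // partial match
            }
        }
        // Stable sort: candidates with equal scores keep their original order.
        matches.sort(Comparator.comparingDouble((Match m) -> m.score).reversed());
        return matches.subList(0, Math.min(n, matches.size()));
    }
}
```

Calling nBest with n = 1 gives the deterministic variant mentioned above: each name resolves to at most one entry.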
-
Method Summary
Modifier and Type, Method, Description:
List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence)
    Links an entire document of named entities to an external source.
List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence, int sentenceIndex)
    Links the names that correspond to the tokens[] spans.
void init(EntityLinkerProperties initializationData)
    Initializes an EntityLinker and allows for passing properties through the EntityLinkerFactory into all impls dynamically.
-
Method Details
-
init
void init(EntityLinkerProperties initializationData) throws IOException
Initializes an EntityLinker and allows for passing properties through the EntityLinkerFactory into all impls dynamically. EntityLinker impls should initialize reusable objects used by the impl in this method. If this is done, any errors will be captured and thrown by the EntityLinkerFactory.
- Parameters:
initializationData - The EntityLinkerProperties that contains properties needed by the impl, as well as any other objects required.
- Throws:
IOException - Thrown if IO errors occurred.
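The advice to set up reusable objects in init, so that failures surface early, might look roughly like the sketch below. It is an assumption-laden illustration and not OpenNLP code: java.util.Properties stands in for EntityLinkerProperties so the example runs without the library, and InitSketch and the key "linker.maxMatches" are made up for this purpose.

```java
import java.io.IOException;
import java.util.Properties;

/** Hypothetical linker skeleton; java.util.Properties stands in for EntityLinkerProperties. */
final class InitSketch {

    // Reusable state, prepared once in init and read by later lookups.
    private int maxMatches = -1;

    /** Reads required settings once; any problem surfaces as an IOException. */
    void init(Properties initializationData) throws IOException {
        // "linker.maxMatches" is a made-up key for this sketch.
        String raw = initializationData.getProperty("linker.maxMatches");
        if (raw == null) {
            throw new IOException("Missing property: linker.maxMatches");
        }
        try {
            maxMatches = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IOException("Not a number: linker.maxMatches=" + raw, e);
        }
    }

    /** Returns the value read in init, or -1 if init has not run successfully. */
    int maxMatches() {
        return maxMatches;
    }
}
```

Failing inside init rather than on the first lookup means a misconfigured linker is reported when it is created, not partway through processing a document.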
-
find
List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence)
Links an entire document of named entities to an external source.
- Parameters:
doctext - The full text of the document.
sentences - An array of sentence spans.
tokensBySentence - An array of token spans that correspond to each sentence. The outer array refers to the sentence, the inner array holds the tokens for that sentence. Similar in nature to a Map<SentenceIndex, List<Token Spans>>.
namesBySentence - An array of name spans that correspond to each sentence. The outer array refers to the sentence, the inner array holds the name spans for the tokens of that sentence. Similar in nature to a Map<SentenceIndex, List<Name Spans For This Sentence's Tokens>>.
- Returns:
A list of linked spans of type T.
-
find
List<T> find(String doctext, Span[] sentences, Span[][] tokensBySentence, Span[][] namesBySentence, int sentenceIndex)
Links the names that correspond to the tokens[] spans. The sentenceIndex can be used to get the sentence text and tokens from the text based on the sentence and token spans. The text is available for additional context.
- Parameters:
doctext - The full text of the document.
sentences - An array of sentence spans.
tokensBySentence - An array of token spans that correspond to each sentence. The outer array refers to the sentence, the inner array holds the tokens for that sentence. Similar in nature to a Map<SentenceIndex, List<Token Spans>>.
namesBySentence - An array of name spans that correspond to each sentence. The outer array refers to the sentence, the inner array holds the name spans for the tokens of that sentence. Similar in nature to a Map<SentenceIndex, List<Name Spans For This Sentence's Tokens>>.
sentenceIndex - The index of the sentence span that the tokensBySentence entry corresponds to.
- Returns:
A list of linked spans of type T.
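How sentenceIndex ties the parallel arrays together can be sketched as follows. The example is self-contained and makes several assumptions that the text above does not state: Range is a hypothetical stand-in for Span with a half-open [start, end) range, sentence spans are character offsets into doctext, token spans are character offsets into the sentence text, and name spans are token indices within the sentence. If your pipeline produces spans with different conventions, the lookups need to change accordingly.

```java
import java.util.ArrayList;
import java.util.List;

final class SpanLookupSketch {

    /** Minimal stand-in for a span: a half-open [start, end) range. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Recovers the covered text of every name found in one sentence. */
    static List<String> nameTexts(String doctext, Range[] sentences,
            Range[][] tokensBySentence, Range[][] namesBySentence, int sentenceIndex) {
        // Assumption: sentence spans are character offsets into the document.
        Range sentence = sentences[sentenceIndex];
        String sentenceText = doctext.substring(sentence.start, sentence.end);
        Range[] tokens = tokensBySentence[sentenceIndex];

        List<String> result = new ArrayList<>();
        for (Range name : namesBySentence[sentenceIndex]) {
            // Assumption: name spans are token indices; token spans are
            // character offsets relative to the sentence text.
            int from = tokens[name.start].start;
            int to = tokens[name.end - 1].end;
            result.add(sentenceText.substring(from, to));
        }
        return result;
    }
}
```

A linker implementation could pass each recovered string to its external lookup and keep doctext available as additional context for disambiguation.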
-