All Classes and Interfaces (Apache OpenNLP

Class

Description

AbstractObjectStream<T>

A base ObjectStream implementation.

AdaptiveFeatureGenerator

An interface for generating features for named entity identification and for updating document-level contexts.

AlgorithmType

ArgumentParser

Parser for command line arguments.

ArgumentParser.OptionalParameter

ArgumentParser.ParameterDescription

ArrayMath

Utility class for simple vector arithmetic.

ArtifactProvider

Provides access to model persisted artifacts.

ArtifactSerializer<T>

Responsible for creating an artifact from an InputStream.

BaseLink

Represents a minimal tuple of information.

BasicFormatParams

Common format parameters.

BasicTrainingParams

Common training parameters.

BeamSearchContextGenerator<T>

Interface for context generators used with a sequence beam search.

Cache<K,V>

Provides a fixed-size, pre-allocated cache with least-recently-used replacement.

CharSequenceNormalizer

A char sequence normalizer, used to adjust (prune, substitute, add, etc.) characters in order to remove noise from text.

Chunker

The interface for chunkers which provide chunk tags for a sequence of tokens.

ChunkerContextGenerator

Interface for a BeamSearchContextGenerator used in syntactic chunking.

ChunkerEvaluationMonitor

A marker interface for evaluating chunkers.

ChunkSample

Class for holding chunks for a single unit of text.

ClassPathLoaderException

ClassPathModelEntry

Encapsulates a classpath entry that is associated with a model URI and optional properties.

ClassPathModelFinder

Describes a scanner which detects OpenNLP-specific model files in an application's classpath.

Constituent

Holds constituents when reading parses.

Context

Class which associates a real valued parameter or expected value with a particular contextual predicate or feature.

CVParams

Common cross validator parameters.

DataIndexer<P>

Represents an indexer which compresses events in memory and performs feature selection.

DataReader

Describes generic ways to read data from a DataInputStream.

Detokenizer

A Detokenizer merges tokens back to their detokenized representation.

Detokenizer.DetokenizationOperation

This enum contains an operation for every token to merge the tokens together to their detokenized form.

DetokenizerParameter

DoccatEvaluationMonitor

A marker interface for evaluating doccat.

DocumentCategorizer

Interface for classes which categorize documents.

DocumentNameFinder

Interface for processing an entire document allowing a TokenNameFinder to use context from the entire document.

DocumentSample

Class which holds a classified document and its category.

EncodingParameter

Encoding parameter.

EndOfSentenceScanner

Scans CharSequence, StringBuffer, and char[] for the offsets of sentence-ending characters.

EntityLinker<T>

EntityLinkers establish connections with external data to enrich extracted entities.

EntityLinkerProperties

Properties wrapper for EntityLinker implementations.

EvaluationMonitor<T>

EvaluatorParams

Common evaluation parameters.

Event

The context of a decision point during training.

EventModelSequenceTrainer<T,P>

A specialized Trainer that is based on an 'EventModelSequence' approach.

EventTrainer<P>

A specialized Trainer that is based on an Event approach.

Experimental

Indicates that a certain API feature is not stable and might change with a new release.

ExtensionLoader

The ExtensionLoader is responsible for loading extensions to the OpenNLP library.

ExtensionNotLoadedException

Exception indicating that an OpenNLP extension could not be loaded.

ExtensionServiceKeys

FeatureGenerator

Interface for generating features for document categorization.

FeatureGeneratorResourceProvider

The FeatureGeneratorResourceProvider provides access to the resources available in the model.

FineGrainedEvaluatorParams

Common evaluation parameters.

FMeasure

The FMeasure is a utility class for evaluators which measures precision, recall and the resulting f-measure.

GapLabeler

Represents a labeler for nodes which contain traces so that these traces can be predicted by a Parser.

HeadRules

Encoder for head rules associated with parsing.

InputStreamFactory

Allows repeated reads through a stream for certain model building types.

InsufficientTrainingDataException

This exception indicates that the provided training data is insufficient to train a desired model.

Internal

Classes, fields, or methods annotated @Internal are for OpenNLP internal use only.

InvalidFormatException

This exception indicates that a resource violates the expected data format.

Language

Class for holding the document language and its confidence.

LanguageCodeValidator

Validates language codes against ISO 639 standards.

LanguageDetector

The interface for LanguageDetector which predicts the Language for a context.

LanguageDetectorContextGenerator

A context generator interface for LanguageDetector.

LanguageDetectorEvaluationMonitor

A marker interface for evaluating language detectors.

LanguageModel

A language model can calculate the probability p (between 0 and 1) of a certain sequence of tokens, given its underlying vocabulary.

LanguageParams

LanguageSample

Holds a classified document and its Language.

LemmaSample

Represents a lemmatized sentence.

Lemmatizer

The common interface for lemmatizers.

LemmatizerContextGenerator

Interface for the context generator used for probabilistic Lemmatizer.

LemmatizerEvaluationMonitor

A marker interface for evaluating lemmatizers.

LinkedSpan<T>

A default, extended Span that holds additional information about a Span.

MaxentModel

Interface for maximum entropy models.

Mean

Calculates the arithmetic mean of values added with the Mean.add(double) method.

ModelType

A model type to pattern enumeration.

ModelType

Enumeration of supported model types.

MutableTagDictionary

Interface that allows TagDictionary entries to be added and removed.

NameContextGenerator

Interface for generating the context for a name finder by specifying a set of feature generators.

NameSample

Encapsulates names for a single unit of text.

ObjectStream<T>

Reads objects from a stream.

ObjectStreamFactory<T,P>

ObjectStreamUtils

Parameters

Parse

Data structure for holding parse constituents.

Parser

Defines common methods for full-syntactic parsers.

ParserEvaluationMonitor

A marker interface for evaluating parsers.

ParserEventTypeEnum

Enumeration of event types for a Parser.

ParserType

Enumeration of supported Parser types.

PlainTextByLineStream

Reads a plain text file and returns each line as a String object.

POSContextGenerator

Interface for a BeamSearchContextGenerator used in POS tagging.

POSSample

Represents a POS-tagged sentence.

POSTagger

The interface for part-of-speech taggers.

POSTaggerEvaluationMonitor

A marker interface for evaluating POS taggers.

Prior

This interface allows one to implement a prior distribution for use in maximum entropy model training.

Probabilistic

A marker interface for classes with probabilistic capabilities.

ResetableIterator<E>

This interface makes an Iterator resettable.

ReverseListIterator<T>

An iterator for a list which returns values in the opposite order from the typical list iterator.

Sample

Represents a generic type of processable elements.

SDContextGenerator

Interface for SentenceDetector context generators.

SentenceDetector

The interface for sentence detectors, which find the sentence boundaries in a text.

SentenceDetectorEvaluationMonitor

SentenceSample

A SentenceSample contains a document with begin indexes of the individual sentences.

SentimentDetector

SentimentEvaluationMonitor

A sentiment-specific EvaluationMonitor to be used by the evaluator.

SentimentSample

Class for holding text used for sentiment analysis.

Sequence<T>

Class which models a sequence.

Sequence

Represents a weighted sequence of outcomes.

SequenceClassificationModel

A classification model that can label an input Sequence.

SequenceCodec<T>

A codec for sequences of type T.

SequenceStream<S>

Interface for streams of sequences used to train sequence models.

SequenceTrainer<P>

SequenceValidator<T>

SerializableArtifact

A marker interface so that implementing classes can refer to the corresponding ArtifactSerializer implementation.

Span

Class for storing start and end integer offsets.

Stemmer

A stemmer reduces a word to its stem.

StopCriteria<T>

Stop criteria for model training.

StringInterner

A marker interface for a String interner implementation.

StringUtil

TagDictionary

Interface to determine which tags are valid for a particular word based on a tag dictionary.

TerminateToolException

Exception to terminate the execution of a command line tool.

ThreadSafe

Classes, fields, or methods annotated @ThreadSafe are safe to use in multithreaded contexts.

TokenContextGenerator

Interface for context generators required for tokenizer implementations.

Tokenizer

The interface for tokenizers, which segment a string into its tokens.

TokenizerEvaluationMonitor

A marker interface for evaluating tokenizers.

TokenNameFinder

The interface for name finders which provide name tags for a sequence of tokens.

TokenNameFinderEvaluationMonitor

A marker interface for evaluating name finders.

TokenSample

A TokenSample is text with token spans.

TokenTag

Trainer<P>

Represents a common base for training implementations.

TrainingConfiguration

Configuration used for model training.

TrainingMeasure

Enumeration of Training measures.

TrainingProgressMonitor

An interface to capture training progress of a model.

TrainingToolParams

Common training parameters.

WhitespaceTokenizer

A basic Tokenizer implementation which performs tokenization using white spaces.

WordpieceTokenizer

A Tokenizer implementation which performs tokenization using word pieces.

WordVector

A word vector.

WordVectorTable

A table that maps tokens to word vectors.

WordVectorType
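
The Tokenizer, WhitespaceTokenizer, and Span entries above describe segmenting a string into tokens and storing start and end integer offsets. The sketch below illustrates that idea using only the Java standard library. It is not OpenNLP's implementation: the names WhitespaceSpanSketch, Offsets, tokenizePositions, and tokenize are hypothetical and chosen for this example, and the half-open [start, end) offset convention is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of whitespace tokenization with offsets; not OpenNLP code. */
final class WhitespaceSpanSketch {

    /** Assumed convention: a half-open character range [start, end) into the text. */
    static final class Offsets {
        final int start;
        final int end;

        Offsets(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** Returns the substring of the given text that this range covers. */
        String coveredText(CharSequence text) {
            return text.subSequence(start, end).toString();
        }
    }

    /** Returns the offsets of every maximal run of non-whitespace characters. */
    static List<Offsets> tokenizePositions(String text) {
        List<Offsets> spans = new ArrayList<>();
        int tokenStart = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < text.length(); i++) {
            boolean whitespace = Character.isWhitespace(text.charAt(i));
            if (!whitespace && tokenStart < 0) {
                tokenStart = i;                        // a token begins here
            } else if (whitespace && tokenStart >= 0) {
                spans.add(new Offsets(tokenStart, i)); // the token ended before i
                tokenStart = -1;
            }
        }
        if (tokenStart >= 0) {
            // The text ended while still inside a token.
            spans.add(new Offsets(tokenStart, text.length()));
        }
        return spans;
    }

    /** Returns the token strings by resolving each offset pair against the text. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (Offsets offsets : tokenizePositions(text)) {
            tokens.add(offsets.coveredText(text));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "  OpenNLP splits   text. ";
        for (Offsets offsets : tokenizePositions(text)) {
            System.out.println(offsets.start + ".." + offsets.end
                    + " " + offsets.coveredText(text));
        }
    }
}
```

Keeping offsets rather than copied strings lets a caller map each token back to its position in the original text, which is the kind of information the Span and TokenSample entries describe.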