The Apache OpenNLP team is pleased to announce the release of Apache OpenNLP 2.3.2.
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.
It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
Apache OpenNLP 2.3.2 binary and source distributions are available for download from our download page.
The OpenNLP library is also distributed via Maven Central. See the Maven Dependency page for more details.
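As a minimal sketch, a Maven dependency on this release might look like the following. The `org.apache.opennlp` group ID and the `opennlp-tools` artifact ID are assumptions not stated in this announcement; the Maven Dependency page has the authoritative entry.

```xml
<!-- Assumed coordinates for the core library; verify against the Maven Dependency page -->
<dependency>
  <groupId>org.apache.opennlp</groupId>
  <artifactId>opennlp-tools</artifactId>
  <version>2.3.2</version>
</dependency>
```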
This release fixes several bugs and upgrades some dependencies. It also adds abbreviation dictionaries for several languages and addresses a memory issue (OPENNLP-421) that occurs with large dictionaries due to String interning. Several new configuration options have been added to choose an interning strategy. Details can be found in the related Jira issue and PR.
We switched the default ONNX Runtime dependency in opennlp-dl to the CPU variant. If you need the GPU-accelerated version of ONNX Runtime, you can use the newly added module opennlp-dl-gpu. Moreover, we fixed the CLI on the Windows platform.
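For projects that need GPU acceleration, the new module could be declared roughly as shown here. This is a sketch only: the `opennlp-dl-gpu` artifact ID comes from this announcement, while the `org.apache.opennlp` group ID and the reuse of the 2.3.2 version are assumptions to verify on Maven Central.

```xml
<!-- Assumed group ID and version; the artifact ID is the module named in this release -->
<dependency>
  <groupId>org.apache.opennlp</groupId>
  <artifactId>opennlp-dl-gpu</artifactId>
  <version>2.3.2</version>
</dependency>
```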
For a full list of improvements, please see the list of items addressed in Jira.
--The Apache OpenNLP Team
04 February 2024