Apache OpenNLP 2.3.2 released

The Apache OpenNLP team is pleased to announce the release of Apache OpenNLP 2.3.2.

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.

It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.

Apache OpenNLP 2.3.2 binary and source distributions are available from our download page.

The OpenNLP library is also distributed via Maven Central. See the Maven Dependency page for more details.

What’s new in Apache OpenNLP 2.3.2

In this release, we fixed several bugs and upgraded some dependencies. We also added abbreviation dictionaries for several languages. Moreover, we addressed a memory issue (OPENNLP-421) that occurs with large dictionaries due to String interning. Several new configuration options have been added so that you can choose an interning strategy. Details can be found in the related Jira issue and pull request.
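To illustrate why the interning strategy matters for large dictionaries, here is a minimal Java sketch. It is not OpenNLP's implementation: the names InternSketch and MapInterner are hypothetical and exist only for this example. It contrasts the JVM-wide pool used by String.intern() with a heap-based interner whose entries can be garbage-collected once the interner itself is discarded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is not an OpenNLP class.
final class InternSketch {

    // Heap-based interner: canonical instances live in an ordinary map,
    // so they become collectable once the interner is no longer referenced.
    static final class MapInterner {
        private final Map<String, String> pool = new ConcurrentHashMap<>();

        // Returns the canonical instance for s, registering s if it is new.
        String intern(String s) {
            String existing = pool.putIfAbsent(s, s);
            return existing != null ? existing : s;
        }

        int size() {
            return pool.size();
        }
    }

    public static void main(String[] args) {
        MapInterner interner = new MapInterner();

        // Two distinct String objects with equal content ...
        String a = interner.intern(new String("Dr."));
        String b = interner.intern(new String("Dr."));

        // ... collapse to one canonical instance held by the map.
        System.out.println(a == b);
        System.out.println(interner.size());

        // String.intern() deduplicates as well, but uses the JVM-wide
        // string pool, which is shared by everything running in the JVM.
        String c = new String("Prof.").intern();
        String d = new String("Prof.").intern();
        System.out.println(c == d);
    }
}
```

Both approaches deduplicate equal strings. The difference is where the canonical copies live and who controls their lifetime, which is the kind of trade-off a configurable strategy lets you make.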

We switched the default ONNX Runtime dependency in opennlp-dl to the CPU variant. If you need the GPU-accelerated version of ONNX Runtime, you can use the newly added module opennlp-dl-gpu. Moreover, we fixed the CLI on the Windows platform.

For a full list of improvements, please see the list of items addressed in Jira.

--The Apache OpenNLP Team

04 February 2024