Apache OpenNLP 2.1.0 released

The Apache OpenNLP team is pleased to announce the release of Apache OpenNLP 2.1.0.

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text.

It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.

Apache OpenNLP 2.1.0 binary and source distributions are available from our download page.

The OpenNLP library is also distributed through Maven Central. See the Maven Dependency page for more details.

What’s new in Apache OpenNLP 2.1.0

  • Update language codes in documentation

  • Enable optional GPU inference in ONNX Runtime configuration

  • Allow for unlimited text length in document classification with ONNX Runtime

  • Fix alphaNumOpt in tokenizer example

  • Fix training of MaxEnt models failing on large corpora with java.io.UTFDataFormatException

  • Make parameter names in the params file case-insensitive

  • Upgrade JUnit to version 5

For the full list, please see the items addressed in Jira.

--The Apache OpenNLP Team

23 November 2022