The Apache OpenNLP team is pleased to announce the release of version 1.5.2-incubating of Apache OpenNLP.
The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
The OpenNLP 1.5.2-incubating binary and source distributions are available for download from our download page: https://incubator.apache.org/opennlp/download.cgi
The OpenNLP library is also available from Maven Central. See the Maven Dependency page for more details: https://incubator.apache.org/opennlp/maven-dependency.html
This release contains several new features, improvements, and bug fixes. The maxent trainer can now run in multiple threads to take advantage of multi-core CPUs; configurable feature generation was added to the name finder; the perceptron trainer was refactored and improved; machine learners can now be configured with many more options via a parameter file; and evaluators can print detailed evaluation information.
Additionally, the release contains the following noteworthy changes:
Improved the white space handling in the Sentence Detector and its training code
Added more cross validator command line tools
Command line handling code has been refactored
Fixed problems with the new build
Now uses fast token class feature generation code by default
Added support for BioNLP/NLPBA 2004 shared task data
Removal of old and deprecated code
Dictionary case sensitivity is now handled correctly
Support for OSGi
For a complete list of fixed bugs and improvements, please see the RELEASE_NOTES file included in the distribution.
--The Apache OpenNLP Team
28 November 2011