Apache OpenNLP 1.5.2 Incubating released
The Apache OpenNLP team is pleased to announce the release of version 1.5.2-incubating of Apache OpenNLP.
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
The OpenNLP 1.5.2-incubating binary and source distributions are available for download from our download page: http://incubator.apache.org/opennlp/download.cgi
The OpenNLP library is also available from Maven Central. See the Maven Dependency page for more details: http://incubator.apache.org/opennlp/maven-dependency.html
This release contains several new features, improvements, and bug fixes. The maxent trainer can now run in multiple threads to utilize multi-core CPUs; configurable feature generation was added to the name finder; the perceptron trainer was refactored and improved; machine learners can now be configured with many more options via a parameter file; and evaluators can print detailed evaluation information.
Additionally, the release contains the following noteworthy changes:
- Improved the whitespace handling in the Sentence Detector and its training code
- Added more cross validator command line tools
- Command line handling code has been refactored
- Fixed problems with the new build
- Now uses fast token class feature generation code by default
- Added support for BioNLP/NLPBA 2004 shared task data
- Removal of old and deprecated code
- Dictionary case sensitivity is now handled correctly
- Support for OSGi
For a complete list of fixed bugs and improvements, please see the RELEASE_NOTES file included in the distribution.
--The Apache OpenNLP Team