Identifying XML External Entities (XXE) Vulnerabilities

SAI CHARAN REDDY P
Jul 29, 2020

--

Before we look at XXE, let’s go over a few XML basics that help explain the main concepts behind it.

XML stands for Extensible Markup Language. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. Version 1.0 is the one in near-universal use; XML 1.1 exists but is rarely used.

Design Goals of XML:

  1. Straightforward to use over the internet.
  2. Human-legible and reasonably clear.
  3. Easy to create.

<?xml version="1.0" encoding="UTF-8"?> <!-- Declaration -->
<Vulnerability> <!-- Root element -->
XXE <!-- Value of the element -->
</Vulnerability>

XML Parser
Say you have an XML document that your application needs to process. The component that reads the document is called the XML parser: it parses the document and hands the parsed values to your application.
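
As a minimal sketch of what a parser hands back to the application, the snippet below parses the sample Vulnerability document with xml.etree.ElementTree from the Python standard library. Python is only our choice for illustration; the variable names are ours, and any XML library would expose the same two pieces of information.

```python
import xml.etree.ElementTree as ET

# The sample document, passed as bytes so the encoding declaration is honoured.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<Vulnerability>
XXE
</Vulnerability>"""

root = ET.fromstring(DOCUMENT)  # parse the document into an element tree
name = root.tag                 # name of the root element
value = root.text.strip()       # text of the element, surrounding whitespace removed

print(name, value)
```

The application never touches the raw markup: it only sees the element name and its value, which is why whatever the parser decides to substitute into that value matters so much for XXE.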

Document Type Definition (DTD)
A DTD is optional in an XML document, but the mechanism is part of the XML standard itself. A DTD provides a grammar for a class of documents: it contains markup declarations that define the structure of a document, and it can also declare entities. [note: the XML standard itself, including version 1.1, does not prohibit DTDs, but some parsers and XML-based protocols such as SOAP disallow them]

Entities
Entities can be seen as placeholders, similar to variables. They come in three kinds:
1. Regular (internal) entities: text substitution, where instead of repeating a piece of text we add a reference to the entity that holds it.
2. Predefined entities: these are part of the XML standard, for example &lt; &gt; &amp; &quot; and &apos;.
3. External entities: placeholders whose content is loaded from a resource identified by a URI, such as a local file or a remote URL.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Vulnerability [ <!-- DTD declaration -->
<!ENTITY xxe "XXE"> <!-- Entity declaration -->
]>
<Vulnerability>
&xxe; <!-- Reference to the entity -->
</Vulnerability>
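
To see the substitution happen, here is a small sketch that feeds the same document (without the annotations) to Python's xml.etree.ElementTree. We are assuming the standard library parser here, which expands entities declared in the internal DTD subset; other parsers may be configured differently.

```python
import xml.etree.ElementTree as ET

# Same document as above: one internal entity, referenced once in the body.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Vulnerability [
<!ENTITY xxe "XXE">
]>
<Vulnerability>
&xxe;
</Vulnerability>"""

root = ET.fromstring(DOCUMENT)
expanded = root.text.strip()  # the parser has already replaced &xxe; with the entity value

print(expanded)
```

The application asked for the text of the element and received the entity's value, not the reference. With an external entity the value would come from a file or URL instead, which is the core of an XXE attack.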

XML Validation:
Here we generally check only well-formedness, that is, whether the syntax is correct. We do not check whether the document is a valid instance of the included DTD.

Tools we can use to validate an XML file are xmllint and xmlstarlet.
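
If we want the same well-formedness check inside a script, a rough equivalent can be sketched in Python. The helper name is_well_formed is hypothetical; it only checks syntax and, like the check described above, says nothing about validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for syntax problems such as unclosed or mismatched tags.
        return False
    return True


print(is_well_formed("<a><b/></a>"))   # every tag is closed
print(is_well_formed("<a><b></a>"))    # <b> is never closed
```

On the command line, xmllint --noout file.xml performs a comparable syntax-only check.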

When does XXE happen?
When untrusted input in an XML document, in particular an external entity reference, is processed and resolved by the XML parser that handles the document.

Common places to look:
1. File Upload
2. API
3. SVG
4. CSS

Impact when exploited:

  1. Information disclosure, by reading local files or through parser error messages.
  2. Denial of service.
  3. External requests, also known as server-side request forgery (SSRF).
  4. Data exfiltration using an out-of-band approach.
  5. Remote Code Execution.

Demo: [Lab from PortSwigger]

Exploiting XXE to retrieve files
Exploiting XXE to perform SSRF attack
Blind XXE using out-of-band (OAST) techniques: SSRF with general entities
Blind XXE with out-of-band interaction via XML parameter entities
Exploiting blind XXE to exfiltrate data using a malicious external DTD
Exploiting blind XXE to retrieve data via error messages

Identifying Vulnerable Code

When is code at risk?
1. XML data is processed: a user can supply data to an XML processor.
2. DTDs can be supplied and processed: any actor can supply or define entities.
3. Entities are dereferenced: the parser performs the text substitution.

How to Detect Vulnerable Code?
1. Find where the code processes XML: look for XML parsers and XML libraries.
2. Check whether the input is sanitized.
3. Check whether DTDs are allowed. If they are, check whether external entities are allowed.
4. Check whether a processing limit is set.
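
When reviewing captured requests or sample inputs, it can help to list which entities a document declares. The sketch below uses Python's xml.parsers.expat for that. The function name find_entity_declarations and the sample payload are ours, for illustration only. As written, expat has no external entity handler installed, so it reports the declaration without fetching the referenced resource.

```python
from xml.parsers import expat


def find_entity_declarations(xml_bytes):
    """Return one dict per <!ENTITY ...> declaration found in the document."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        found.append({
            "name": name,
            "parameter": bool(is_parameter),     # True for %param; entities
            "external": system_id is not None,   # SYSTEM/PUBLIC entities carry a system id
            "system_id": system_id,
        })

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)
    return found


# Hypothetical payload that tries to read a local file through an external entity.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE Vulnerability [
<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<Vulnerability>&xxe;</Vulnerability>"""

report = find_entity_declarations(SAMPLE)
print(report)
```

An entry with external set to True in input that came from a user is a strong hint that someone is probing for XXE.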

Mitigating XML External Entity Vulnerabilities

General Mitigation Strategy:
1. Disable DTD loading
2. Disable External Entities loading
3. Don’t accept untrusted XML input
4. Update XML libraries
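
The first two points can be sketched in Python as a guard that refuses any document carrying a DOCTYPE before the real parse happens. The names DoctypeForbidden and parse_untrusted are hypothetical, and this is one possible approach under the assumption that our application never needs DTDs, not the only way to do it.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees <!DOCTYPE ...>; raising here aborts the parse.
    raise DoctypeForbidden("DOCTYPE declarations are not allowed: %s" % name)


def parse_untrusted(xml_bytes):
    """Parse xml_bytes, but only if it contains no DTD (and therefore no entities)."""
    guard = expat.ParserCreate()
    guard.StartDoctypeDeclHandler = _reject_doctype
    guard.Parse(xml_bytes, True)      # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(xml_bytes)   # reached only for DTD-free, well-formed input


print(parse_untrusted(b"<Vulnerability>none</Vulnerability>").text)
```

Because custom entities can only be declared inside a DTD, rejecting the DOCTYPE removes both internal and external entity tricks in one step.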

Specialized Mitigation Strategy:
1. Load entities only from specific, trusted sources
2. Limit memory usage
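
When DTDs cannot be switched off completely, these two points could look roughly like the Python sketch below. Every limit, the allowlisted URL, and the function name enforce_xml_policy are assumptions made for illustration; real values depend on the application.

```python
from xml.parsers import expat

# Hypothetical policy values; tune them to the application.
MAX_DOCUMENT_BYTES = 1_000_000
MAX_ENTITY_DECLARATIONS = 10
ALLOWED_SYSTEM_IDS = frozenset({"https://example.com/dtd/catalog.ent"})


def enforce_xml_policy(xml_bytes):
    """Raise ValueError if the document breaks the size or entity policy."""
    if len(xml_bytes) > MAX_DOCUMENT_BYTES:
        raise ValueError("document exceeds size limit")

    declared = 0

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        nonlocal declared
        declared += 1
        if declared > MAX_ENTITY_DECLARATIONS:
            raise ValueError("too many entity declarations")
        if system_id is not None and system_id not in ALLOWED_SYSTEM_IDS:
            raise ValueError("external entity source not allowed: %s" % system_id)

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # Per the pyexpat documentation, setting DefaultHandler inhibits expansion of
    # internal entities, so this policy check does not itself expand anything.
    parser.DefaultHandler = lambda data: None
    parser.Parse(xml_bytes, True)


enforce_xml_policy(b"<Vulnerability>none</Vulnerability>")  # passes silently
```

This check only decides whether a document is acceptable. The parser that later processes the document still needs its own entity and size limits configured.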

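Mitigation in Java: Java XML parsers are insecure by default. Disable external DTD, schema, and stylesheet loading by setting the javax.xml.XMLConstants properties ACCESS_EXTERNAL_DTD, ACCESS_EXTERNAL_SCHEMA, and ACCESS_EXTERNAL_STYLESHEET to the empty string "", and set an entity expansion limit (entityExpansionLimit).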

Mitigation in .NET: XmlDocument, XmlTextReader, and XPathNavigator use insecure defaults in .NET Framework versions before 4.5.2. In that case, disable external resolution by setting the XmlResolver property (for example XmlDocument.XmlResolver) to null. XmlReaderSettings.MaxCharactersInDocument defaults to 0, which means no limit; set it to a number that fits your requirements.

Mitigation in PHP: Calling libxml_disable_entity_loader(true); disables the loading of external entities. From libxml2 2.9 onward, external entity loading is disabled by default, and the function is deprecated as of PHP 8.0. To also disable network access, pass the LIBXML_NONET option.
