🔴 Critical 9.1

Classic XXE (Direct Entity Expansion)

Understanding direct XML external entity attacks where entity content is reflected in the application's response, allowing immediate data exfiltration.

CWE-611 | A05:2021 – Security Misconfiguration

Overview

Classic XXE (also called in-band XXE) occurs when an XML parser processes external entities and includes their content in the application's response. This is the most straightforward form of XXE attack, where the attacker can immediately see the results of the entity expansion.

When exploited successfully, Classic XXE allows attackers to:

Read arbitrary files from the server's filesystem
Perform Server-Side Request Forgery (SSRF) attacks
Cause Denial of Service (DoS)
In some cases, achieve Remote Code Execution

This vulnerability has a CVSS score of 9.1 (Critical) due to its ease of exploitation and severe impact.

How Classic XXE Works

The attack follows these steps:

Attacker crafts malicious XML: Creates an XML document with a DOCTYPE declaration containing an external entity definition
External entity references a resource: The entity's system identifier points to a local file (file:///) or a network resource (http://)
Application parses XML: The vulnerable application processes the XML with an insecurely configured parser
Parser retrieves content: The XML parser automatically retrieves the referenced resource
Content reflected in response: The expanded entity content appears in the application's response
Attacker reads data: The attacker extracts sensitive information from the response
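
Steps 1 and 2 can be made concrete by asking a parser to report the entity declarations in a document without resolving them. The following is a minimal sketch using Python's standard-library xml.parsers.expat module. The helper name list_entity_declarations is hypothetical, and nothing is fetched because no external-entity handler is installed.

```python
import xml.parsers.expat

PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>"""


def list_entity_declarations(xml_text):
    """Return (name, system_id) pairs for entities declared in the DOCTYPE.

    Hypothetical helper for illustration only: it records declarations and
    never resolves them, so no file or URL is read.
    """
    declared = []
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry no inline value, only a system identifier;
        # internal entities have system_id set to None.
        declared.append((name, system_id))

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return declared


if __name__ == "__main__":
    for name, system_id in list_entity_declarations(PAYLOAD):
        print(f"entity &{name}; -> {system_id}")
```

A vulnerable parser goes one step further than this sketch: it opens the resource named by the system identifier and substitutes its content wherever the entity is referenced.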

Vulnerable Code Examples

These code examples show how XXE vulnerabilities occur when developers parse XML without proper security configuration:

Vulnerable Java Code

Java: VulnerableXmlParser.java (⚠️ Vulnerable)
// VULNERABLE CODE - DO NOT USE IN PRODUCTION
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class VulnerableXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        // Default configuration - VULNERABLE!
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        
        // This will process external entities by default
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );
        
        return doc;
    }
}

Vulnerable Python Code

Python: vulnerable_parser.py (⚠️ Vulnerable)
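# VULNERABLE CODE - DO NOT USE IN PRODUCTION
import xml.etree.ElementTree as ET
from lxml import etree

def parse_xml(xml_string):
    # Standard-library ElementTree (this is not lxml). It does not resolve
    # external entities and raises ParseError when one is referenced, but it
    # is still exposed to entity-expansion DoS with older Expat versions,
    # so avoid it for untrusted XML.
    root = ET.fromstring(xml_string)
    return root

def parse_xml_lxml(xml_string):
    # VULNERABLE - resolves external entities. resolve_entities=True is set
    # explicitly here; it was the default in lxml releases before 5.0.
    # Pass bytes: lxml rejects str input that carries an encoding declaration.
    parser = etree.XMLParser(resolve_entities=True)
    root = etree.fromstring(xml_string, parser)
    return root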

Attack Payload: File Disclosure

These payloads read a well-known file from the server: /etc/passwd on Linux and win.ini on Windows.

Linux File Disclosure Payload

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

Windows File Disclosure Payload

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<root>
  <data>&xxe;</data>
</root>

SSRF Attack Payload

This payload makes the server send a request to an internal network resource:

SSRF Payload Example

XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<root>
  <data>&xxe;</data>
</root>

Expected Response

When the attack succeeds, the application's response will contain the file contents or internal server response:

Vulnerable Response Example

XML
<root>
  <data>
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
...
  </data>
</root>

Secure Configuration

To prevent XXE attacks, you must explicitly disable external entity processing and DTD processing:

Secure Java Configuration

Java: SecureXmlParser.java (✓ Secure)
// SECURE CODE
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class SecureXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        
        // Disable DOCTYPE declarations entirely (most secure)
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl",
            true
        );
        
        // If you can't disable DOCTYPE, disable external entities
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities",
            false
        );
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities",
            false
        );
        
        // Disable external DTDs
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd",
            false
        );
        
        // Disable XInclude
        factory.setXIncludeAware(false);
        
        // Disable expanding entity references
        factory.setExpandEntityReferences(false);
        
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );
        
        return doc;
    }
}

Secure Python Configuration

Python: secure_parser.py (✓ Secure)
# SECURE CODE
from lxml import etree
from defusedxml import ElementTree as DefusedET

# Option 1: Use defusedxml (recommended)
def parse_xml_safe(xml_string):
    # defusedxml automatically disables dangerous features
    root = DefusedET.fromstring(xml_string)
    return root

# Option 2: Configure lxml securely
def parse_xml_lxml_safe(xml_string):
    # Create parser with entity resolution, network access, and DTD loading
    # disabled. Pass bytes: lxml rejects str input that carries an encoding
    # declaration.
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False
    )
    root = etree.fromstring(xml_string, parser)
    return root

Impact and Risk

Classic XXE vulnerabilities have severe security implications:

Confidentiality Impact: HIGH

Access to sensitive files (/etc/passwd, application config files, and /etc/shadow if the parsing process runs with root privileges)
Database credentials and API keys
Source code and intellectual property
User data and PII

Integrity Impact: LOW-MEDIUM

Usually limited to data exfiltration
Can lead to SSRF attacks that modify internal state

Availability Impact: MEDIUM-HIGH

Billion Laughs attack can cause DoS
Resource exhaustion from large file reads
System crashes from parsing malformed data
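
The Billion Laughs risk comes from exponential growth in nested internal entities. As a rough sketch of the arithmetic, assume the commonly cited form of the attack: a 3-byte base entity ("lol") and nine further entities that each reference the previous one ten times. The function name and parameter names below are illustrative assumptions.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Bytes produced when the outermost entity is fully expanded.

    base_len: length of the innermost entity's text
    fanout:   how many times each entity references the one below it
    levels:   number of nested entities above the base entity
    """
    return base_len * fanout ** levels


if __name__ == "__main__":
    size = expanded_size(base_len=3, fanout=10, levels=9)
    print(f"{size:,} bytes")  # prints "3,000,000,000 bytes"
```

A document of under a kilobyte can therefore demand roughly 3 GB of memory from a parser that expands entities without limits.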

Real-World Examples:

Reading cloud metadata endpoints (AWS, Azure, GCP)
Accessing internal admin panels
Exfiltrating database backups
Reading application source code

Detection and Testing

To test for Classic XXE vulnerabilities:

Identify XML input points: API endpoints, file uploads, SOAP services
Test with basic payload: Submit XML with external entity referencing a known file
Observe response: Check if file contents appear in the response
Test SSRF: Reference a server you control and check its logs to confirm that the application makes outbound requests
Automated scanning: Use Burp Suite, OWASP ZAP, or custom scripts

Key indicators of vulnerability:

Application accepts XML input
Response contains expanded entity content
Application makes outbound requests to attacker-controlled servers
Error messages reveal file paths or entity processing
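
The "observe response" step can be partly automated by scanning the response body for content that looks like the files targeted by the payloads above. This is a minimal sketch, assuming the body is available as a string. The function name, the regular expressions, and the win.ini section names are illustrative assumptions, not an exhaustive signature set.

```python
import re

# One /etc/passwd record: name:password:uid:gid:gecos:home:shell
PASSWD_LINE = re.compile(
    r"^[ \t]*[A-Za-z_][\w.-]*:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\n]*$",
    re.MULTILINE,
)

# Section headers typically found in a stock win.ini (assumed list).
WIN_INI_SECTION = re.compile(
    r"^[ \t]*\[(fonts|extensions|mci extensions|files)\][ \t\r]*$",
    re.IGNORECASE | re.MULTILINE,
)


def response_shows_file_disclosure(body: str) -> bool:
    """Return True if the body contains passwd-like or win.ini-like lines."""
    return bool(PASSWD_LINE.search(body) or WIN_INI_SECTION.search(body))


if __name__ == "__main__":
    sample = "<root>\n  <data>\nroot:x:0:0:root:/root:/bin/bash\n  </data>\n</root>"
    print(response_shows_file_disclosure(sample))
```

A match is only a hint. Confirm any finding manually, since an application may legitimately return text in these formats.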

Remediation Summary

Priority 1 (Critical): Disable DOCTYPE declarations entirely
Priority 2 (High): Disable external entity processing
Priority 3 (High): Disable DTD processing
Priority 4 (Medium): Implement input validation and allowlisting
Priority 5 (Medium): Use secure XML parsing libraries (e.g., defusedxml)
Priority 6 (Low): Implement WAF rules to detect XXE patterns
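
Priority 1 can be approximated in Python without third-party packages by rejecting any document that declares a DOCTYPE before it reaches the real parser. The following is a minimal sketch built on the standard-library xml.parsers.expat module. The names DoctypeForbidden and assert_no_doctype are hypothetical, and a maintained library such as defusedxml remains the safer choice where it is available.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def assert_no_doctype(xml_text):
    """Raise DoctypeForbidden if xml_text declares a DOCTYPE.

    Malformed XML still raises xml.parsers.expat.ExpatError as usual.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as the declaration starts, before any entity
        # inside it is declared or referenced.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


if __name__ == "__main__":
    assert_no_doctype("<root><data>ok</data></root>")  # passes silently
    try:
        assert_no_doctype(
            '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            "<root>&xxe;</root>"
        )
    except DoctypeForbidden as exc:
        print(f"rejected: {exc}")
```

Rejecting the DOCTYPE outright removes external entities, parameter entities, and entity-expansion attacks in one step, at the cost of refusing documents that legitimately rely on a DTD.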

Refer to language-specific prevention guides for detailed implementation instructions.