This site is the archived OWASP Foundation Wiki and is no longer accepting Account Requests.
To view the new OWASP Foundation website, please visit https://owasp.org

Difference between revisions of "XPATH Injection"


Latest revision as of 16:58, 15 April 2015

This is an Attack. To view all attacks, please see the Attack Category page.


Last revision (mm/dd/yy): 04/15/2015


Description

Similar to SQL Injection, XPath Injection attacks occur when a web site uses user-supplied information to construct an XPath query for XML data. By sending intentionally malformed information to the web site, an attacker can find out how the XML data is structured or access data that they would not normally be able to reach. They may even be able to elevate their privileges on the web site if the XML data is used for authentication (such as an XML-based user file).

XML is queried with XPath, a simple descriptive expression language that locates pieces of information in an XML document. As in SQL, you can specify attributes to find and patterns to match. A web site backed by XML commonly accepts some form of input on the query string to identify the content to locate and display on the page. This input must be sanitized so that it cannot alter the structure of the XPath query and return the wrong data.

XPath is a standardized language whose notation and syntax are the same across implementations, which means an attack can be automated. Unlike SQL, where each database product has its own dialect, XPath has no dialects to account for.

Because XPath has no built-in access control, a successful injection can reach the entire document. The attacker does not run into the permission limits that often constrain SQL injection attacks.

Example Vulnerability

We'll use this XML snippet for the examples.

<?xml version="1.0" encoding="utf-8"?>
<Employees>
   <Employee ID="1">
      <FirstName>Arnold</FirstName>
      <LastName>Baker</LastName>
      <UserName>ABaker</UserName>
      <Password>SoSecret</Password>
      <Type>Admin</Type>
   </Employee>
   <Employee ID="2">
      <FirstName>Peter</FirstName>
      <LastName>Pan</LastName>
      <UserName>PPan</UserName>
      <Password>NotTelling</Password>
      <Type>User</Type>
   </Employee>
</Employees>

Suppose we have a user authentication system on a web page that uses a data file of this sort to log in users. Once a username and password have been supplied, the software might use XPath to look up the user:

VB:
Dim FindUserXPath As String
' XPath operators are case-sensitive, so "and" must be lowercase.
FindUserXPath = "//Employee[UserName/text()='" & Request("Username") & "' and " & _
        "Password/text()='" & Request("Password") & "']"

C#:
string FindUserXPath;
// XPath operators are case-sensitive, so "and" must be lowercase.
FindUserXPath = "//Employee[UserName/text()='" + Request["Username"] + "' and " +
        "Password/text()='" + Request["Password"] + "']";

With a normal username and password this XPath would work, but an attacker may send a crafted username and password and get an XML node selected without knowing any valid credentials, like this:

Username: blah' or 1=1 or 'a'='a
Password: blah

FindUserXPath becomes //Employee[UserName/text()='blah' or 1=1 or 
        'a'='a' and Password/text()='blah']

Because "and" binds more tightly than "or" in XPath, this is logically equivalent to:
        //Employee[(UserName/text()='blah' or 1=1) or 
        ('a'='a' and Password/text()='blah')]

In this case, only the first part of the predicate needs to be true. The password test becomes irrelevant, and because "1=1" is always true the predicate matches ALL employees.
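The effect can be reproduced with a short program. The following is a minimal sketch in Java, assuming the standard javax.xml.xpath API and a trimmed copy of the Employees document; the class and method names are hypothetical and the concatenation is vulnerable on purpose.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class InjectionDemo {
    // Trimmed copy of the Employees document from the example.
    static final String XML =
        "<Employees>"
      + "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password></Employee>"
      + "<Employee ID=\"2\"><UserName>PPan</UserName><Password>NotTelling</Password></Employee>"
      + "</Employees>";

    // VULNERABLE on purpose: user input is concatenated straight into the query.
    static String buildQuery(String username, String password) {
        return "//Employee[UserName/text()='" + username
             + "' and Password/text()='" + password + "']";
    }

    // Returns how many Employee nodes the concatenated query selects.
    static int countMatches(String username, String password) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                buildQuery(username, password),
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("query could not be evaluated", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches("ABaker", "SoSecret"));               // valid login: 1
        System.out.println(countMatches("ABaker", "wrong"));                  // bad password: 0
        System.out.println(countMatches("blah' or 1=1 or 'a'='a", "blah"));   // injected: 2
    }
}
```

The injected username selects both Employee nodes even though neither the username nor the password is valid.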

XPath Injection Defenses

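As with the techniques for avoiding SQL injection, you need to use a parameterized XPath interface if one is available, or escape the user input to make it safe to include in a dynamically constructed query. If you are using quotes to terminate untrusted input in a dynamically constructed XPath query, then you need to escape that quote in the untrusted input so that the untrusted data cannot break out of the quoted context. In the following example, single quotes (') are used to terminate the Username and Password parameters, so any ' characters in this input are replaced with the XML encoded version of that character, which is "&apos;". Note that XPath 1.0 itself defines no escape sequence inside string literals: the entity "&apos;" is decoded only when the query is parsed as part of an XML document (for example inside an XSLT stylesheet). In a query built as a plain program string, the replacement still stops the input from closing the quote, but a stored value would have to contain the literal text "&apos;" to match.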

VB:
Dim FindUserXPath As String
' Replace each single quote so the input cannot close the quoted literal.
FindUserXPath = "//Employee[UserName/text()='" & Request("Username").Replace("'", "&apos;") & "' and " & _
        "Password/text()='" & Request("Password").Replace("'", "&apos;") & "']"

C#:
string FindUserXPath;
// Replace each single quote so the input cannot close the quoted literal.
FindUserXPath = "//Employee[UserName/text()='" + Request["Username"].Replace("'", "&apos;") + "' and " +
        "Password/text()='" + Request["Password"].Replace("'", "&apos;") + "']";
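When a query really has to be built as a string, a commonly used alternative to entity replacement is to turn each input into a valid XPath 1.0 string literal, using concat() when the value contains both quote characters. This is a different technique from the replacement shown above, not the original author's method. The sketch below is in Java and assumes the standard javax.xml.xpath API; the helper names and the OMalley sample entry are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathLiteralDemo {
    // Trimmed sample document; the OMalley entry is hypothetical and exists
    // only to show a password that contains both quote characters.
    static final String XML =
        "<Employees>"
      + "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password></Employee>"
      + "<Employee ID=\"3\"><UserName>OMalley</UserName><Password>it's \"ok\"</Password></Employee>"
      + "</Employees>";

    // Turns arbitrary text into an XPath 1.0 string literal (or a concat() call).
    static String toXPathLiteral(String value) {
        if (!value.contains("'")) {
            return "'" + value + "'";
        }
        if (!value.contains("\"")) {
            return "\"" + value + "\"";
        }
        // Both quote types present: split on ' and rejoin with a quoted apostrophe.
        String[] parts = value.split("'", -1);
        StringBuilder sb = new StringBuilder("concat(");
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");
            }
            sb.append('\'').append(parts[i]).append('\'');
        }
        return sb.append(')').toString();
    }

    // Counts Employee nodes matching the credentials, with both values quoted safely.
    static int countMatches(String username, String password) {
        String query = "//Employee[UserName/text()=" + toXPathLiteral(username)
                     + " and Password/text()=" + toXPathLiteral(password) + "]";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("query could not be evaluated", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXPathLiteral("it's \"ok\""));
        System.out.println(countMatches("blah' or 1=1 or 'a'='a", "blah"));
        System.out.println(countMatches("OMalley", "it's \"ok\""));
    }
}
```

With this approach the injected username is compared as ordinary text and matches nothing, while a legitimate value that contains quotes still matches.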

A better mitigation option is to use a precompiled XPath query [1] (http://www.tkachenko.com/blog/archives/000385.html). A precompiled XPath query is fixed before the program handles any input, rather than built on the fly after the user's input has been added to the string. This is the better route because you don't have to worry about missing a character that should have been escaped.
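As an illustration of the idea, the sketch below keeps the query text constant and binds the user input through XPath variables. It is written in Java and assumes the standard javax.xml.xpath API, whose XPathVariableResolver interface supplies values for $user and $pass. The class name and variable names are hypothetical, and a real application would likely cache the compiled expression.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ParameterizedXPathDemo {
    // Trimmed copy of the Employees document from the example.
    static final String XML =
        "<Employees>"
      + "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password></Employee>"
      + "<Employee ID=\"2\"><UserName>PPan</UserName><Password>NotTelling</Password></Employee>"
      + "</Employees>";

    // The query text never changes; user input only ever arrives as variable values.
    static final String QUERY = "//Employee[UserName/text()=$user and Password/text()=$pass]";

    static int countMatches(String username, String password) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("user", username);
        vars.put("pass", password);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // The resolver must be registered before the expression is compiled.
        xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));
        try {
            XPathExpression expr = xpath.compile(QUERY);
            NodeList nodes = (NodeList) expr.evaluate(
                new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("query could not be evaluated", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches("ABaker", "SoSecret"));
        System.out.println(countMatches("blah' or 1=1 or 'a'='a", "blah"));
    }
}
```

Because the input is never parsed as part of the expression, the injected username is treated as a plain string and selects no nodes.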

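Related Threat Agents

* Command Injection
* SQL Injection
* LDAP injection
* Server-Side Includes (SSI) Injection

Related Attacks

* Blind SQL Injection
* Blind XPath Injection

Related Vulnerabilities

* Category: Input Validation Vulnerability

Related Controls

* Category: Input Validation
* Input Validation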


Use of parameterized XPath queries - Parameterization restricts each input to a specific domain, such as strings or integers. Any input outside that domain is considered invalid, and the query fails.
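A minimal sketch of restricting a parameter to a domain follows. It is written in Java and assumes the standard javax.xml.xpath API; the lookup by employee ID, the class name, and the $id variable are hypothetical. The input must parse as an integer before it is bound to the query as a number.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class TypedParameterDemo {
    // Trimmed copy of the Employees document from the example.
    static final String XML =
        "<Employees>"
      + "<Employee ID=\"1\"><UserName>ABaker</UserName></Employee>"
      + "<Employee ID=\"2\"><UserName>PPan</UserName></Employee>"
      + "</Employees>";

    static final String QUERY = "//Employee[@ID=$id]/UserName";

    // Returns the user name for the given ID, or "" when no employee has that ID.
    // Input that is not an integer is rejected before any query runs.
    static String lookupUserName(String rawId) {
        final Double boundId;
        try {
            boundId = Double.valueOf((double) Integer.parseInt(rawId.trim()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("employee ID must be an integer");
        }

        XPath xpath = XPathFactory.newInstance().newXPath();
        // $id is the only variable in QUERY; it is always bound as a number.
        xpath.setXPathVariableResolver(
            name -> "id".equals(name.getLocalPart()) ? boundId : null);
        try {
            XPathExpression expr = xpath.compile(QUERY);
            return expr.evaluate(new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("query could not be evaluated", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupUserName("1"));
        try {
            lookupUserName("1 or 1=1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The value "1 or 1=1" falls outside the integer domain, so it is rejected and never reaches the XPath engine.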

Use of custom error pages - Attackers can glean information about the nature of queries from descriptive error messages. Input validation must be coupled with customized error pages that report that an error occurred without disclosing information about the underlying data or application.
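One way to keep query details out of error pages is to catch the evaluation error at the boundary, log it on the server, and return a fixed message. The sketch below is in Java and assumes the standard javax.xml.xpath and java.util.logging APIs; the class name, method name, and message text are hypothetical.

```java
import java.io.StringReader;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class GenericErrorDemo {
    private static final Logger LOG = Logger.getLogger(GenericErrorDemo.class.getName());

    // The only text a visitor ever sees when a query fails.
    static final String GENERIC_ERROR = "The request could not be processed.";

    // Trimmed copy of the Employees document from the example.
    static final String XML =
        "<Employees>"
      + "<Employee ID=\"1\"><UserName>ABaker</UserName></Employee>"
      + "<Employee ID=\"2\"><UserName>PPan</UserName></Employee>"
      + "</Employees>";

    // Evaluates an expression and returns its string value, or the generic
    // message if evaluation fails. Details go to the server log only.
    static String evaluateForPage(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            LOG.log(Level.WARNING, "XPath evaluation failed", e);
            return GENERIC_ERROR;
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluateForPage("count(//Employee)"));
        System.out.println(evaluateForPage("//Employee["));
    }
}
```

The malformed expression produces the same fixed message regardless of what went wrong, so the response reveals nothing about element names or query structure.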