Difference between revisions of "OWASP JSON Sanitizer"
+ | = Main = | ||
+ | <div style="width:100%;height:160px;border:0,margin:0;overflow: hidden;">[[File:OWASP_Project_Header.jpg|link=]]</div> | ||
+ | {| style="padding: 0;margin:0;margin-top:10px;text-align:left;" | ||
+ | |- | ||
+ | | valign="top" style="border-right: 1px dotted gray;padding-right:25px;" | | ||
+ | == OWASP JSON Sanitizer Project == | ||
+ | |||
+ | Our Mission: Given JSON-like content, convert it to valid JSON! The OWASP JSON Sanitizer Project is a simple-to-use Java library that can be attached at either end of a data pipeline to help satisfy Postel's principle: <i>be conservative in what you do, be liberal in what you accept from others.</i> When applied to JSON-like content from others, it produces well-formed JSON that should satisfy any parser you use. When applied to your own output before you send it, it corrects minor encoding mistakes and makes it easier to embed your JSON in HTML and XML. | ||
+ | |||
+ | This library is very easy to use. For more information, visit the [https://code.google.com/p/json-sanitizer/wiki/GettingStarted getting started guide]. | ||
+ | ==Licensing== | ||
+ | The OWASP JSON Sanitizer is free to use under the [http://www.apache.org/licenses/LICENSE-2.0 Apache 2 License]. | ||
+ | |||
+ | | valign="top" style="padding-left:25px;width:200px;border-right: 1px dotted gray;padding-right:25px;" | | ||
+ | |||
+ | == What is this? == | ||
+ | |||
+ | The OWASP JSON Sanitizer Project provides: | ||
+ | |||
+ | * A Java-based library for sanitizing inbound or outbound JSON | ||
+ | |||
+ | == Code Repo == | ||
+ | |||
+ | [https://github.com/owasp/json-sanitizer OWASP JSON Sanitizer at GitHub] | ||
+ | |||
+ | == Email List == | ||
+ | |||
+ | [https://groups.google.com/forum/#!forum/json-sanitizer-support Project Support List] | ||
+ | |||
+ | == Project Leaders == | ||
+ | |||
+ | Author/Project Leader<br/>[https://www.owasp.org/index.php/User:Mike_Samuel Mike Samuel] [mailto:[email protected] @]<br/><br/> | ||
+ | Project Manager<br/>[https://www.owasp.org/index.php/User:Jmanico Jim Manico] [mailto:[email protected] @] | ||
+ | |||
+ | == Related Projects == | ||
+ | |||
+ | * [[XSS (Cross Site Scripting) Prevention Cheat Sheet]] | ||
+ | * [[OWASP Java HTML Sanitizer Project]] | ||
+ | * [[OWASP Java Encoder Project]] | ||
+ | * [[OWASP Dependency Check]] | ||
+ | * [https://github.com/sourceclear/headlines Sourceclear Headlines] | ||
+ | * [https://code.google.com/p/keyczar/ Google KeyCzar] | ||
+ | * [http://shiro.apache.org/ Apache SHIRO] | ||
+ | |||
+ | | valign="top" style="padding-left:25px;width:200px;" | | ||
+ | |||
+ | == Quick Download == | ||
+ | |||
+ | * [https://search.maven.org/#artifactdetails|com.mikesamuel|json-sanitizer|1.1|jar v1.1 home at Maven Central] | ||
+ | * [https://search.maven.org/remotecontent?filepath=com/mikesamuel/json-sanitizer/1.1/json-sanitizer-1.1.jar v1.1 jar at Maven Central] | ||
+ | |||
+ | == News and Events == | ||
+ | * [October 15, 2015] v1.1 Released at Maven Central! | ||
+ | * [July 5, 2014] v1 Released at Maven Central! | ||
+ | * [March 29, 2014] Template and Doc cleanup! | ||
+ | * [Oct 17, 2012] .9 Released! | ||
+ | |||
+ | ==Classifications== | ||
+ | |||
+ | {| width="200" cellpadding="2" | ||
+ | |- | ||
+ | | align="center" valign="top" width="50%" rowspan="2"| [[File:Owasp-incubator-trans-85.png|link=https://www.owasp.org/index.php/OWASP_Project_Stages#tab=Incubator_Projects]] | ||
+ | | align="center" valign="top" width="50%"| [[File:Owasp-builders-small.png|link=]] | ||
+ | |- | ||
+ | | align="center" valign="top" width="50%"| [[File:Owasp-defenders-small.png|link=]] | ||
+ | |- | ||
+ | | colspan="2" align="center" | [http://www.apache.org/licenses/LICENSE-2.0 Apache 2 License] | ||
+ | |- | ||
+ | | colspan="2" align="center" | [[File:Project_Type_Files_CODE.jpg|link=]] | ||
+ | |} | ||
+ | |||
+ | |} | ||
+ | |||
+ | = Getting Started = | ||
+ | |||
+ | == Importing the JSON Sanitizer== | ||
+ | |||
+ | You can fetch the jars from [http://search.maven.org/#search%7Cga%7C1%7Cjson-sanitizer Maven Central], or you can let your favorite Java build tool handle it for you with this Maven dependency: | ||
+ | |||
+ | <dependency> | ||
+ | <groupId>com.mikesamuel</groupId> | ||
+ | <artifactId>json-sanitizer</artifactId> | ||
+ | <version>1.1</version> | ||
+ | </dependency> | ||
+ | |||
+ | Once you've got the JSON sanitizer JAR on your classpath, | ||
+ | |||
+ | import com.google.json.JsonSanitizer; | ||
+ | will let you call | ||
+ | |||
+ | String wellFormedJson = JsonSanitizer.sanitize(myJsonLikeString); | ||
+ | That's it. Now wellFormedJson is a string of well-formed JSON that is safe to pass to JavaScript's eval operator and that can be easily embedded in XML or HTML. | ||
+ | |||
+ | If you have further questions, check our [https://github.com/OWASP/json-sanitizer/issues support list]. | ||
+ | |||
+ | = Motivation = | ||
+ | |||
+ | The OWASP JSON Sanitizer Project is a simple-to-use Java library that can be attached at either end of a data pipeline. When applied to JSON-like content from others, it produces well-formed JSON that should satisfy any parser you use. When applied to your own output before you send it, it corrects minor encoding mistakes and makes it easier to embed your JSON in HTML and XML. | ||
+ | |||
+ | [[Image:JSON-Sanitizer-Arch.png|500px|frameless|architecture]] | ||
+ | |||
+ | Many applications have large amounts of code that use ad-hoc methods to generate JSON outputs. Frequently these outputs all pass through a small amount of framework code before being sent over the network. That framework code can use this library to make sure that the ad-hoc outputs are standards-compliant and safe to pass to (overly) powerful deserializers like JavaScript's eval operator. | ||
+ | |||
+ | Applications also often have web service APIs that receive JSON from a variety of sources. When this JSON is created using ad-hoc methods, this library can massage it into a form that is easy to parse. | ||
+ | |||
+ | Hooked into the code that sends and receives requests and responses, this library can help software architects ensure system-wide security and well-formedness guarantees. | ||
+ | |||
+ | = Input = | ||
+ | |||
+ | The sanitizer takes JSON-like content and interprets it as JavaScript's eval would. Specifically, it handles these non-standard constructs: | ||
+ | |||
+ | {| | ||
+ | |- | ||
+ | | <tt>'...'</tt> | ||
+ | | Single quoted strings are converted to JSON strings. | ||
+ | |- | ||
+ | | <tt>\xAB</tt> | ||
+ | | Hex escapes are converted to JSON Unicode escapes. | ||
+ | |- | ||
+ | | <tt>\012</tt> | ||
+ | | Octal escapes are converted to JSON Unicode escapes. | ||
+ | |- | ||
+ | | <tt>0xAB</tt> | ||
+ | | Hex integer literals are converted to JSON decimal numbers. | ||
+ | |- | ||
+ | | <tt>012</tt> | ||
+ | | Octal integer literals are converted to JSON decimal numbers. | ||
+ | |- | ||
+ | | <tt>+.5</tt> | ||
+ | | Decimal numbers are coerced to JSON's stricter format. | ||
+ | |- | ||
+ | | <tt>[0,,2]</tt> | ||
+ | | Elisions in arrays are filled with <tt>null</tt>. | ||
+ | |- | ||
+ | | <tt>[1,2,3,]</tt> | ||
+ | | Trailing commas are removed. | ||
+ | |- | ||
+ | | <tt>{foo:"bar"}</tt> | ||
+ | | Unquoted property names are quoted. | ||
+ | |- | ||
+ | | <tt>//comments</tt> | ||
+ | | JS style line and block comments are removed. | ||
+ | |- | ||
+ | | <tt>(...)</tt> | ||
+ | | Grouping parentheses are removed. | ||
+ | |} | ||
+ | |||
+ | The sanitizer fixes missing punctuation, missing end quotes, and mismatched or missing close brackets. If an input contains only whitespace, the valid JSON value <tt>null</tt> is substituted. | ||
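
To make one row of the table above concrete, here is a minimal, self-contained sketch of rewriting hex and legacy octal integer literals as JSON decimal numbers. It is an illustration only and is not taken from the sanitizer's source; the class name <tt>IntegerLiteralSketch</tt> and the method <tt>toJsonDecimal</tt> are hypothetical, and the sketch handles a single isolated literal rather than a whole document.

```java
import java.math.BigInteger;

/** Illustrative sketch only; NOT the JSON Sanitizer's own implementation. */
final class IntegerLiteralSketch {

    /**
     * Rewrites a JS-style hex (0xAB) or legacy octal (012) integer literal
     * as a decimal number that JSON accepts. Other inputs are returned as-is.
     */
    static String toJsonDecimal(String literal) {
        if (literal.matches("0[xX][0-9a-fA-F]+")) {
            // Drop the "0x" prefix and parse the remainder in base 16.
            return new BigInteger(literal.substring(2), 16).toString();
        }
        if (literal.matches("0[0-7]+")) {
            // Drop the leading zero and parse the remainder in base 8.
            return new BigInteger(literal.substring(1), 8).toString();
        }
        return literal; // already decimal, or not a literal this sketch handles
    }

    public static void main(String[] args) {
        System.out.println(toJsonDecimal("0xAB")); // prints 171
        System.out.println(toJsonDecimal("012"));  // prints 10
    }
}
```

<tt>BigInteger</tt> is used here so that long literals do not overflow; how the real library treats very large values is not covered by this sketch.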
+ | |||
+ | = Output = | ||
+ | |||
+ | The output is well-formed JSON as defined by | ||
+ | [http://www.ietf.org/rfc/rfc4627.txt RFC 4627]. | ||
+ | The output satisfies four additional properties: | ||
+ | |||
+ | # The output will not contain the substring (case-insensitively) <tt>"</script"</tt>, so it can be embedded inside an HTML script element without further encoding. | ||
+ | # The output will not contain the substring <tt>"]]>"</tt>, so it can be embedded inside an XML CDATA section without further encoding. | ||
+ | # The output is a valid JavaScript expression, so it can be parsed by JavaScript's <tt>eval</tt> builtin (after being wrapped in parentheses) or by <tt>JSON.parse</tt>. Specifically, the output will not contain any string literals with embedded JS newlines (U+2028 Line separator or U+2029 Paragraph separator). | ||
+ | # The output contains only valid Unicode [http://www.unicode.org/glossary/#unicode_scalar_value scalar values] (no isolated [http://www.unicode.org/glossary/#surrogate_pair UTF-16 surrogates]) that are [http://www.w3.org/TR/xml/#charsets allowed in XML] unescaped. | ||
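
The first three properties can be illustrated with a small sketch that post-processes JSON that is ''already valid''. In valid JSON, the sequences <tt></script</tt> and <tt>]]></tt> and the characters U+2028 and U+2029 can occur only inside string literals, so replacing them with equivalent JSON escapes does not change the decoded value. This is an assumption-laden illustration, not the library's code: the class <tt>EmbeddableJsonSketch</tt> and the method <tt>makeEmbeddable</tt> are hypothetical names, and the specific escapes chosen here are one option among several.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; NOT the JSON Sanitizer's own implementation. */
final class EmbeddableJsonSketch {

    // Matches "</script" in any letter case; group 1 keeps the original casing.
    private static final Pattern SCRIPT_CLOSE =
            Pattern.compile("<(/script)", Pattern.CASE_INSENSITIVE);

    /** Assumes validJson is already well-formed JSON. */
    static String makeEmbeddable(String validJson) {
        // "</script" becomes "<\/script"; "\/" is a legal JSON escape for "/".
        String out = SCRIPT_CLOSE.matcher(validJson).replaceAll("<\\\\$1");
        // "]]>" becomes "]]\u003e" so the text cannot close a CDATA section.
        out = out.replace("]]>", "]]\\u003e");
        // Raw U+2028 / U+2029 become their six-character JSON escapes.
        out = out.replace(String.valueOf((char) 0x2028), "\\u2028");
        out = out.replace(String.valueOf((char) 0x2029), "\\u2029");
        return out;
    }

    public static void main(String[] args) {
        System.out.println(makeEmbeddable("[\"</script>\"]")); // prints ["<\/script>"]
    }
}
```

The sketch does nothing about the fourth property (isolated surrogates and characters not allowed in XML), and it must not be applied to input that is not yet well-formed JSON.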
+ | |||
+ | = Security = | ||
+ | |||
+ | Since the output is well-formed JSON, passing it to eval has no side effects and resolves no free variables, so it is neither a code-injection vector nor a vector for exfiltrating secrets. | ||
+ | |||
+ | This library only ensures that the JSON string → JavaScript object phase has no side effects and resolves no free variables; it cannot control how other client-side code later interprets the resulting JavaScript object. So if client-side code takes a part of the parsed data that is controlled by an attacker and passes it back through a powerful interpreter like eval or innerHTML, that client-side code might suffer unintended side effects. | ||
+ | |||
+ | var myValue = eval('(' + sanitizedJsonString + ')'); // safe | ||
+ | var myEmbeddedValue = eval(myValue.foo); // possibly unsafe | ||
+ | Additionally, sanitizing JSON cannot protect an application from confused deputy attacks: | ||
+ | |||
+ | var myValue = JSON.parse(sanitizedJsonString); | ||
+ | addToAdministratorsGroup(myValue.propertyFromUntrustedSource); | ||
+ | |||
+ | = Performance = | ||
+ | |||
+ | The sanitize method will return the input string without allocating a new | ||
+ | buffer when the input is already valid JSON that satisfies the properties | ||
+ | above. Thus, if used on input that is usually well formed, it has minimal | ||
+ | memory overhead. | ||
+ | |||
+ | The sanitize method takes O(n) time, where n is the length of the input in UTF-16 | ||
+ | code units. | ||
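
The "no new buffer for clean input" behavior described above is a common copy-on-first-change pattern. The following sketch shows the pattern on a deliberately tiny task (escaping raw U+2028 and U+2029); it is not the sanitizer's implementation, and the names <tt>LazyCopySketch</tt> and <tt>escapeJsNewlines</tt> are hypothetical.

```java
/** Illustrative sketch only; NOT the JSON Sanitizer's own implementation. */
final class LazyCopySketch {

    /**
     * Escapes raw U+2028 and U+2029 characters in a single left-to-right pass.
     * Returns the very same String instance when nothing needs escaping.
     */
    static String escapeJsNewlines(String json) {
        StringBuilder out = null; // allocated only when the first change is needed
        int copiedUpTo = 0;       // index just past the last char already copied
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c == 0x2028 || c == 0x2029) {
                if (out == null) {
                    out = new StringBuilder(json.length() + 16);
                }
                // Copy the untouched run, then append the six-character escape.
                out.append(json, copiedUpTo, i)
                   .append("\\u").append(Integer.toHexString(c));
                copiedUpTo = i + 1;
            }
        }
        if (out == null) {
            return json; // already clean: no buffer was allocated
        }
        return out.append(json, copiedUpTo, json.length()).toString();
    }

    public static void main(String[] args) {
        String clean = "{\"a\":1}";
        System.out.println(escapeJsNewlines(clean) == clean); // prints true
    }
}
```

Each input character is examined once and copied at most once, which is how a pass of this shape stays linear in the input length.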
+ | |||
+ | = Analogy = | ||
+ | |||
+ | The JSON Sanitizer is similar to a [http://en.wikipedia.org/wiki/Decoupling_capacitor decoupling capacitor]. | ||
+ | |||
+ | Any module that takes noisy input and produces clean output helps ''decouple'' the security properties of one piece of code (that uses eval) from the security properties of another (that concatenates untrusted strings to produce JSON), so that a failure in one does not necessitate a failure in the other. | ||
+ | |||
+ | __NOTOC__ <headertabs /> | ||
+ | |||
+ | [[Category:OWASP_Tool]] | ||
+ | [[Category:OWASP_Alpha_Quality_Tool]] | ||
+ | [[Category:OWASP_Project|OWASP JSON Sanitizer Project]] |
Latest revision as of 16:40, 19 January 2016