This site is the archived OWASP Foundation Wiki and is no longer accepting Account Requests.
To view the new OWASP Foundation website, please visit https://owasp.org
OWASP Java HTML Sanitizer Project
OWASP HTML Sanitizer Project
The OWASP HTML Sanitizer is a fast and easy-to-configure HTML sanitizer written in Java which lets you include HTML authored by third parties in your web application while protecting against XSS.
The existing dependencies are on Guava and JSR 305. The other jars are only needed by the test suite. The JSR 305 dependency is a compile-only dependency, only needed for annotations.
This code was written with security best practices in mind, has an extensive test suite, and has undergone adversarial security review.
A great place to get started using the OWASP Java HTML Sanitizer is here: https://code.google.com/p/owasp-java-html-sanitizer/wiki/GettingStarted.
Security
Since the output of the OWASP JSON Sanitizer Project is always well-formed JSON, passing it to eval will have no side effects and no free variables, so it is neither a code-injection vector nor a vector for exfiltration of secrets. This library only ensures that the JSON string → JavaScript object phase has no side effects and resolves no free variables; it cannot control how other client-side code later interprets the resulting JavaScript object. So if client-side code takes a part of the parsed data that is controlled by an attacker and passes it back through a powerful interpreter like eval or innerHTML, then that client-side code might suffer unintended side effects.
Licensing
The OWASP Java Encoder is free to use under the Apache 2 License.
A getting-started guide is also available on GitHub: https://github.com/OWASP/java-html-sanitizer/blob/master/docs/getting_started.md.
Licensing
The OWASP HTML Sanitizer is free to use and is dual licensed under the Apache 2 License and the New BSD License.
What is this?
The OWASP HTML Sanitizer Project provides Java-based HTML sanitization of untrusted HTML.
Code Repo
OWASP HTML Sanitizer at GitHub
Email List
Questions? Please sign up for our Project Support List.
Quick Download
OWASP HTML Sanitizer at Maven Central
Change Log
For recent release notes, please visit the changelog on GitHub.
You can view a few basic prepackaged policies for links, tables, integers, images and more here: https://github.com/OWASP/java-html-sanitizer/blob/master/src/main/java/org/owasp/html/Sanitizers.java.
    PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);
    String safeHTML = policy.sanitize(untrustedHTML);
The tests illustrate how to configure your own policy: https://github.com/OWASP/java-html-sanitizer/blob/master/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java
    PolicyFactory policy = new HtmlPolicyBuilder()
        .allowElements("a")
        .allowUrlProtocols("https")
        .allowAttributes("href").onElements("a")
        .requireRelNofollowOnLinks()
        .build();
    String safeHTML = policy.sanitize(untrustedHTML);
Or you can write custom policies to do things like changing h1s to divs with a certain class:
    PolicyFactory policy = new HtmlPolicyBuilder()
        .allowElements("p")
        .allowElements(
            new ElementPolicy() {
              public String apply(String elementName, List<String> attrs) {
                // Tag the replacement element with a class derived from the original name.
                attrs.add("class");
                attrs.add("header-" + elementName);
                // Emit a div in place of the original heading element.
                return "div";
              }
            }, "h1", "h2", "h3", "h4", "h5", "h6")
        .build();
    String safeHTML = policy.sanitize(untrustedHTML);
Please note that the elements "a", "font", "img", "input" and "span" need to be explicitly whitelisted using the `allowWithoutAttributes()` method if you want them to be allowed through the filter when these elements do not include any attributes.
You can also use the default "ebay" and "slashdot" policies. The Slashdot policy (defined here: https://github.com/OWASP/java-html-sanitizer/blob/master/src/main/java/org/owasp/html/examples/SlashdotPolicyExample.java) allows the following tags ("a", "p", "div", "i", "b", "em", "blockquote", "tt", "strong", "br", "ul", "ol", "li") and only certain attributes. This policy also allows the custom Slashdot tags "quote" and "ecode".
CSS sanitization is challenging.
We disallow position:sticky and position:fixed so that client code can use a position:relative;overflow:hidden to contain self-styling sanitized snippets. Embedders of sanitized content do have to consistently do that and make sure that contributed content is clearly demarcated.
Most CSS attacks require a payload to specify selectors which the sanitizer should not allow. Unproxied images do allow tracking and, by positioning below the fold, can track whether a user scrolls down. Embedders do need to use URL rewriting if they allow background styling and use sensible Referrer-Policy and related headers.
That said, even if care is taken, CSS has a large attack surface, so not using it puts you in a safer place.
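The containment advice above can be sketched from the embedder's side. This is a minimal illustration, not part of the sanitizer library: the class `SnippetEmbedder`, the method `contain`, and the CSS class name `untrusted-snippet` are hypothetical names chosen for this example. It assumes the markup passed in has already been through a sanitizer policy.

```java
final class SnippetEmbedder {
    // Hypothetical class name the embedding page can style so that
    // contributed content is clearly demarcated.
    static final String CONTAINER_CLASS = "untrusted-snippet";

    /**
     * Wraps markup that has ALREADY been sanitized in a containing block.
     * position:relative makes the wrapper the containing block for any
     * absolutely positioned descendants, and overflow:hidden clips anything
     * that tries to draw outside the wrapper.
     */
    static String contain(String sanitizedHtml) {
        return "<div class=\"" + CONTAINER_CLASS + "\""
            + " style=\"position:relative;overflow:hidden\">"
            + sanitizedHtml
            + "</div>";
    }

    public static void main(String[] args) {
        System.out.println(contain("<b>hello</b>"));
    }
}
```

The wrapper only helps if it is applied consistently to every sanitized snippet. It performs no escaping of its own, so it must never be given unsanitized input.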
Inline images use the data URI scheme to embed images directly within web pages. The following describes how to allow inline images in an HTML Sanitizer policy.
1) Add the "data" protocol to your whitelist. See: https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20160628.1/org/owasp/html/HtmlPolicyBuilder.html#allowUrlProtocols
.allowUrlProtocols("data")
2) You can then allow an attribute with an extra check:
.allowAttributes("src") .matching(...) .onElements("img")
3) The matching part gives you room to be stricter than simply allowing every data URL. For example, you can accept only values that begin with:
data:image/...
4) Since allowUrlProtocols("data") allows data URLs everywhere URLs are allowed, you might also want to add a matcher to every other URL attribute that rejects any value containing a colon unless it starts with http:, https: or mailto:
.allowAttributes("href") .matching(...) .onElements("a")
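The matchers left as `...` in steps 2 and 4 could be filled in with regular expressions along the following lines. This is only a sketch under stated assumptions: the class `UrlMatchers`, both pattern constants, and the choice of image types (PNG, GIF and JPEG, base64-encoded) are hypothetical and not taken from the project. The patterns are written for whole-value matching (`Matcher.matches()`).

```java
import java.util.regex.Pattern;

final class UrlMatchers {
    // Hypothetical allowlist for step 3: only base64-encoded PNG, GIF or
    // JPEG data URLs. Other data: types (text/html, image/svg+xml, ...)
    // do not match.
    static final Pattern INLINE_IMAGE = Pattern.compile(
        "data:image/(?:png|gif|jpeg);base64,[A-Za-z0-9+/]+={0,2}",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical matcher for step 4: accept http:, https: and mailto:
    // URLs, or values with no colon at all (relative URLs). Anything else
    // that contains a colon is rejected.
    static final Pattern SAFE_LINK = Pattern.compile(
        "(?:(?:https?|mailto):\\S*|[^:\\s]*)",
        Pattern.CASE_INSENSITIVE);

    static boolean isInlineImage(String url) {
        return INLINE_IMAGE.matcher(url).matches();
    }

    static boolean isSafeLink(String url) {
        return SAFE_LINK.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInlineImage("data:image/png;base64,iVBORw0KGgo="));
        System.out.println(isInlineImage("data:text/html;base64,PHNjcmlwdD4="));
        System.out.println(isSafeLink("https://example.com/page"));
        System.out.println(isSafeLink("javascript:alert(1)"));
    }
}
```

Assuming the `matching` overload that takes a `java.util.regex.Pattern`, `INLINE_IMAGE` would go in the `src` matcher for `img` and `SAFE_LINK` in the `href` matcher for `a`. Note that `SAFE_LINK` also rejects relative URLs that happen to contain a colon, which follows the rule in step 4 literally.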
How was this project tested?
This code was written with security best practices in mind, has an extensive test suite, and has undergone adversarial security review.
How is this project deployed?
This project is best deployed through Maven https://github.com/OWASP/java-html-sanitizer/blob/master/docs/getting_started.md
- Maintaining a fully featured HTML sanitizer is a lot of work. We intend to continue to handle community questions and bug reports in a very timely manner.
- There are no plans for major new features other than supporting incoming requests for advanced sanitization such as additional HTML5 support.