<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://wiki.owasp.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Raboof</id>
		<title>OWASP - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://wiki.owasp.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Raboof"/>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php/Special:Contributions/Raboof"/>
		<updated>2026-05-27T09:04:15Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.27.2</generator>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=OWASP_WebScarab_Differences_(Classic_vs_NG)&amp;diff=38174</id>
		<title>OWASP WebScarab Differences (Classic vs NG)</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=OWASP_WebScarab_Differences_(Classic_vs_NG)&amp;diff=38174"/>
				<updated>2008-09-03T12:28:22Z</updated>
		
		<summary type="html">&lt;p&gt;Raboof: /* Plugins */ added some structure&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''This page is intended to document the differences between WebScarab Classic and WebScarab Next Generation'''&lt;br /&gt;
&lt;br /&gt;
The objective is to list the major features that one has over the other, with the intent to track the porting of desirable features from Classic to NG.&lt;br /&gt;
&lt;br /&gt;
==Framework functionality==&lt;br /&gt;
&lt;br /&gt;
NG has no concept of the shared cookie jar, which is used in Classic to allow plugins such as the Spider and Manual Request plugins to use the most current cookies for a particular URL. This could/should be replaced by an Identity module, which can provide the most current identifiers for a particular identity (cookies, Basic auth, etc).&lt;br /&gt;
&lt;br /&gt;
NG now also has the Transcoder functionality, implemented as a non-modal dialog. It is intended to also implement &amp;quot;right-click&amp;quot; menus to perform various transcoding operations &amp;quot;in-place&amp;quot; in arbitrary text fields.&lt;br /&gt;
&lt;br /&gt;
==Plugins==&lt;br /&gt;
&lt;br /&gt;
NG has significantly fewer plugins than Classic. The only plugins currently implemented in NG are the Proxy, Manual Request and WebServices plugins. &lt;br /&gt;
&lt;br /&gt;
This leaves the following plugins to be implemented:&lt;br /&gt;
&lt;br /&gt;
* Spider&lt;br /&gt;
* Extensions&lt;br /&gt;
* XSSCRLF&lt;br /&gt;
* SessionIDAnalysis&lt;br /&gt;
* Scripting&lt;br /&gt;
* Fragments&lt;br /&gt;
* Compare&lt;br /&gt;
* Search&lt;br /&gt;
&lt;br /&gt;
=== Proxy ===&lt;br /&gt;
Features that remain to be ported:&lt;br /&gt;
* BeanShell scripts for programmatic modification of requests/responses&lt;br /&gt;
* Miscellaneous proxy plugins - Reveal hidden fields, prevent caching of responses&lt;br /&gt;
* Ability to modify Internet Explorer proxy settings automatically on startup and exit&lt;br /&gt;
&lt;br /&gt;
=== Manual Request ===&lt;br /&gt;
Features that remain to be ported:&lt;br /&gt;
* Ability to convert a request from a GET to a POST or multipart POST, and vice versa.&lt;br /&gt;
&lt;br /&gt;
=== WebServices ===&lt;br /&gt;
Partially completed. Currently it is sufficient to access the WebGoat web service.&lt;br /&gt;
&lt;br /&gt;
Features that remain to be ported:&lt;br /&gt;
* Support for complex types (This functionality could easily be added if desired)&lt;br /&gt;
&lt;br /&gt;
==HTTP Protocol support==&lt;br /&gt;
&lt;br /&gt;
WebScarab Classic supports authentication to servers using SSL client certificates (including those stored on a smart card), as well as NTLM. NG does not currently support SSL client certificates at all. NTLM should be supported through the Apache HttpClient library, but this has not been tested.&lt;br /&gt;
&lt;br /&gt;
==Porting suggestions==&lt;br /&gt;
&lt;br /&gt;
For people interested in contributing to this project by porting one of the above plugins, here are some suggestions:&lt;br /&gt;
&lt;br /&gt;
* SessionID analysis&lt;br /&gt;
The current session ID analysis plugin, while it looks cool, is actually very misleading. Anyone wanting to implement this feature for NG would be advised to take a look at Michal Zalewski's stompy to see how it can be done better.&lt;br /&gt;
&lt;br /&gt;
* Search, Compare&lt;br /&gt;
These plugins are very clunky to use. It actually makes a lot more sense to make those features available as part of the primary interface, rather than relegating them to a backwater. Search should provide a simple interface where the operator can type some text and click Go, rather than having to write code.&lt;br /&gt;
&lt;br /&gt;
* Spider&lt;br /&gt;
This plugin should also identify FORMs in the HTML responses, and identify those that have been submitted by matching them with the parameters of GET requests, or the bodies of POSTs, using an intelligent matching algorithm. (Empty parameters in the form may be matched to anything in a GET/POST.)&lt;br /&gt;
&lt;br /&gt;
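As a rough illustration of the matching rule described for the Spider, here is a minimal Java sketch. It is not WebScarab code: the class name FormMatcher, its method names, and the representation of form fields and request parameters as name/value string pairs are all assumptions made for this example, and it ignores repeated parameter names.&lt;br /&gt;

```java
/** Hypothetical sketch of form-to-request matching; not part of WebScarab. */
final class FormMatcher {

    /** Returns the value paired with the given name, or null when the name is absent. */
    static String valueOf(String[][] pairs, String name) {
        for (String[] pair : pairs) {
            if (pair[0].equals(name)) {
                return pair[1];
            }
        }
        return null;
    }

    /**
     * A request is taken to be a submission of a form when both carry the same
     * number of parameters, every form field name occurs in the request, and
     * every non-empty form default equals the submitted value. An empty form
     * default matches any submitted value.
     */
    static boolean matches(String[][] formFields, String[][] requestParams) {
        if (formFields.length != requestParams.length) {
            return false;
        }
        for (String[] field : formFields) {
            String submitted = valueOf(requestParams, field[0]);
            if (submitted == null) {
                return false; // the field is missing from the request
            }
            if (field[1].isEmpty()) {
                continue; // empty form parameter: matches anything
            }
            if (!field[1].equals(submitted)) {
                return false; // a fixed form value was not submitted as-is
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[][] loginForm = { {"user", ""}, {"action", "login"} };
        String[][] sameForm = { {"user", "alice"}, {"action", "login"} };
        String[][] otherForm = { {"user", "alice"}, {"action", "logout"} };
        System.out.println(matches(loginForm, sameForm));  // prints true
        System.out.println(matches(loginForm, otherForm)); // prints false
    }
}
```

The sketch treats an empty form default as a wildcard and requires every other default to be submitted unchanged; a real implementation would likely need a looser rule for fields the user can edit.&lt;br /&gt;
&lt;br /&gt;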
==Execution==&lt;br /&gt;
&lt;br /&gt;
WebScarab NG is currently only executable via Java WebStart, which is likely to pose a problem for certain folk. An alternate packaging has been created, using the Maven onejar plugin, which packages all the required jars into a subdirectory of the &amp;quot;onejar&amp;quot;, and provides a specialised classloader to allow Java to access the contents of those jars.&lt;br /&gt;
&lt;br /&gt;
[[Category:OWASP WebScarab Project]]&lt;br /&gt;
[[Category:OWASP WebScarab NG Project]]&lt;/div&gt;</summary>
		<author><name>Raboof</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=Category:OWASP_AntiSamy_Project&amp;diff=28083</id>
		<title>Category:OWASP AntiSamy Project</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=Category:OWASP_AntiSamy_Project&amp;diff=28083"/>
				<updated>2008-04-15T20:06:47Z</updated>
		
		<summary type="html">&lt;p&gt;Raboof: /* Emailing the project lead */ spurious whitespace&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= OWASP AntiSamy =&lt;br /&gt;
&lt;br /&gt;
== What is it? ==&lt;br /&gt;
&lt;br /&gt;
The OWASP AntiSamy project is a few things. Technically, it is an API for ensuring that user-supplied HTML/CSS complies with an application's rules. Another way of saying that: it's an API that helps you make sure that clients don't supply malicious cargo code in the HTML they supply for their profile, comments, etc. that gets persisted on the server. In the context of web applications, the term &amp;quot;malicious code&amp;quot; is usually taken to mean only JavaScript. Cascading Style Sheets are only considered malicious when they invoke the JavaScript engine. However, there are many situations where &amp;quot;normal&amp;quot; HTML and CSS can be used in a malicious manner.&lt;br /&gt;
&lt;br /&gt;
Philosophically, AntiSamy is a departure from all contemporary security mechanisms. Generally, the security mechanism and user have a communication that is virtually one way, for good reason. Letting the potential attacker know details about the validation is considered unwise as it allows the attacker to &amp;quot;learn&amp;quot; and &amp;quot;recon&amp;quot; the mechanism for weaknesses. These types of information leaks can also hurt in ways you don't expect. A login mechanism that tells the user, &amp;quot;Username invalid&amp;quot; leaks the fact that a user by that name does not exist. A user could use a dictionary or phone book or both to remotely come up with a list of valid usernames. Using this information, an attacker could launch a brute force attack or massive account lock denial-of-service. So, we get that.&lt;br /&gt;
&lt;br /&gt;
Unfortunately, that's just not very usable in this situation. Typical Internet users are largely ineffective when it comes to writing HTML/CSS, so where do they get their HTML from? Usually they copy it from somewhere out on the web. Simply rejecting their input without any clue as to why is jolting and annoying. Annoyed users go somewhere else to do their social networking.&lt;br /&gt;
&lt;br /&gt;
Socioeconomically, AntiSamy is a have-not enabler. Private companies like Google, MySpace, eBay, etc. have come up with proprietary solutions for solving this problem. This introduces two problems. One is that proprietary solutions are not usually all that good; the other is that even when they are, the companies are naturally reluctant to share this hard-earned IP for free. Fortunately, we just don't care. We don't see any reason why only these private companies should have this functionality, so I'm releasing this for free under the BSD license.&lt;br /&gt;
&lt;br /&gt;
== Who are you? ==&lt;br /&gt;
&lt;br /&gt;
AntiSamy was originally authored by Arshan Dabirsiaghi (arshan.dabirsiaghi [at the] gmail.com) with help from Jason Li (li.jason.c [at the] gmail.com), both of Aspect Security (http://www.aspectsecurity.com/). The problem AntiSamy solves was often described as &amp;quot;impossible&amp;quot; or &amp;quot;impossible to do right&amp;quot;. The folks with the AntiSamy project hope to antiquate that idea in a hurry. As of now, there is only a Java implementation of AntiSamy, though the framework is implementable in any language. The AntiSamy team hopes to release versions in .NET and PHP by Spring of 2008. There has not been much interest in this project from the Rails community, so no implementation for Rails is being planned.&lt;br /&gt;
&lt;br /&gt;
== How do I get started? ==&lt;br /&gt;
&lt;br /&gt;
There are 4 steps in the process of integrating AntiSamy. Each step is detailed in the next section, but the high-level overview follows:&lt;br /&gt;
# Download AntiSamy from [http://code.google.com/p/owaspantisamy/downloads/list its home on Google Code]&lt;br /&gt;
# Choose one of the standard policy files that most closely matches the functionality you need:&lt;br /&gt;
#* antisamy-slashdot.xml&lt;br /&gt;
#* antisamy-ebay.xml&lt;br /&gt;
#* antisamy-myspace.xml&lt;br /&gt;
#* antisamy-anythinggoes.xml&lt;br /&gt;
# Tailor the policy file according to your site's rules&lt;br /&gt;
# Call the API from the code&lt;br /&gt;
&lt;br /&gt;
=== Stage 1 - Downloading AntiSamy ===&lt;br /&gt;
&lt;br /&gt;
Which package you download depends largely on what you want to do with AntiSamy. If you'd like to extend it or review the code, download the source package. If you're looking for quick integration, simply download the JAR and put it into your classpath. It has a relatively small footprint (2.2 MB), and most of that could be taken out with ProGuard if anyone out there is a ProGuard wizard.&lt;br /&gt;
&lt;br /&gt;
=== Stage 2 - Choosing a base policy file ===&lt;br /&gt;
&lt;br /&gt;
Chances are that your site's use case for AntiSamy is at least roughly comparable to one of the predefined policy files. They each represent a &amp;quot;typical&amp;quot; scenario for allowing users to provide HTML (and possibly CSS) formatting information. Let's look into the different policy files:&lt;br /&gt;
&lt;br /&gt;
1) antisamy-slashdot.xml&lt;br /&gt;
&lt;br /&gt;
Slashdot (http://www.slashdot.org/) is a techie news site that allows users to respond anonymously to news posts with very limited HTML markup. Now Slashdot is not only one of the coolest sites around, it's also one that's been subject to many different successful attacks. Even more unfortunate is the fact that most of the attacks led users to the infamous goatse.cx picture (please don't go look it up). The rules for Slashdot are fairly strict: users can only submit the following HTML tags and no CSS: &amp;amp;lt;b&amp;amp;gt;, &amp;amp;lt;u&amp;amp;gt;, &amp;amp;lt;i&amp;amp;gt;, &amp;amp;lt;a&amp;amp;gt;, &amp;amp;lt;blockquote&amp;amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Accordingly, we've built a policy file that allows fairly similar functionality. All text-formatting tags that operate directly on the font, color or emphasis have been allowed. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
2) antisamy-ebay.xml&lt;br /&gt;
&lt;br /&gt;
eBay (http://www.ebay.com/) is the most popular online auction site in the universe, as far as I can tell. It is a public site so anyone is allowed to post listings with rich HTML content. Given the attractiveness of eBay as a target, it's not surprising that it has been subject to a few complex XSS attacks. Listings are allowed to contain much richer content than, say, Slashdot, so its attack surface is considerably larger. The following tags appear to be accepted by eBay (they don't publish rules): &amp;lt;a&amp;gt;,...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
3) antisamy-myspace.xml&lt;br /&gt;
&lt;br /&gt;
MySpace (http://www.myspace.com/) is arguably the most popular social networking site today. Users are allowed to submit pretty much all HTML and CSS they want - as long as it doesn't contain JavaScript. MySpace is currently using a word blacklist to validate users' HTML, which is why they were subject to the infamous Samy worm (http://namb.la/). The Samy worm, which used fragmentation attacks combined with a word that should have been blacklisted (eval), was the inspiration for the project.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4) antisamy-anythinggoes.xml&lt;br /&gt;
&lt;br /&gt;
I don't know of a possible use case for this policy file. If you want to allow every single valid HTML and CSS element (but without JavaScript or blatant CSS-related phishing attacks), you can use this policy file. Not even MySpace is _this_ crazy. However, it does serve as a good reference because it contains base rules for every element, so you can use it as a knowledge base when tailoring the other policy files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Stage 3 - Tailoring the policy file ===&lt;br /&gt;
&lt;br /&gt;
Smaller organizations may want to deploy AntiSamy in a default configuration, but it's equally likely that a site may want to have strict, business-driven rules for what users can submit. The discussion that decides the tailoring should also consider attack surface, which grows in relative proportion to the policy file.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Stage 4 - Calling the AntiSamy API ===&lt;br /&gt;
&lt;br /&gt;
Using AntiSamy is abnormally easy. &lt;br /&gt;
Here is an example of invoking AntiSamy with a policy file:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;import org.owasp.validator.html.*;&lt;br /&gt;
&lt;br /&gt;
Policy policy = Policy.getInstance(POLICY_FILE_LOCATION);&lt;br /&gt;
&lt;br /&gt;
AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, policy);&lt;br /&gt;
&lt;br /&gt;
MyUserDAO.storeUserProfile(cr.getCleanHTML()); // some custom function&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Policy files can also be referenced by filename by passing a second argument to the &amp;lt;code&amp;gt;AntiSamy:scan()&amp;lt;/code&amp;gt; method, as the following example shows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, policyFilePath);&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Lastly, policy files can be referenced by File objects directly in the second parameter:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, new File(policyFilePath));&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stage 4 - Analyzing CleanResults ===&lt;br /&gt;
&lt;br /&gt;
The CleanResults object provides several useful accessors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getErrorMessages()&amp;lt;/code&amp;gt; - a list of &amp;lt;code&amp;gt;String&amp;lt;/code&amp;gt; error messages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getCleanHTML()&amp;lt;/code&amp;gt; - the clean, safe HTML output&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getCleanXMLDocumentFragment()&amp;lt;/code&amp;gt; - the clean, safe &amp;lt;code&amp;gt;XMLDocumentFragment&amp;lt;/code&amp;gt; which is reflected in &amp;lt;code&amp;gt;getCleanHTML()&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getScanTime()&amp;lt;/code&amp;gt; - returns the scan time in seconds&lt;br /&gt;
&lt;br /&gt;
== Project roadmap ==&lt;br /&gt;
&lt;br /&gt;
We have a number of milestones we'd like to accomplish with the help of the community. Hopefully we can allocate some funds for this in the OWASP Spring of Code 2008, but it is far too early to tell. In the meantime, this is a labor of love.&lt;br /&gt;
&lt;br /&gt;
=== .NET version (beta early Spring 2008, rc1 Fall 2008) ===&lt;br /&gt;
We're aiming for a beta of a .NET version of AntiSamy to be available by early Spring 2008. The major hurdle is finding a suitable &amp;quot;HTML cleaner&amp;quot; like nekohtml in .NET. It needs to be capable of producing document fragments, not just entire HTML documents. For example, if I pass in &amp;lt;code&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;lt;/code&amp;gt; to the HTML cleaner, we can't have it send back &amp;lt;code&amp;gt;&amp;amp;lt;html&amp;gt;&amp;amp;lt;head&amp;gt;&amp;amp;lt;/head&amp;gt;&amp;amp;lt;body&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;amp;lt;/body&amp;amp;gt;&amp;amp;lt;/html&amp;amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
I personally (Arshan) plan on developing this, but am happy to let someone take over who can focus more time on it.&lt;br /&gt;
&lt;br /&gt;
=== PHP version (beta early Spring 2008, rc1 Fall 2008) ===&lt;br /&gt;
We're aiming for a beta of a PHP version of AntiSamy to be available by early Spring 2008. The major hurdle is finding a suitable &amp;quot;HTML cleaner&amp;quot; like nekohtml in PHP. It needs to be capable of producing document fragments, not just entire HTML documents. For example, if I pass in &amp;lt;code&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;lt;/code&amp;gt; to the HTML cleaner, we can't have it send back &amp;lt;code&amp;gt;&amp;amp;lt;html&amp;gt;&amp;amp;lt;head&amp;gt;&amp;amp;lt;/head&amp;gt;&amp;amp;lt;body&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;amp;lt;/body&amp;amp;gt;&amp;amp;lt;/html&amp;amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Several members of the community have been in touch with us about working together, including the smart folks over at Zend.com.&lt;br /&gt;
&lt;br /&gt;
== Contacting us ==&lt;br /&gt;
There are two ways of getting information on AntiSamy: the mailing list, and contacting the project lead directly.&lt;br /&gt;
&lt;br /&gt;
=== OWASP AntiSamy mailing list ===&lt;br /&gt;
The first is the mailing list which is located at https://lists.owasp.org/mailman/listinfo/owasp-antisamy. The list was previously private and the archives have been cleared with the release of version 1.0. We encourage all prospective and current users and bored attackers to join in the conversation. We're happy to brainstorm attack scenarios, discuss regular expressions and help with integration.&lt;br /&gt;
&lt;br /&gt;
=== Emailing the project lead ===&lt;br /&gt;
&lt;br /&gt;
For content which is not appropriate for the public mailing list, you can alternatively contact the project lead, Arshan Dabirsiaghi, at [arshan.dabirsiaghi] at the [aspectsecurity.com] (s/ at the /@/).&lt;br /&gt;
&lt;br /&gt;
=== Issue tracking ===&lt;br /&gt;
&lt;br /&gt;
Visit the [http://code.google.com/p/owaspantisamy/issues/list Google Code issue tracker]&lt;br /&gt;
&lt;br /&gt;
[[Category:OWASP Project]]&lt;br /&gt;
[[Category:OWASP Tool]]&lt;br /&gt;
[[Category:OWASP Download]]&lt;/div&gt;</summary>
		<author><name>Raboof</name></author>	</entry>

	<entry>
		<id>https://wiki.owasp.org/index.php?title=Category:OWASP_AntiSamy_Project&amp;diff=28082</id>
		<title>Category:OWASP AntiSamy Project</title>
		<link rel="alternate" type="text/html" href="https://wiki.owasp.org/index.php?title=Category:OWASP_AntiSamy_Project&amp;diff=28082"/>
				<updated>2008-04-15T20:06:16Z</updated>
		
		<summary type="html">&lt;p&gt;Raboof: /* Contacting us */ link to google code issue tracker for troubleshooting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= OWASP AntiSamy =&lt;br /&gt;
&lt;br /&gt;
== What is it? ==&lt;br /&gt;
&lt;br /&gt;
The OWASP AntiSamy project is a few things. Technically, it is an API for ensuring that user-supplied HTML/CSS complies with an application's rules. Another way of saying that: it's an API that helps you make sure that clients don't supply malicious cargo code in the HTML they supply for their profile, comments, etc. that gets persisted on the server. In the context of web applications, the term &amp;quot;malicious code&amp;quot; is usually taken to mean only JavaScript. Cascading Style Sheets are only considered malicious when they invoke the JavaScript engine. However, there are many situations where &amp;quot;normal&amp;quot; HTML and CSS can be used in a malicious manner.&lt;br /&gt;
&lt;br /&gt;
Philosophically, AntiSamy is a departure from all contemporary security mechanisms. Generally, the security mechanism and user have a communication that is virtually one way, for good reason. Letting the potential attacker know details about the validation is considered unwise as it allows the attacker to &amp;quot;learn&amp;quot; and &amp;quot;recon&amp;quot; the mechanism for weaknesses. These types of information leaks can also hurt in ways you don't expect. A login mechanism that tells the user, &amp;quot;Username invalid&amp;quot; leaks the fact that a user by that name does not exist. A user could use a dictionary or phone book or both to remotely come up with a list of valid usernames. Using this information, an attacker could launch a brute force attack or massive account lock denial-of-service. So, we get that.&lt;br /&gt;
&lt;br /&gt;
Unfortunately, that's just not very usable in this situation. Typical Internet users are largely ineffective when it comes to writing HTML/CSS, so where do they get their HTML from? Usually they copy it from somewhere out on the web. Simply rejecting their input without any clue as to why is jolting and annoying. Annoyed users go somewhere else to do their social networking.&lt;br /&gt;
&lt;br /&gt;
Socioeconomically, AntiSamy is a have-not enabler. Private companies like Google, MySpace, eBay, etc. have come up with proprietary solutions for solving this problem. This introduces two problems. One is that proprietary solutions are not usually all that good; the other is that even when they are, the companies are naturally reluctant to share this hard-earned IP for free. Fortunately, we just don't care. We don't see any reason why only these private companies should have this functionality, so I'm releasing this for free under the BSD license.&lt;br /&gt;
&lt;br /&gt;
== Who are you? ==&lt;br /&gt;
&lt;br /&gt;
AntiSamy was originally authored by Arshan Dabirsiaghi (arshan.dabirsiaghi [at the] gmail.com) with help from Jason Li (li.jason.c [at the] gmail.com), both of Aspect Security (http://www.aspectsecurity.com/). The problem AntiSamy solves was often described as &amp;quot;impossible&amp;quot; or &amp;quot;impossible to do right&amp;quot;. The folks with the AntiSamy project hope to antiquate that idea in a hurry. As of now, there is only a Java implementation of AntiSamy, though the framework is implementable in any language. The AntiSamy team hopes to release versions in .NET and PHP by Spring of 2008. There has not been much interest in this project from the Rails community, so no implementation for Rails is being planned.&lt;br /&gt;
&lt;br /&gt;
== How do I get started? ==&lt;br /&gt;
&lt;br /&gt;
There are 4 steps in the process of integrating AntiSamy. Each step is detailed in the next section, but the high-level overview follows:&lt;br /&gt;
# Download AntiSamy from [http://code.google.com/p/owaspantisamy/downloads/list its home on Google Code]&lt;br /&gt;
# Choose one of the standard policy files that most closely matches the functionality you need:&lt;br /&gt;
#* antisamy-slashdot.xml&lt;br /&gt;
#* antisamy-ebay.xml&lt;br /&gt;
#* antisamy-myspace.xml&lt;br /&gt;
#* antisamy-anythinggoes.xml&lt;br /&gt;
# Tailor the policy file according to your site's rules&lt;br /&gt;
# Call the API from the code&lt;br /&gt;
&lt;br /&gt;
=== Stage 1 - Downloading AntiSamy ===&lt;br /&gt;
&lt;br /&gt;
Which package you download depends largely on what you want to do with AntiSamy. If you'd like to extend it or review the code, download the source package. If you're looking for quick integration, simply download the JAR and put it into your classpath. It has a relatively small footprint (2.2 MB), and most of that could be taken out with ProGuard if anyone out there is a ProGuard wizard.&lt;br /&gt;
&lt;br /&gt;
=== Stage 2 - Choosing a base policy file ===&lt;br /&gt;
&lt;br /&gt;
Chances are that your site's use case for AntiSamy is at least roughly comparable to one of the predefined policy files. They each represent a &amp;quot;typical&amp;quot; scenario for allowing users to provide HTML (and possibly CSS) formatting information. Let's look into the different policy files:&lt;br /&gt;
&lt;br /&gt;
1) antisamy-slashdot.xml&lt;br /&gt;
&lt;br /&gt;
Slashdot (http://www.slashdot.org/) is a techie news site that allows users to respond anonymously to news posts with very limited HTML markup. Now Slashdot is not only one of the coolest sites around, it's also one that's been subject to many different successful attacks. Even more unfortunate is the fact that most of the attacks led users to the infamous goatse.cx picture (please don't go look it up). The rules for Slashdot are fairly strict: users can only submit the following HTML tags and no CSS: &amp;amp;lt;b&amp;amp;gt;, &amp;amp;lt;u&amp;amp;gt;, &amp;amp;lt;i&amp;amp;gt;, &amp;amp;lt;a&amp;amp;gt;, &amp;amp;lt;blockquote&amp;amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Accordingly, we've built a policy file that allows fairly similar functionality. All text-formatting tags that operate directly on the font, color or emphasis have been allowed. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
2) antisamy-ebay.xml&lt;br /&gt;
&lt;br /&gt;
eBay (http://www.ebay.com/) is the most popular online auction site in the universe, as far as I can tell. It is a public site so anyone is allowed to post listings with rich HTML content. Given the attractiveness of eBay as a target, it's not surprising that it has been subject to a few complex XSS attacks. Listings are allowed to contain much richer content than, say, Slashdot, so its attack surface is considerably larger. The following tags appear to be accepted by eBay (they don't publish rules): &amp;lt;a&amp;gt;,...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
3) antisamy-myspace.xml&lt;br /&gt;
&lt;br /&gt;
MySpace (http://www.myspace.com/) is arguably the most popular social networking site today. Users are allowed to submit pretty much all HTML and CSS they want - as long as it doesn't contain JavaScript. MySpace is currently using a word blacklist to validate users' HTML, which is why they were subject to the infamous Samy worm (http://namb.la/). The Samy worm, which used fragmentation attacks combined with a word that should have been blacklisted (eval), was the inspiration for the project.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4) antisamy-anythinggoes.xml&lt;br /&gt;
&lt;br /&gt;
I don't know of a possible use case for this policy file. If you want to allow every single valid HTML and CSS element (but without JavaScript or blatant CSS-related phishing attacks), you can use this policy file. Not even MySpace is _this_ crazy. However, it does serve as a good reference because it contains base rules for every element, so you can use it as a knowledge base when tailoring the other policy files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Stage 3 - Tailoring the policy file ===&lt;br /&gt;
&lt;br /&gt;
Smaller organizations may want to deploy AntiSamy in a default configuration, but it's equally likely that a site may want to have strict, business-driven rules for what users can submit. The discussion that decides the tailoring should also consider attack surface, which grows in relative proportion to the policy file.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Stage 4 - Calling the AntiSamy API ===&lt;br /&gt;
&lt;br /&gt;
Using AntiSamy is abnormally easy. &lt;br /&gt;
Here is an example of invoking AntiSamy with a policy file:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;import org.owasp.validator.html.*;&lt;br /&gt;
&lt;br /&gt;
Policy policy = Policy.getInstance(POLICY_FILE_LOCATION);&lt;br /&gt;
&lt;br /&gt;
AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, policy);&lt;br /&gt;
&lt;br /&gt;
MyUserDAO.storeUserProfile(cr.getCleanHTML()); // some custom function&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Policy files can also be referenced by filename by passing a second argument to the &amp;lt;code&amp;gt;AntiSamy:scan()&amp;lt;/code&amp;gt; method, as the following example shows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, policyFilePath);&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Lastly, policy files can be referenced by File objects directly in the second parameter:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;AntiSamy as = new AntiSamy();&lt;br /&gt;
CleanResults cr = as.scan(dirtyInput, new File(policyFilePath));&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stage 4 - Analyzing CleanResults ===&lt;br /&gt;
&lt;br /&gt;
The CleanResults object provides several useful accessors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getErrorMessages()&amp;lt;/code&amp;gt; - a list of &amp;lt;code&amp;gt;String&amp;lt;/code&amp;gt; error messages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getCleanHTML()&amp;lt;/code&amp;gt; - the clean, safe HTML output&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getCleanXMLDocumentFragment()&amp;lt;/code&amp;gt; - the clean, safe &amp;lt;code&amp;gt;XMLDocumentFragment&amp;lt;/code&amp;gt; which is reflected in &amp;lt;code&amp;gt;getCleanHTML()&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;getScanTime()&amp;lt;/code&amp;gt; - returns the scan time in seconds&lt;br /&gt;
&lt;br /&gt;
== Project roadmap ==&lt;br /&gt;
&lt;br /&gt;
We have a number of milestones we'd like to accomplish with the help of the community. Hopefully we can allocate some funds for this in the OWASP Spring of Code 2008, but it is far too early to tell. In the meantime, this is a labor of love.&lt;br /&gt;
&lt;br /&gt;
=== .NET version (beta early Spring 2008, rc1 Fall 2008) ===&lt;br /&gt;
We're aiming for a beta of a .NET version of AntiSamy to be available by early Spring 2008. The major hurdle is finding a suitable &amp;quot;HTML cleaner&amp;quot; like nekohtml in .NET. It needs to be capable of producing document fragments, not just entire HTML documents. For example, if I pass in &amp;lt;code&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;lt;/code&amp;gt; to the HTML cleaner, we can't have it send back &amp;lt;code&amp;gt;&amp;amp;lt;html&amp;gt;&amp;amp;lt;head&amp;gt;&amp;amp;lt;/head&amp;gt;&amp;amp;lt;body&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;amp;lt;/body&amp;amp;gt;&amp;amp;lt;/html&amp;amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
I personally (Arshan) plan on developing this, but am happy to let someone take over who can focus more time on it.&lt;br /&gt;
&lt;br /&gt;
=== PHP version (beta early Spring 2008, rc1 Fall 2008) ===&lt;br /&gt;
We're aiming for a beta of a PHP version of AntiSamy to be available by early Spring 2008. The major hurdle is finding a suitable &amp;quot;HTML cleaner&amp;quot; like nekohtml in PHP. It needs to be capable of producing document fragments, not just entire HTML documents. For example, if I pass in &amp;lt;code&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;lt;/code&amp;gt; to the HTML cleaner, we can't have it send back &amp;lt;code&amp;gt;&amp;amp;lt;html&amp;gt;&amp;amp;lt;head&amp;gt;&amp;amp;lt;/head&amp;gt;&amp;amp;lt;body&amp;gt;&amp;amp;lt;i&amp;amp;gt;&amp;amp;lt;b&amp;amp;gt;This is a test&amp;amp;lt;/i&amp;amp;gt;&amp;amp;lt;/b&amp;amp;gt;&amp;amp;lt;/body&amp;amp;gt;&amp;amp;lt;/html&amp;amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Several members of the community have been in touch with us about working together, including the smart folks over at Zend.com.&lt;br /&gt;
&lt;br /&gt;
== Contacting us ==&lt;br /&gt;
There are two ways of getting information on AntiSamy: the mailing list, and contacting the project lead directly.&lt;br /&gt;
&lt;br /&gt;
=== OWASP AntiSamy mailing list ===&lt;br /&gt;
The first is the mailing list which is located at https://lists.owasp.org/mailman/listinfo/owasp-antisamy. The list was previously private and the archives have been cleared with the release of version 1.0. We encourage all prospective and current users and bored attackers to join in the conversation. We're happy to brainstorm attack scenarios, discuss regular expressions and help with integration.&lt;br /&gt;
&lt;br /&gt;
=== Emailing the project lead ===&lt;br /&gt;
&lt;br /&gt;
For content which is not appropriate for the public mailing list, you can alternatively contact the project lead, Arshan Dabirsiaghi, at [arshan.dabirsiaghi] at the [aspectsecurity.com] (s/ at the /@/).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Issue tracking ===&lt;br /&gt;
&lt;br /&gt;
Visit the [http://code.google.com/p/owaspantisamy/issues/list Google Code issue tracker]&lt;br /&gt;
&lt;br /&gt;
[[Category:OWASP Project]]&lt;br /&gt;
[[Category:OWASP Tool]]&lt;br /&gt;
[[Category:OWASP Download]]&lt;/div&gt;</summary>
		<author><name>Raboof</name></author>	</entry>

	</feed>