
Difference between revisions of "How to perform HTML entity encoding in Java"

From OWASP
m (Why was this article deleted)
 
(21 intermediate revisions by 6 users not shown)
Line 1: Line 1:
 
==Status==
 
==Status==
Released 14/1/2008
+
Deleted 9/16/2010
  
==Overview==
+
==Why was this article deleted==
  
Injection attacks rely on the fact that interpreters take data and execute it as commands. If an attacker can modify the data that's sent to an interpreter, they may be able to make it misbehave. One way to help prevent this from happening is to encode the attacker's data in such a way that the interpreter will not get confused. [http://www.w3.org/TR/html401/sgml/entities.html HTML entity encoding] is just such an encoding mechanism for many interpreters.
+
HTML Entity Encoding is not enough to stop XSS in web applications. Please see [[XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet]] for more information.
 
 
This is not a guarantee, by the way. It's almost certain that someone, probably from the XML/Web Services world, will create an engine that performs HTML entity decoding automatically, thus reintroducing the injection threat. For the time being, however, HTML entity encoding seems to work pretty well to prevent many types of injection.
 
 
 
==Status==
 
Released 14/1/2007
 
 
 
==Approach==
 
 
 
We're going to implement a simple little method that encodes special characters. The nice .NET folks over at Microsoft had the foresight to build this into their platform, but the Java community seems to resist adding validation to the Java EE environment despite all the security issues that it could solve. View layers such as JavaServer Faces, Spring MVC, WebWork and others automatically perform HTML encoding through custom tags.
 
 
 
The best place for this method is in some kind of ValidationEngine, but since it's a good candidate for being static, it doesn't much matter which class it ends up in.
 
 
 
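Note that this implementation doesn't produce named entities such as &lt; or &gt;. It emits a decimal numeric character reference (for example &#60; for the less-than sign) for every character that is not an ASCII letter or digit. Named entities are not difficult to add with a simple lookup table.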
 
 
 
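    public static String HTMLEntityEncode( String s )
    {
        // A null input is treated like an empty string and yields "".
        if ( s == null )
        {
            return "";
        }
        // StringBuilder is enough here: the buffer never leaves this method.
        StringBuilder buf = new StringBuilder();
        for ( int i = 0; i < s.length(); i++ )
        {
            char c = s.charAt( i );
            if ( c>='a' && c<='z' || c>='A' && c<='Z' || c>='0' && c<='9' )
            {
                // ASCII letters and digits pass through unchanged.
                buf.append( c );
            }
            else
            {
                // Everything else becomes a decimal numeric character reference.
                buf.append( "&#" ).append( (int) c ).append( ';' );
            }
        }
        return buf.toString();
    }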
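
The note above says that named entities could be added with a simple lookup table. One way this might look is sketched below. The class name <code>NamedEntityEncoder</code>, the method name <code>encode</code> and the four-entry table are assumptions made for illustration only; they are not part of the original article, and a real table would cover more of the HTML entity list. Characters without a table entry fall back to the same numeric encoding as the method above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, named here only for the sake of the example.
final class NamedEntityEncoder {

    // Illustrative lookup table: a small subset of the named HTML entities.
    private static final Map<Character, String> NAMED = new HashMap<>();
    static {
        NAMED.put('&', "&amp;");
        NAMED.put('<', "&lt;");
        NAMED.put('>', "&gt;");
        NAMED.put('"', "&quot;");
    }

    static String encode(String s) {
        // As in the method above, null is treated as an empty string.
        if (s == null) {
            return "";
        }
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            String named = NAMED.get(c);
            if (named != null) {
                // Prefer the named entity when the table has one.
                buf.append(named);
            } else if (c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9') {
                // ASCII letters and digits pass through unchanged.
                buf.append(c);
            } else {
                // Fall back to a decimal numeric character reference.
                buf.append("&#").append((int) c).append(';');
            }
        }
        return buf.toString();
    }
}
```

With this sketch, <code>NamedEntityEncoder.encode("<b>1 & 2</b>")</code> returns <code>&lt;b&gt;1&#32;&amp;&#32;2&lt;&#47;b&gt;</code>: the angle brackets and ampersand use the table, while the spaces and the slash still come out as numeric references.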
 
 
 
==Libraries==
 
* The Jakarta Commons Lang package has a generic class for performing a wide range of [http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html String escaping functions].
 
* jTidy includes an [http://jtidy.sourceforge.net/multiproject/jtidyservlet/apidocs/org/w3c/tidy/servlet/util/HTMLEncode.html HTMLEncode class] for performing HTML encoding.
 
 
 
[[Category:How To]]
 
[[Category:OWASP Java Project]]
 
[[Category:OWASP Validation Project]]
 

Latest revision as of 05:38, 17 September 2010

Status

Deleted 9/16/2010

Why was this article deleted

HTML Entity Encoding is not enough to stop XSS in web applications. Please see XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet for more information.