Deserialization Cheat Sheet
Last revision (mm/dd/yy): 04/7/2018
Introduction

This article provides clear, actionable guidance for safely deserializing untrusted data in your applications.

What is Deserialization?

Serialization is the process of turning an object into a data format that can be restored later. People often serialize objects to save them to storage or to send them as part of communications. Deserialization is the reverse of that process: taking data structured in some format and rebuilding it into an object. Today, the most popular data format for serializing data is JSON. Before that, it was XML.

However, many programming languages offer a native capability for serializing objects. These native formats usually offer more features than JSON or XML, including customizability of the serialization process. Unfortunately, the features of these native deserialization mechanisms can be repurposed for malicious effect when operating on untrusted data. Attacks against deserializers have been found to allow denial-of-service, access control, and remote code execution attacks.

Guidance on Deserializing Objects Safely

The following language-specific guidance attempts to enumerate safe methodologies for deserializing data that can't be trusted.

PHP

WhiteBox Review

Check the use of unserialize() and review how the external parameters are accepted.

Python

BlackBox Review

If the traffic data ends with a dot (.), it may have been serialized with pickle, whose streams end with the "." STOP opcode.

WhiteBox Review

The following Python APIs are vulnerable to deserialization attacks. Search the code for the patterns below.

1. Uses of pickle/cPickle/_pickle with load/loads

    import pickle

    # Protocol-0 pickle stream that calls os.system('dir') when it is loaded.
    data = b"cos\nsystem\n(S'dir'\ntR."
    pickle.loads(data)

2. Uses of PyYAML with load

    import yaml

    # With an unsafe loader, this document calls os.system('ipconfig').
    # Older PyYAML releases accept yaml.load(document) without a Loader argument;
    # yaml.safe_load() rejects tags like this one.
    document = "!!python/object/apply:os.system ['ipconfig']"
    print(yaml.load(document, Loader=yaml.UnsafeLoader))

3. Uses of jsonpickle with encode or store methods

Java

The following techniques all help prevent attacks against deserialization of Java's Serializable format.

Implementation: In your code, override the ObjectInputStream#resolveClass() method to prevent arbitrary classes from being deserialized. This safe behavior can be wrapped in a library like SerialKiller.

Implementation: Use a safe replacement for the generic readObject() method. Note that this addresses "billion laughs" type attacks by checking input length and the number of objects deserialized.

WhiteBox Review

Be aware of the following Java API uses, which indicate a potential serialization vulnerability.

1. XMLDecoder with external user-defined parameters
2. XStream with the fromXML method (XStream versions <= 1.4.6 are vulnerable to the serialization issue)
3. ObjectInputStream with readObject
4. Uses of readObject, readObjectNoData, readResolve, or readExternal

BlackBox Review

If the captured traffic data includes 'ACED0005' (the hex encoding of the bytes AC ED 00 05 that begin a Java serialization stream), it may have been sent as a Java serialization stream.

Prevent Data Leakage and Trusted Field Clobbering

If there are members of the object graph that should never be controlled by end users during deserialization or exposed to users during serialization, they should be marked with the transient keyword.

Prevent Deserialization of Domain Objects

Some of your application objects may be forced to implement Serializable due to their hierarchy. To guarantee that your application objects can't be deserialized, a readObject() method that always throws an exception should be declared (with a final modifier):

    private final void readObject(ObjectInputStream in) throws java.io.IOException {
        throw new java.io.IOException("Cannot be deserialized");
    }

Harden Your Own java.io.ObjectInputStream

The java.io.ObjectInputStream class is used to deserialize objects, and its behavior can be hardened by subclassing it. The general idea is to override ObjectInputStream#resolveClass() to restrict which classes are allowed to be deserialized. In the example below, LookAheadObjectInputStream refuses to deserialize any type other than Bicycle:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InvalidClassException;
    import java.io.ObjectInputStream;
    import java.io.ObjectStreamClass;

    public class LookAheadObjectInputStream extends ObjectInputStream {

        public LookAheadObjectInputStream(InputStream inputStream) throws IOException {
            super(inputStream);
        }

        /**
         * Only deserialize instances of our expected Bicycle class
         */
        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
            if (!desc.getName().equals(Bicycle.class.getName())) {
                // InvalidClassException takes the class name first, then the reason.
                throw new InvalidClassException(desc.getName(), "Unauthorized deserialization attempt");
            }
            return super.resolveClass(desc);
        }
    }

More complete implementations of this approach have been proposed by various community members.
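As a rough illustration of how a look-ahead stream like the one above might be used, the following sketch generalizes the check to a set of allowed class names and runs it against one expected and one unexpected type. The names AllowListDemo, AllowListObjectInputStream, Unexpected, and readBicycleOnly, as well as the Bicycle fields, are hypothetical and chosen only for this example. It assumes Java 9 or later for Set.of.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.Set;

class AllowListDemo {

    /** Hypothetical domain class standing in for the article's Bicycle. */
    static class Bicycle implements Serializable {
        private static final long serialVersionUID = 1L;
        final String model;
        Bicycle(String model) { this.model = model; }
    }

    /** A serializable type the application never expects to receive. */
    static class Unexpected implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    /** Rejects every class descriptor whose name is not in the allow-list. */
    static class AllowListObjectInputStream extends ObjectInputStream {
        private final Set<String> allowed;

        AllowListObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
            super(in);
            this.allowed = allowed;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (!allowed.contains(desc.getName())) {
                throw new InvalidClassException(desc.getName(), "Unauthorized deserialization attempt");
            }
            return super.resolveClass(desc);
        }
    }

    /** Serializes an object with the standard Java mechanism (test helper). */
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    /** Deserializes data, allowing only the Bicycle class. */
    static Object readBicycleOnly(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new AllowListObjectInputStream(
                new ByteArrayInputStream(data), Set.of(Bicycle.class.getName()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Bicycle b = (Bicycle) readBicycleOnly(serialize(new Bicycle("road")));
        System.out.println("accepted: " + b.model);
        try {
            readBicycleOnly(serialize(new Unexpected()));
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.classname);
        }
    }
}
```

An allow-list must also name every non-String class reachable from the expected type's fields, since each of those class descriptors passes through resolveClass() as well.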
Harden All java.io.ObjectInputStream Usage with an Agent

As mentioned above, the behavior of java.io.ObjectInputStream can be hardened by subclassing it. When the code that performs the deserialization can't be changed, a Java agent can apply the hardening to every ObjectInputStream in the JVM instead.

Globally changing ObjectInputStream is only safe for blacklisting known malicious types, because it's not possible to know, for all applications, which classes are expected to be deserialized. Fortunately, very few classes are needed in the blacklist to be safe from all the attack vectors known today.

It's inevitable that more "gadget" classes will be discovered that can be abused. However, a large amount of vulnerable software is exposed today and in need of a fix. In some cases, "fixing" the vulnerability may involve re-architecting messaging systems and breaking backwards compatibility as developers move towards not accepting serialized objects.

To enable these agents, add a new JVM parameter:

    -javaagent:name-of-agent.jar

Agents taking this approach have been released by various community members.

A similar, but less scalable, approach would be to manually patch and bootstrap your JVM's ObjectInputStream.

Language-Agnostic Methods for Deserializing Safely

Using Alternative Data Formats

A great reduction of risk is achieved by avoiding native (de)serialization formats. By switching to a pure data format like JSON or XML, you lessen the chance of custom deserialization logic being repurposed towards malicious ends.

Many applications rely on a data-transfer object pattern that involves creating a separate domain of objects for the explicit purpose of data transfer. Of course, it's still possible that the application will make security mistakes after a pure data object is parsed.

Only Deserialize Signed Data

If the application knows before deserialization which messages will need to be processed, it could sign them as part of the serialization process. The application could then choose not to deserialize any message that doesn't have an authenticated signature.
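The signing approach described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it uses an HMAC-SHA256 tag (a symmetric message authentication code) in place of a public-key signature, assumes both sides share a secret key, and prepends the tag to the payload. The class name SignedPayload and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SignedPayload {
    private static final String ALGORITHM = "HmacSHA256";
    private static final int TAG_LENGTH = 32; // bytes in an HMAC-SHA256 tag

    /** Computes the HMAC-SHA256 tag of data under the shared key. */
    static byte[] hmac(byte[] key, byte[] data) throws GeneralSecurityException {
        Mac mac = Mac.getInstance(ALGORITHM);
        mac.init(new SecretKeySpec(key, ALGORITHM));
        return mac.doFinal(data);
    }

    /** Returns tag || payload, to be sent in place of the bare payload. */
    static byte[] sign(byte[] key, byte[] payload) throws GeneralSecurityException {
        byte[] tag = hmac(key, payload);
        byte[] message = new byte[TAG_LENGTH + payload.length];
        System.arraycopy(tag, 0, message, 0, TAG_LENGTH);
        System.arraycopy(payload, 0, message, TAG_LENGTH, payload.length);
        return message;
    }

    /** Returns the payload only if its tag verifies; throws otherwise. */
    static byte[] verify(byte[] key, byte[] message) throws GeneralSecurityException {
        if (message.length < TAG_LENGTH) {
            throw new SecurityException("Message too short to carry a tag");
        }
        byte[] tag = Arrays.copyOfRange(message, 0, TAG_LENGTH);
        byte[] payload = Arrays.copyOfRange(message, TAG_LENGTH, message.length);
        // MessageDigest.isEqual compares in constant time.
        if (!MessageDigest.isEqual(tag, hmac(key, payload))) {
            throw new SecurityException("Signature check failed; refusing to deserialize");
        }
        return payload;
    }

    public static void main(String[] args) throws Exception {
        // Demo key only; a real key would come from a secret store.
        byte[] key = "demo-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        byte[] message = sign(key, "serialized bytes".getBytes(StandardCharsets.UTF_8));
        byte[] trusted = verify(key, message); // only now hand the bytes to a deserializer
        System.out.println(new String(trusted, StandardCharsets.UTF_8));
    }
}
```

The important design point is the ordering: the tag is checked on the raw bytes before any deserializer sees them, so an unauthenticated message never reaches readObject() or its equivalent.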
Authors and Primary Editors

Arshan Dabirsiaghi - arshan [at] contrastsecurity dot org