Java serialization - The gift that keeps on taking (Part 2)

In the previous post we started to look at why Java serialization is required and what its necessary design points were.

In this entry, we'll examine these particular characteristics and design points to reveal a few unexpected consequences.

As always when looking at software designs that have been around for some time, it is important to consider the wider context in place at the time. So as we go through these unexpected consequences and later explore how Java serialization can be exploited, remember that Java was very new, the internet was just beginning and the concept of external cyber attacks was practically unknown. Viruses were a peculiar Windows thing, spread by the most common internet medium: email.

The original designers of Java serialization and those who support it now may curse at how it can be abused, but hindsight is always right, and at the time what was created was agreed to be a good practical solution.

Irreplaceable

It turns out that the design and implementation of Java serialization is so practical and useful that not using it is a challenge. With native Java serialization being so embedded in the design of the Java runtime and being so performant, alternative methods of serialization are often ignored simply because they are slower and harder to use.

Simple to use

Using Java serialization is simple. A few lines of Java code and a working serialization process is achieved. The only other requirement is that the Java class to be serialized implements the java.io.Serializable interface.

Three lines of Java code will serialize a Java object to a file. In this case the object is of some arbitrary type referenced by the variable hw.

File f=new File(PATHNAME);

ObjectOutputStream oos=new ObjectOutputStream(new FileOutputStream(f));

oos.writeObject(hw);

And three lines of code will deserialize an object from a file.

File f=new File(PATHNAME);

ObjectInputStream ois=new ObjectInputStream(new FileInputStream(f));

Object obj=ois.readObject();
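
For completeness, here is a minimal sketch of the same two steps as a runnable round trip. It is an illustration rather than part of the original example: the class names RoundTripSketch and Note and the temporary file are assumptions, and try-with-resources is added so the streams are closed even when something fails.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class RoundTripSketch {

    // Hypothetical payload type; any class implementing Serializable would do.
    static final class Note implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        Note(String text) {
            this.text = text;
        }
    }

    // Serialize the object to the file, closing the stream even on failure.
    static void write(File f, Object value) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(f))) {
            oos.writeObject(value);
        }
    }

    // Deserialize whatever object the file contains.
    static Object read(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(f))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("note", ".ser");
        f.deleteOnExit();
        write(f, new Note("hello"));
        Note copy = (Note) read(f);
        System.out.println(copy.text);
    }
}
```

Note that read() returns whatever the file describes. The cast to Note happens only after deserialization has finished, which matters for the rest of this post.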

Widespread and performant

This is so easy that native Java serialization is extremely widespread. A simple search on GitHub for "ObjectInputStream" found almost three million hits at the time of writing. The serialization process is highly optimized and built into the JVM. For almost all use cases it just works. So it is neither unexpected nor, at the time, undesired that native Java serialization is so widely used today, though by modern standards it is certainly an unhappy consequence.

As we're going to show, the widespread use of Java serialization without the necessary understanding or safety checks is a gift to the many bad actors out there. Many dangerous consequences come from the self-defining nature of a Java serialization stream. A design intended to deal with Java polymorphism is easily exploited unless the necessary code checks are in place.
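
One form such a code check can take, on Java 9 and later, is a deserialization filter that only lets named classes through. The sketch below is an assumption about how a check might look, not a complete defence: the class name FilterSketch and the chosen pattern are illustrative, while ObjectInputFilter.Config.createFilter and ObjectInputStream.setObjectInputFilter are the standard JDK calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

final class FilterSketch {

    // Allow Integer (and its serializable superclass Number), reject all else.
    static final String ALLOW_INTEGER_ONLY = "java.lang.Integer;java.lang.Number;!*";

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Reads one object, rejecting any class the pattern does not allow.
    static Object readFiltered(byte[] data, String pattern)
            throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(pattern);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            ois.setObjectInputFilter(filter);
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFiltered(serialize(Integer.valueOf(100)), ALLOW_INTEGER_ONLY));
        try {
            readFiltered(serialize(new ArrayList<String>()), ALLOW_INTEGER_ONLY);
        } catch (InvalidClassException e) {
            // ArrayList is not on the allow-list, so the stream is refused.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the design is that the decision is made from the class named in the stream, before that class gets a chance to run any of its own deserialization code.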

Beginner's guide to exploitation

The self-defining nature of Java serialization is at the core of serialization attacks. Let's see it in action. First, here is a simple class:

public class HelloSerialisedWorld implements Serializable {

  private String greeting=null;

  public HelloSerialisedWorld(String salutation) {

    if(salutation==null) 
     throw new IllegalArgumentException("missing salutation");

    this.greeting=salutation.toUpperCase();

  }

  public void printGreeting() {

    System.out.println(greeting);

  }

}

Serializing an instance of this object is straightforward. Here is a slightly contrived example so that we can see the object being constructed.

File f=new File(PATHNAME);

HelloSerialisedWorld hw=new HelloSerialisedWorld("Hello DevZone");

ObjectOutputStream oos=new ObjectOutputStream(new FileOutputStream(f));

oos.writeObject(hw);

Peeking inside the box

The file just written has a structured format that is defined in the Java Object Serialization Specification. The structure has a complex layout (we are serializing a graph of objects after all), so for this discussion we'll simplify the format into a table and skip irrelevant metadata.

Our serialized object, whose greeting the constructor stored in upper case as "HELLO DEVZONE", looks like this:

Field	Value
class name	HelloSerialisedWorld
field count	1
field type	L
field name	greeting
field value class name	java.lang.String
field string value	“HELLO DEVZONE”

At first glance the table looks very much as expected. Essentially there is one field, called "greeting", and its value is of type "String". On closer examination, however, the essentially polymorphic nature of Java and Java serialization is revealed. The "java.lang.String" entry simply describes the form of the data to follow. It has no relation to the declared type of the field "greeting".
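
To see that these names really are carried in the stream itself, the following sketch serializes a small object into memory and searches the raw bytes. It is illustrative only: StreamPeek and Greeter are hypothetical stand-ins for the class above, and decoding the bytes as ISO-8859-1 is just a convenient way to search for ASCII names, not a real parser for the format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

final class StreamPeek {

    // Hypothetical stand-in for the article's class, reduced to its one field.
    static final class Greeter implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String greeting;

        Greeter(String greeting) {
            this.greeting = greeting;
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // True if the first two bytes are the serialization magic number 0xACED.
    static boolean hasMagic(byte[] data) {
        int magic = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF);
    }

    // ISO-8859-1 maps every byte to one char, so ASCII names stay searchable.
    static String asText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new Greeter("HELLO DEVZONE"));
        String text = asText(data);
        System.out.println("magic header present: " + hasMagic(data));
        System.out.println("class name present:   " + text.contains("StreamPeek$Greeter"));
        System.out.println("field name present:   " + text.contains("greeting"));
        System.out.println("field type present:   " + text.contains("Ljava/lang/String;"));
        System.out.println("field value present:  " + text.contains("HELLO DEVZONE"));
    }
}
```

Everything the reader needs to rebuild the object, including which classes to load, comes from these bytes rather than from the code doing the reading.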

Assuming bad actors can change this data, what can they do?

Dangerous consequences

The field type 'L' means "Object", hence the presence of the class name field. Since this is a self-defining process, what happens if a different object type is used? Maybe we could add an Integer instead.

Field	Value
class name	HelloSerialisedWorld
field count	1
field type	L
field name	greeting
field value class name	java.lang.Integer
field integer value	100

This results in an exception during deserialization:

java.lang.ClassCastException: cannot assign instance of
java.lang.Integer to field HelloSerialisedWorld.greeting 
of type java.lang.String in instance of HelloSerialisedWorld

This is exactly the behavior expected and wanted. However, look closely at the message. It says "cannot assign instance of java.lang.Integer."

This message shows that an instance of Integer has already been created before being assigned. That may not mean much for an integer but consider what would happen for a more complex class - what code might get driven by the simple act of instantiation?
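
One way to watch this happen without hand-editing a hex dump is sketched below. It is an illustration under assumptions, not the tooling used for this post: the classes Noisy, HolderA and HolderB are hypothetical, and the forgery simply renames one class to another of the same name length inside the bytes. For a Serializable class the code that runs is typically its readObject hook rather than its constructor, so that is what Noisy uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

final class SwapSketch {

    // Hypothetical class whose deserialization hook has a visible side effect.
    static final class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;
        static boolean hookRan = false;

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            hookRan = true; // runs while the stream is still being read
        }
    }

    // Used only to write a stream whose "greeting" slot can hold any object.
    static final class HolderA implements Serializable {
        private static final long serialVersionUID = 1L;
        Object greeting;
    }

    // The class the reader expects: same field name, but typed as String.
    static final class HolderB implements Serializable {
        private static final long serialVersionUID = 1L;
        String greeting;
    }

    // Writes a HolderA, then renames it to HolderB inside the byte stream.
    static byte[] forgedStream(Object payload) throws IOException {
        HolderA a = new HolderA();
        a.greeting = payload;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(a);
        }
        byte[] data = bytes.toByteArray();
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int at = text.indexOf("HolderA");
        data[at + "HolderA".length() - 1] = (byte) 'B';
        return data;
    }

    // Deserializes the forged stream and reports what happened.
    static String attempt(Object payload) throws IOException, ClassNotFoundException {
        Noisy.hookRan = false;
        byte[] data = forgedStream(payload);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            ois.readObject();
            return "no exception";
        } catch (ClassCastException e) {
            return "ClassCastException, hookRan=" + Noisy.hookRan;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(attempt("a real String"));
        System.out.println(attempt(Integer.valueOf(100)));
        System.out.println(attempt(new Noisy()));
    }
}
```

In the last case the exception still arrives, but only after the payload's own code has already run, which is exactly the ordering the error message hints at.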

Cold hard reality

A prized objective of bad actors is to be able to execute code under their control. The fact that an exception may ultimately be thrown is immaterial if they manage to run the code they want during deserialization. In fact, a thrown exception can often mask their intention and cover their tracks.

Java serialization's self-defining protocol makes running code particularly straightforward to initiate. The structure of the original class is irrelevant. Once bad actors can force the instantiation of one object, they can force the instantiation of complex object graphs that are completely under their control.

Think back to the GitHub search earlier. Java serialization is used widely. It can be found in many frameworks and is used for sending data between systems in many ways. There is significant scope for exploitation. All it takes to provide a path in is a coding mistake or a misconfigured server.

Remember: all a bad actor needs is the ability to influence the serialization data.

Next time

We'll explore more about exploiting serialization data streams, and how it's possible to compromise systems silently and in different ways: from changing data to running arbitrary code or even crashing systems.

As this series suggests - Java serialization is the gift that keeps on taking.