Why XOR alone is an incredibly bad “encryption” technique

From HITB presentation

In Wilco Bann Hoffman’s presentation at HITB, he showed some slides with screenshots from Wireshark, a packet capture utility.

Down at the bottom, you can see a hex representation of the data in the packet on the left. On the right is an ASCII representation of the same data. “.” means “Yeah, that hex doesn’t mean anything to me”: the byte is not a printable ASCII character. When you see a lot of these, you are probably looking at binary data or encrypted text.

This is a security protocol, so we might assume it is encrypted in some manner. It can be a royal pain to work out how something is encrypted, especially with such a small amount of data. But because the designer of this protocol has made a schoolboy error, anyone with any experience will immediately see how it is encrypted and what the key is. I am not exaggerating here: I saw it in seconds and knew others would too.

Look at the hex data. What do you see? There are an awful lot of 0xB6 values there, aren’t there?

Why would that be?

A really common building block in encryption is to XOR (eXclusive OR) the data with some key material. It’s a clever and fundamental tool that turns up in almost every symmetric cipher.

Let’s look at a really quick example:

We bitwise XOR the values of the data (0xD2) with the key (0xAA) to get the “encrypted” data (0x78).
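As a minimal sketch of that calculation in Python (the variable names are mine, not from the presentation), with the bits written out so you can see which ones flip:

```python
data = 0xD2  # the plaintext byte from the example
key = 0xAA   # the key byte from the example

cipher = data ^ key  # ^ is Python's bitwise XOR operator

# Show each value in binary: a bit in the result is 1 only where
# the data and key bits differ.
print(f"{data:08b}  data   (0x{data:02X})")
print(f"{key:08b}  key    (0x{key:02X})")
print(f"{cipher:08b}  cipher (0x{cipher:02X})")
# 11010010  data   (0xD2)
# 10101010  key    (0xAA)
# 01111000  cipher (0x78)
```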

To decrypt, we just do the same thing:
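Sketched in Python with the same illustrative values, XORing the cipher byte with the key a second time gives back the original data, because any value XORed with itself is zero:

```python
key = 0xAA     # the same key byte as before
cipher = 0x78  # the "encrypted" byte from the example

plain = cipher ^ key  # XOR with the key again undoes the first XOR

print(f"0x{cipher:02X} ^ 0x{key:02X} = 0x{plain:02X}")
# 0x78 ^ 0xAA = 0xD2
```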

Very clever, very clean, and also very easy to do quickly on most processors.

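If the key is never reused, this can be an excellent way of encrypting: XOR with truly random key material that is as long as the message and used only once is a one-time pad. But if the key (0xAA) stays the same for every byte, we have a big problem.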

Why? Look at what happens when we encrypt 0x00:

The cipher text is the same as the key! Oh no. We’ve just leaked our key.
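Here is a small Python sketch of the leak. The message `b"OK"` and the six padding bytes are made up for illustration; only the 0xAA key comes from the example above.

```python
key = 0xAA

# XOR with zero changes nothing, so 0x00 "encrypts" to the key itself.
print(hex(0x00 ^ key))  # 0xaa

# A made-up message followed by zero padding, as many protocols use.
plaintext = b"OK" + b"\x00" * 6
cipher = bytes(b ^ key for b in plaintext)

# Every padding byte shows up on the wire as the key.
print(cipher.hex())  # e5e1aaaaaaaaaaaa
```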

An awful lot of protocols end up with a lot of 0x00 in them. It’s a common padding value and a common string termination value. We just need to look at the packet in Wireshark, see all of those 0xB6, and we can be pretty certain that XOR is being used with a key of 0xB6.
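What your eye does in Wireshark can be sketched in a few lines of Python: count the bytes and take the most common one as the likely key. The payload below is invented for the example (I don't have the real capture), and `guess_key` is a hypothetical helper that only works when 0x00 really is the most frequent plaintext byte.

```python
from collections import Counter


def guess_key(packet: bytes) -> int:
    """Return the most common byte, assuming it is 0x00 XORed with the key."""
    value, _count = Counter(packet).most_common(1)[0]
    return value


# Invented payload: a short field followed by zero padding,
# "encrypted" with the 0xB6 key seen in the screenshot.
plaintext = b"ID=1234" + b"\x00" * 9
packet = bytes(b ^ 0xB6 for b in plaintext)

print(hex(guess_key(packet)))  # 0xb6
```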

A very simple python script

And this is the case. A quick Python script later, and mostly ASCII comes out.
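The script in the slide is not reproduced here, but a decoder along these lines is all it takes. This is my own sketch, not the original script: `captured` is a stand-in payload I generated, not the bytes from the presentation, and `printable` mimics the way Wireshark shows non-printable bytes as dots.

```python
def xor_decode(data: bytes, key: int) -> bytes:
    """XOR every byte with the same single-byte key."""
    return bytes(b ^ key for b in data)


def printable(data: bytes) -> str:
    """Render bytes like a hex dump's ASCII column: '.' for non-printable."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


# Stand-in for the captured payload (made up for this example).
captured = xor_decode(b"USER=alarm\x00\x00\x00\x00", 0xB6)

print(printable(captured))                    # mostly dots before decoding
print(printable(xor_decode(captured, 0xB6)))  # USER=alarm....
```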

Now we have some plaintext

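Even if 0x00 wasn’t used so frequently, XOR with a fixed key does nothing to increase the entropy of the data: every plaintext byte always maps to the same ciphertext byte, so the byte frequencies survive unchanged. If we used frequency analysis, we would get meaningful results. It’s also pretty easy to brute-force a single-byte key, since there are only 256 possibilities, and you can even automate the detection of valid ASCII characters.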
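A rough sketch of that brute force in Python: try all 256 keys and keep the one whose output looks most like text. The scoring rule (count letters, digits and spaces) is my own simple assumption and the message is made up. Note that it contains no 0x00 bytes at all.

```python
import string

# Bytes we expect to dominate readable text: letters, digits and spaces.
LIKELY = set((string.ascii_letters + string.digits + " ").encode("ascii"))


def xor_bytes(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)


def score(candidate: bytes) -> int:
    """Count how many bytes look like ordinary text."""
    return sum(b in LIKELY for b in candidate)


def brute_force(ciphertext: bytes):
    """Try every single-byte key and return (key, plaintext) for the best score."""
    best_key = max(range(256), key=lambda k: score(xor_bytes(ciphertext, k)))
    return best_key, xor_bytes(ciphertext, best_key)


# Made-up message, "encrypted" with the key from the capture.
secret = xor_bytes(b"ALARM zone 3 triggered at 12:34", 0xB6)

key, plain = brute_force(secret)
print(hex(key), plain)  # 0xb6 b'ALARM zone 3 triggered at 12:34'
```

Counting only printable characters is not enough on its own, because several wrong keys also turn this message into printable junk. Scoring for letters, digits and spaces is still crude, but it is sufficient to single out the right key here.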

It is very worrying that this was used over the open internet for security purposes.
