Why XOR alone is an incredibly bad “encryption” technique

From HITB presentation

In Wilco Bann Hoffman’s presentation at HITB, he showed some slides with screenshots from Wireshark, a packet capture utility.

Down at the bottom, you can see a hex representation of the data in the packet on the left. On the right is an ASCII representation of the same hex data. “.” means “Yeah, that hex doesn’t mean anything to me”: the byte is not a printable ASCII character. When you see a lot of these, it means you are looking at something that is likely binary data or encrypted text.

This is a security protocol, so we might assume it is encrypted in some manner. It can be a royal pain to work out how something is encrypted, especially with such a small amount of data. But because the designer of this protocol has made a schoolboy error, anyone with any experience will immediately see how it is encrypted and what the key is. I am not exaggerating here: I saw it in seconds and knew others would too.

Look at the hex data. What do you see? There are an awful lot of 0xB6 values there, aren’t there?

Why would that be?

A really common building block in encryption is to XOR (eXclusive OR) the data with some key material. It’s a clever and fundamental tool used in nearly every single encryption algorithm.

Let’s look at a really quick example:

11010010
10101010 ⊕
--------
01111000

We bitwise XOR the values of the data (0xD2) with the key (0xAA) to get the “encrypted” data (0x78).

To decrypt, we just do the same thing:

01111000
10101010 ⊕
--------
11010010

Very clever, very clean, and also very easy to do quickly on most processors.
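The same round trip can be sketched in a couple of lines of Python, using the byte values from the example above. The variable names are just for illustration.

```python
data = 0b11010010  # 0xD2, the plaintext byte from the example
key = 0b10101010   # 0xAA, the key byte from the example

encrypted = data ^ key       # XOR with the key to "encrypt"
decrypted = encrypted ^ key  # XOR with the same key again to decrypt

print(f"{encrypted:08b} = 0x{encrypted:02X}")  # 01111000 = 0x78
print(f"{decrypted:08b} = 0x{decrypted:02X}")  # 11010010 = 0xD2
```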

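If the key material is different for every byte and is never reused (as in a one-time pad, or the keystream of a stream cipher), this can be an excellent way of encrypting. But if the key (0xAA) stays the same for every byte, we have a big problem.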

Why? What happens when we encrypt 0x00?

00000000
10101010 ⊕
--------
10101010

The ciphertext is the same as the key! Oh no. We’ve just leaked our key.

An awful lot of protocols end up with a lot of 0x00 in them. It’s a common padding value and a common string termination value. We just need to look at the packet in Wireshark, see all of those 0xB6, and we can be pretty certain that XOR is being used with a key of 0xB6.
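That observation is easy to automate. Here is a minimal sketch that guesses the key as the most common ciphertext byte, on the assumption that the most common plaintext byte is 0x00. The function name and the sample data are made up for illustration; they are not taken from the captured packet.

```python
from collections import Counter


def guess_single_byte_key(ciphertext: bytes) -> int:
    """Guess the key, assuming the most common plaintext byte is 0x00."""
    most_common_byte, _count = Counter(ciphertext).most_common(1)[0]
    return most_common_byte


# Hypothetical zero-padded message, "encrypted" with a fixed key of 0xB6.
plaintext = b"ID\x00\x00\x00\x0012\x00\x00\x00\x00"
ciphertext = bytes(b ^ 0xB6 for b in plaintext)

print(hex(guess_single_byte_key(ciphertext)))  # 0xb6
```

If the plaintext has no dominant 0x00 padding, this guess will be wrong, so it is a first thing to try rather than a guarantee.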

A very simple Python script

And this is the case. A quick Python script later, and mostly ASCII comes out.
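A script of this kind needs very little code. The sketch below is an assumption about what such a script might look like, not the one from the screenshot. The `captured` bytes are made up here by XORing a known message, and `printable` mimics Wireshark's habit of showing “.” for anything that is not printable ASCII.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)


def printable(data: bytes) -> str:
    """Render bytes Wireshark-style: '.' for anything not printable ASCII."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


KEY = 0xB6  # the key guessed from all those 0xB6 values

# Hypothetical capture: a made-up message XORed with the key.
captured = xor_bytes(b"HELLO\x00\x00\x00", KEY)

print(printable(captured))                  # ........
print(printable(xor_bytes(captured, KEY)))  # HELLO...
```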

Now we have some plaintext

Even if 0x00 wasn’t used so frequently, XOR with a fixed key does nothing to increase the entropy of the data. Every plaintext byte value always maps to the same ciphertext byte value, so if we used frequency analysis, we would get meaningful results. It’s also pretty easy to brute-force a single-byte key, since there are only 256 possible values, and you can even automate the detection of valid ASCII characters.
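The brute-force approach can be sketched as follows. Checking only for printable ASCII often produces ties between keys, so this sketch scores each candidate by the fraction of bytes that are letters or spaces, which is one reasonable choice among many. The function names and the sample message are made up for illustration.

```python
def text_score(data: bytes) -> float:
    """Fraction of bytes that are ASCII letters or spaces."""
    if not data:
        return 0.0
    hits = sum(1 for b in data if b == 0x20 or chr(b).isascii() and chr(b).isalpha())
    return hits / len(data)


def brute_force_single_byte_xor(ciphertext: bytes) -> tuple[int, bytes]:
    """Try all 256 keys and return the key and plaintext with the best score."""
    best_key = max(
        range(256),
        key=lambda k: text_score(bytes(b ^ k for b in ciphertext)),
    )
    return best_key, bytes(b ^ best_key for b in ciphertext)


# Hypothetical message with no 0x00 padding at all.
secret = bytes(b ^ 0xB6 for b in b"ALARM ZONE 3 TRIGGERED")

key, recovered = brute_force_single_byte_xor(secret)
print(hex(key), recovered)  # 0xb6 b'ALARM ZONE 3 TRIGGERED'
```

On very short or non-text payloads the score can pick the wrong key, so it is worth eyeballing the top few candidates.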

It is very worrying that this was used over the open internet for security purposes.
