Cryptography is a complex subject. There are many subtle issues that can be introduced if you don’t know what you are doing.
There is a common mantra: “don’t roll your own crypto”. This is because both inexperienced and experienced developers frequently build cryptographic systems that are insecure.
However, there has to be a line – when does it start becoming “rolling your own”? Particularly in embedded systems, there are times when custom protocols need to be used, and developers stray into the dangerous area of cryptography.
One of the most common mistakes we have seen is the use of unauthenticated encryption.
What is encryption?
Encryption is encoding a plaintext into a ciphertext using a key, with the goal of keeping the plaintext confidential.
Only someone with the correct key should be able to decrypt the ciphertext and turn it back into plaintext.
Encryption provides confidentiality. It stops someone working out what the message is.
So what’s the issue?
An attacker can modify the ciphertext and cause the plaintext to change. There is no inherent means in encryption to detect this change.
Encryption does not provide authenticity. You cannot check that the message is genuine and has not been tampered with.
What can an attacker do with this?
I’m going to describe one attack against unauthenticated encryption.
Many encryption algorithms only operate on fixed-size blocks of data – they are called block ciphers. To encrypt longer lengths of data, a mode of operation is used to apply the block cipher repeatedly.
One mode of operation is called CBC (Cipher Block Chaining). When encrypting the data, the previous ciphertext block is mixed into the current plaintext block using an operation called “exclusive OR”. In diagrams, this is denoted by a + in a circle.
There is also an input called the initialisation vector, or IV. This is a random input to the algorithm, and is intended to ensure that the ciphertext is different each time, even if the same plaintext is encrypted again with the same key. This prevents leaking information about the content.
The initialisation vector is transmitted alongside the ciphertext.
Decryption is similar. The previous ciphertext block is exclusive ORed with the output of the block cipher to obtain the plaintext.
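The chaining described above can be sketched in Python. This is only an illustration under stated assumptions: the standard library has no AES, so `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for a real block cipher (a small Feistel construction built on SHA-256, not something to use in practice). Padding is left out, so the plaintext must be a whole number of 16-byte blocks.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes, the same as AES


def xor(a: bytes, b: bytes) -> bytes:
    """Exclusive OR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function for the toy Feistel cipher (illustrative only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a real block cipher such as AES."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        # Mix the previous ciphertext block (or the IV) into the plaintext,
        # then run the result through the block cipher.
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Block cipher output XOR previous ciphertext block (or the IV).
        out += xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out


if __name__ == "__main__":
    key = bytes.fromhex("0123456789ABCDEF0123456789ABCDEF")
    iv = os.urandom(BLOCK)  # random, sent alongside the ciphertext
    message = b"A dog's breakfast".ljust(2 * BLOCK, b".")
    ciphertext = cbc_encrypt(key, iv, message)
    print(cbc_decrypt(key, iv, ciphertext) == message)
```

Note that `cbc_decrypt` never checks whether the IV or the ciphertext is the one the sender produced; it simply decrypts whatever it is given.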
Exclusive OR is a deterministic operation. If we look at a single bit, then it operates as follows:
A: 0 0 1 1
B: 0 1 0 1
O: 0 1 1 0
I always think of this as “if one input is high, invert the other input, otherwise leave it alone”.
The operation is carried out for each bit in a byte.
A: 0 1 0 1 1 0 0 1 (0x59)
B: 1 1 1 1 0 0 0 0 (0xF0)
O: 1 0 1 0 1 0 0 1 (0xA9)
What this means is that modifying one of the inputs to exclusive OR results in a predictable change to the output. And the operation can be easily reversed.
A: 0123456789ABCDEF
B: FFFF00FFF00F0FF0
O: FEDC459879A4C21F
If we now exclusive OR the output with one of the inputs:
A: FEDC459879A4C21F
B: FFFF00FFF00F0FF0
O: 0123456789ABCDEF
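The same round trip can be reproduced in a few lines of Python. This is a small sketch; the helper name `xor_bytes` is my own and not part of any library.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Exclusive OR two byte strings of equal length, byte by byte."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


a = bytes.fromhex("0123456789ABCDEF")
b = bytes.fromhex("FFFF00FFF00F0FF0")

out = xor_bytes(a, b)
print(out.hex().upper())       # FEDC459879A4C21F

# Applying the same input again reverses the operation.
print(xor_bytes(out, b) == a)  # True
```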
Hopefully that explains exclusive OR.
Let’s look back at how CBC uses this in decryption. In the first block, the IV is exclusive ORed with the output of the block cipher. The IV is transmitted alongside the ciphertext, and an attacker can modify both at will.
We can encrypt the string “A dog’s breakfast” using a key and the initialisation vector of all 0x00 (here on CyberChef).
Key: 0123456789ABCDEF0123456789ABCDEF
IV: 00000000000000000000000000000000
Plaintext: A dog's breakfast
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Of course, this can be decrypted (here on CyberChef).
If I change just one byte in the ciphertext, the whole 16-byte block containing it decrypts to garbage (here on CyberChef). There’s no way for me to predictably modify that block’s plaintext by changing its own ciphertext.
Key: 0123456789ABCDEF0123456789ABCDEF
IV: 00000000000000000000000000000000
Ciphertext: c7b2d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: .L...Q½êU...ì7Ò.t
But the attacker also has control over the IV. Let’s set the first byte of the IV to 0xFF (here on CyberChef). Only the first byte of the plaintext has changed!
Key: 0123456789ABCDEF0123456789ABCDEF
IV: FF000000000000000000000000000000
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: ¾ dog's breakfast
And it has changed predictably. The capital A (ASCII 0x41) has been exclusive ORed with 0xFF to become 0xBE (which decodes as ¾ although it’s above the normal ASCII range).
A: 0 1 0 0 0 0 0 1 (0x41)
B: 1 1 1 1 1 1 1 1 (0xFF)
O: 1 0 1 1 1 1 1 0 (0xBE)
This is a very high level of control! The attacker can now modify the plaintext without detection. Let’s try and significantly change the meaning of it.
The original message contained “A dog’s breakfast”. Can we change this canine feast into a feline one?
We exclusive OR the original plaintext with the desired one (here on CyberChef). Notice how the output is only non-zero for the characters we have changed.
Original: A dog's breakfast
Original (hex): 4120646f67277320627265616b66617374
Desired: A cat's breakfast
Desired (hex): 4120636174277320627265616b66617374
Output: 0000070e13000000000000000000000000
Pop the first 16 bytes of that output in as the IV to the decryption, and we’ve successfully changed the message (here on CyberChef). All of this without even knowing the key.
Key: 0123456789ABCDEF0123456789ABCDEF
IV: 0000070e130000000000000000000000
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: A cat's breakfast
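The attacker’s side of this can be sketched in Python. The helper name `forge_iv` is hypothetical. Note that the function never sees the key: it only needs the IV that was sent and a correct guess at the first block of plaintext.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def forge_iv(iv: bytes, known: bytes, desired: bytes) -> bytes:
    """Return an IV that makes the first block decrypt to `desired`.

    CBC computes plaintext = block_cipher_output XOR iv, so XORing the IV
    with (known XOR desired) applies the same change to the plaintext.
    """
    if not (len(iv) == len(known) == len(desired)):
        raise ValueError("iv, known and desired must all be one block long")
    return xor(iv, xor(known, desired))


BLOCK = 16
iv = bytes(BLOCK)  # the all-zero IV from the example above

# Only the first block is controlled by the IV, so truncate to 16 bytes.
known = b"A dog's breakfast"[:BLOCK]
desired = b"A cat's breakfast"[:BLOCK]

new_iv = forge_iv(iv, known, desired)
print(new_iv.hex())  # 0000070e130000000000000000000000
```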
Of course, the attacker needs to have knowledge of the plaintext to make use of this attack. However, it’s extremely common for some or all of the message to be known. For example, when we visit most websites, the first part of the response will be “HTTP/1.1 200 OK”. If this was only protected by CBC encryption, we could change that to “HTTP/1.1 404 No”, changing the behaviour of the browser (here on CyberChef).
This doesn’t just impact the first block of data either. After the first block, instead of the IV, the previous ciphertext block is used in the exclusive OR operation. The attacker can modify the ciphertext and end up controlling the plaintext.
This comes at a cost though – the previous plaintext block will be totally corrupted as a result.
To illustrate this, we can encrypt a longer block of text (here on CyberChef).
Let’s change “baud” to “cats”. We need to locate the correct place in the ciphertext. AES (the encryption algorithm we are using) works in 16-byte blocks. The word “baud” is 85 characters in, so it sits in the 6th block. We therefore want to modify the 5th block of ciphertext.
The exclusive OR is a bit more complex than last time – we now need to exclusive OR the ciphertext, the original text, and the desired text (here on CyberChef). But change those 4 bytes, and we change the word “baud” to “cats”.
The only issue is, as expected, the previous block has been entirely corrupted. Whilst in this case, it’s made part of the message nonsensical, it frequently has no impact when carrying out attacks.
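The modification of a later block can be sketched as follows. The function name `flip_later_block` is hypothetical, and the ciphertext used here is an all-zero placeholder rather than the real ciphertext from the example. I have also assumed that “85 characters in” is a zero-based offset.

```python
BLOCK = 16  # AES block size in bytes


def flip_later_block(ciphertext: bytes, offset: int,
                     known: bytes, desired: bytes) -> bytes:
    """Change `known` to `desired` at plaintext position `offset`.

    The bytes edited are at the same position in the PREVIOUS ciphertext
    block, which is what CBC XORs into the target block on decryption.
    That previous block will itself decrypt to garbage.
    """
    if len(known) != len(desired):
        raise ValueError("known and desired must be the same length")
    if offset < BLOCK:
        raise ValueError("the first block is controlled through the IV")
    if offset // BLOCK != (offset + len(known) - 1) // BLOCK:
        raise ValueError("the change must stay within one block")
    target = offset - BLOCK
    out = bytearray(ciphertext)
    for i, (k, d) in enumerate(zip(known, desired)):
        # ciphertext XOR original text XOR desired text
        out[target + i] ^= k ^ d
    return bytes(out)


placeholder = bytes(6 * BLOCK)  # stand-in for the real six-block ciphertext
modified = flip_later_block(placeholder, 85, b"baud", b"cats")

# Offset 85 is in the 6th block, so bytes 69..72 (5th block) are changed.
print(modified[69:73].hex())  # 01000117
```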
Are there worse problems?
The above issue allows an attacker to modify the plaintext without detection. This would be an issue in certain situations, such as lock/unlock messages to a door.
But not authenticating your encryption can lead to worse issues. A class of attacks called padding oracle attacks can let an attacker obtain the plaintext by sending a large number of specially crafted packets.
Block ciphers only operate on fixed-size blocks. If the data is shorter than a block, it must be padded. There are a number of ways of doing this, such as appending N bytes that each have the value N (e.g. 0x02 0x02 or 0x05 0x05 0x05 0x05 0x05). The decryption process may check whether this padding is correct, and respond differently in each case.
An attacker can exploit these differential responses to leak the plaintext. This can break the confidentiality of messages.
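The padding check that creates the oracle can be sketched in Python. I am assuming the scheme described above is the one usually called PKCS#7; the function names are my own. The important detail is that `pkcs7_unpad` fails in a way the caller can observe, and a receiver that reports that failure differently from any other error gives the attacker their oracle.

```python
BLOCK = 16


def pkcs7_pad(data: bytes, block: int = BLOCK) -> bytes:
    """Append N bytes of value N so the length is a multiple of `block`."""
    n = block - len(data) % block  # always 1..block, never 0
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes, block: int = BLOCK) -> bytes:
    """Strip the padding, raising ValueError if it is malformed."""
    if not data or len(data) % block:
        raise ValueError("bad length")
    n = data[-1]
    if n < 1 or n > block or data[-n:] != bytes([n]) * n:
        # If a remote attacker can tell this case apart from a valid
        # message, the receiver is acting as a padding oracle.
        raise ValueError("bad padding")
    return data[:-n]


padded = pkcs7_pad(b"A dog's breakfast")
print(len(padded))          # 32
print(padded[-1])           # 15
print(pkcs7_unpad(padded))  # b"A dog's breakfast"
```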
What’s the solution to this?
Encryption should always be authenticated. There are two common solutions to this:
- Add a Message Authentication Code (MAC). This is a keyed cryptographic checksum that provides authenticity and integrity.
- Use an authenticated mode of operation such as GCM.
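The first option can be sketched with Python’s standard `hmac` module. This is a minimal illustration rather than a complete design: the MAC key shown is a hypothetical value and is assumed to be separate from the encryption key, the tag covers both the IV and the ciphertext, and the receiver is assumed to check the tag before attempting any decryption.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA256 over the fixed-length IV followed by the ciphertext."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, iv: bytes, ciphertext: bytes,
               tag: bytes) -> bool:
    """Constant-time comparison; call this BEFORE decrypting anything."""
    return hmac.compare_digest(make_tag(mac_key, iv, ciphertext), tag)


# Hypothetical MAC key, distinct from the encryption key.
mac_key = bytes.fromhex("ffeeddccbbaa99887766554433221100")

iv = bytes(16)
ciphertext = bytes.fromhex(
    "c7b1d96f0f520f33faaccfdc107f718a"
    "afe8892c3a29c76b0732a760a0f54f50"
)
tag = make_tag(mac_key, iv, ciphertext)

print(verify_tag(mac_key, iv, ciphertext, tag))         # True

# The IV that turned the dog's breakfast into a cat's is now rejected.
forged_iv = bytes.fromhex("0000070e130000000000000000000000")
print(verify_tag(mac_key, forged_iv, ciphertext, tag))  # False
```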
Even with this advice, there are many pitfalls. Applying the authentication and encryption in the wrong order can lead to weaknesses; this is so common that it has been dubbed the Cryptographic Doom Principle.
Generally, developers shouldn’t be working with cryptography at this level unless they are suitably skilled. That’s easy to say, harder to put into action. There is a big movement to make use of secure-by-default cryptographic libraries and APIs that provide developers with useful functions without giving them so much rope they can hang themselves.
There are scant few reasons for not authenticating encryption.