Write Crypto Code! Don’t publish it!

Today I’m talking about this code. It’s PHP code that’s supposed to do encryption. It’s ugly, broken, and insecure, but we can learn from it.

Thanks to @voodooKobra for pointing me to the code.

Open it in another tab and take a quick look through it. You’ll see that there are two classes. One is called hash_encryption, and the other is called blowfish_encryption. I will only talk about hash_encryption in this post.

The comment in hash_encryption says that it is “Simple but secure encryption based on hash functions” and that it “provides a block cipher in ofb mode (output feedback mode)”. As you can easily guess, none of that is true, except that it is “based on hash functions.”

Initialization

To use the class, you have to first call hash_encryption(), passing the secret key as an argument. Inside this method, a “random” “salt” is generated and a class variable called hash_key is set to the hash of the key parameter. After you’ve done that, you can call encrypt() and decrypt() to encrypt and decrypt strings.

Generating the “Salt”

The salt is generated like this:

$this->salt = sha1('#num::'.md5(''.time()).'->'.'Date:'.date('Y-m-d H:i:s O')).'@'.md5('This is the RANDOM salt generator which must be used ! @$#&(#&)(*_*$^$*$%*$%*$^^^%&*$^($^%**!@#@^%#:"{}<>:?"');

The only thing that changes in this line of code is what time() and date() return, both of which are extremely predictable. It looks like someone tried to make it a little more random by typing some symbols into the string, but it doesn’t help, since they’re always the same.

Encrypting a String

The encryption looks even worse. First, it generates a random “Initialization Vector”, which is just the hash of the “salt”, a random number from rand() (which has been seeded with the time), and the contents of all global variables. Like the salt, it’s very easy to predict.

Next, it XORs that predictable IV with the key, and adds the result to the output. We don’t need to look any further to know that this encryption is useless. We can take the output, XOR it with the IV (which we can easily predict or guess), and get the key. But, for no good reason, I’ll go on to finish describing how the encryption works.

Next, a variable called $key is set to the value of the IV. To avoid confusion with what we were calling the “key” in the last paragraph, I’ll just keep calling this the “IV.” The process from here on is like a stream cipher, where the key stream is generated by the following process: The first block of the stream is the IV, then the second block is the hash of the IV and the first block of plaintext, and so on. It’s kind of like OFB mode, like the comment said, but not really. Again, since we can predict the IV, we can generate the keystream, so this is trivially broken.

Don’t Write Your Own Crypto

I could tell you that it should be using a cipher instead of a hash, that it needs to provide authentication instead of just secrecy, that it needs to use a cryptographically secure pseudo-random number generator, and all of those things. But I’ve mentioned all of those on Crypto Fails before. This time I want to talk about something different: Why you shouldn’t write your own crypto.

This is something a lot of people, including myself, have a hard time digesting. We’re hackers, and we like to do things ourselves. When cryptographers tell us that there’s a whole class of code we shouldn’t be allowed to write, we feel bad. It’s the ultimate argument from authority, and we don’t like authority.

Or: Don’t Release Your Crypto Code

I understand that point of view, so I’m going to rephrase the standard advice into: Don’t release your crypto code. Write whatever you want, but if you write crypto code, don’t release it. I hope cryptography experts will follow me by saying “Don’t release.” instead of “Don’t write.”

Edit: Just to be clear, in the context of closed-source software, you shouldn’t release the software at all! Perhaps it would be even better to say “Don’t use!”

Unknown Unknowns

Donald Rumsfeld said this:

There are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things that we don’t know we don’t know.

I hypothesize that most crypto errors, especially the ones at the design level, are “unknown unknowns” to the designer. I have many examples of this.

My audit of EncFS found that MACs were not being compared in constant time. I think “Do you have to compare MACs in constant time?” was an unknown unknown to the EncFS authors. As were “Is incrementing the IV safe with CBC mode?” and “Is a 64-bit MAC good enough?” Once you know what to ask, it’s easy to find answers. The authors didn’t, so as a result, there are vulnerabilities.

When I looked at how Synergy encrypts its network traffic, I found it to be insecure, because it reuses the same IV for both communication directions (client to server and server to client). Another unknown unknown: “Is it safe to reuse an IV?”

If you look at the other posts on this blog, you’ll find the same pattern. It’s reasonable to believe that most problems are caused by the authors’ unknown unknowns.

That’s why you shouldn’t release your crypto code. You don’t know what you don’t know. A cryptographer is a person whose set of unknown unknowns (about cryptography) is a lot smaller than yours or mine. An unknown unknown is a lot less likely to create a vulnerability in a protocol they design, or some code they write (but sometimes it still happens).

It takes decades of experience to adequately reduce your set of unknown unknowns. So, unfortunately, this “argument from authority” is valid after all. But please don’t be discouraged: Write your own crypto code, then break it, and eventually become the authority. Just don’t release it.