Telegram is an encrypted instant messaging app for iOS and Android devices. Obviously, I wouldn’t mention it on this blog if its crypto was perfect. In fact, it’s far from perfect. It’s almost horrifying.
I’m not sure if it was in response to some good criticism, but Telegram recently announced a crypto contest. Basically, if you can recover an email address that was encrypted with their app, you win $200,000 USD worth of Bitcoin.
This seems like a good thing. The massive award should get people looking for bugs in their code.
Unfortunately, the contest is useless. Neither users nor Telegram developers will learn anything from it. But Telegram will still be able to point to it and say, “Look! No one has won the contest, so our software is secure!” Naive users will believe it, and they will feel safe using dangerously broken encryption.
Why is the contest no good? To understand why, we need to learn how security is defined and analyzed in cryptography. This is a good opportunity to explain it, so that’s what I’ll do in the next section. If you already know what IND-CPA and IND-CCA mean, you can probably skip over this section.
What does “secure” mean?
In cryptography, we’re always reasoning about “security”, so we need to give the word “secure” a rigorous definition. This is done by defining security with respect to the capabilities of an adversary - a person (or computer) trying to break the system.
We have to define how much computing power the adversary has, what data they have access to, what data they can modify, how they can communicate with the honest users of the system, and so on.
Over the years, cryptographers have given names to various types of adversaries. We’ll go over the important ones in the next three sections.
Known Plaintext Attack (KPA)
Let’s start out by defining the adversary as having access to the system’s ciphertext (encrypted data), and let’s give them the full plaintext for a few ciphertexts. They have access to the ciphertexts and the plaintexts for some of them (these plaintext-ciphertext pairs are called known plaintexts), but they can not interact with the system in any way. We’ll call this model the Known Plaintext Attack (KPA).
This seems like a reasonable model at first. In the case of a communication encryption system, the adversary in this model is like someone who can sniff the network traffic but can’t change it or insert their own traffic.
As you could have probably guessed, this model is too weak. A system secure only in the KPA model is missing some very important properties that we’d like a “secure” system to have.
Imagine a system that always encrypts the same message to the same ciphertext. If two different messages are encrypted, their ciphertexts are different, but if the same message is encrypted twice, the ciphertexts are the same. An example would be a system that uses the same key to encrypt messages with AES in CBC mode with a zero Initialization Vector (IV).
The adversary is given a ciphertext and told that it’s either the encryption of “YEA” or “NAY”. Assuming the adversary is not given the encryption of “YEA” or “NAY” as one of the known-plaintext pairs, they can not figure out whether the ciphertext they have is the encryption of “YEA” or “NAY”. That’s good, a KPA adversary can’t break this system.
But what if you could make the system encrypt messages of your choice? Since all messages encrypt to the same thing, you could send “YEA”, get its ciphertext, and compare it to the one you’re trying to decrypt. If it’s the same, you know it is an encryption of “YEA”, otherwise it is an encryption of “NAY.”
This isn’t just a theoretical ability. More often than not, you can make the system encrypt a message of your choice. One real example is a TLS connection between a web browser and a google.com. If another website, say bing.com, includes an image on one of its pages from google.com, the browser will encrypt the image URL and send it to google.com.
The KPA model doesn’t agree with our intuitive notion of “security.” We need to make our adversary more powerful in order to make our definition of “security” stronger.
That’s the key point here: We give the adversary in our model more power, and this gives us a stronger definition of security. If you’re building a tank, you don’t test its armor with small arms fire, you shoot it with an RPG.
Chosen Plaintext Attack (CPA)
In the last section, we saw that if the adversary could encrypt messages of their choice, they could break a system that’s only KPA-secure. So what if we let the adversary make the system encrypt arbitrary messages as well?
This is the Chosen Plaintext Attack (CPA) model. The adversary can make the system encrypt messages of his or her choice. Even with that power, they should not be able to learn anything about another ciphertext.
This is a much more reasonable security model, but it’s still not good enough. A real adversary can modify ciphertexts.
Suppose there’s a bank system that encrypts money transfers with AES in CTR mode with a unique nonce for each message. This is secure in the CPA model, which is good, because a user can make their own transfers to get chosen plaintext pairs.
Suppose the transaction orders look like this:
"Transfer $100 from Alice to Bob"
Encrypted, one might look like:
The encryption (CTR mode) is malleable, so a real adversary can change the last part (highlighted with stars):
Because of the change, it will be decrypted to:
"Transfer $100 from Alice to Eve"
So, by changing one of the ciphertexts, the adversary can redirect the money that Alice was sending to Bob to Eve. Obviously, we need to strengthen our security model even more. We have to be sure that even when an adversary can modify ciphertexts, the system is still secure.
Chosen Ciphertext Attack (CCA)
This brings us to the last security model we will discuss, called the Chosen Ciphertext Attack (CCA) model. In this model, the adversary is also allowed to choose ciphertexts and gets to see their decryptions. The adversary can essentially do anything they want besides telling the system to decrypt the actual ciphertext they are trying to decrypt. The adversary can modify messages in transit, drop messages, replay messages, etc.
The CCA model is the standard we hold all cryptography systems to today. If someone proposes a system that is not secure in the CCA model, it will be rejected (except in some very specific circumstances, like encrypted databases).
By going through the attack models, I wanted to get across an idea that’s very important when reasoning about crypto security. I’ll repeat it again: If you want to show that a system is secure, give the adversary as much power as possible, and if they still can’t break it, the security is good. Knowing that the system is secure in a weak model, like KPA, does not mean it’s secure in a stronger (and more realistic) model like CCA. If a tank is bullet-proof, it might not be RPG-proof.
Okay, enough crypto theory. Back to the real problem: Telegram’s contest. The contest works like this:
Every day, Paul sends a message to Nick containing an email address. You win the contest by sending an email to that address. You get a transcript of the network traffic coming in and out of Paul’s account. According to the faq, you can send arbitrary packets to the server, but you can’t intercept/modify the communication.
The problem should be clear now: Telegram’s contest does not give the adversary enough power. The adversary doesn’t doesn’t get known plaintexts, can’t choose plaintexts, can’t choose ciphertexts, can’t modify network traffic, or anything like we covered in the previous sections. The contest barely fits into the known plaintext attack (KPA) model.
If nobody wins the contest, it does not mean Telegram is secure. It means Telegram might be secure within the constraints of the contest. However, there are extremely weak systems that can survive a Telegram-style contest, so if nobody wins the contest, it won’t give us any more confidence in Telegram’s security.
So how good is Telegram’s crypto?
Telegram’s design seems to disregard all of the important crypto research from the past two decades. Their protocol is described here, and here’s a diagram of their encryption:
Some problems are immediately apparent:
- They use the broken SHA1 hash function.
- They include a hash of the plaintext message in the ciphertext. Essentially, they are trying to do “Mac and Encrypt” which is not secure. They should be doing “Encrypt then Mac” with HMAC-SHA512.
- They rely on an obscure cipher mode called “Infinite Garble Extension.”
- Some really weird stuff about factoring 64-bit integers as part of the protocol.
- They do not authenticate public keys.
If their protocol is secure, it is so by accident, not because of good design. They claim the protocol was designed by “six ACM champions” and “Ph.Ds in math.” Quite frankly, the protocol looks like it was made by an amateur. The tight coupling between primitives suggests the designer was not familiar with basic constructs, like authenticated encryption, that you can find in any cryptography textbook.
What should Telegram do?
Telegram’s crypto is bad, and it needs to be scrapped. I know it’s tough to throw away all of that work, but if they want to build a trustworthy product, it’s what they need to do. Their protocol is already too complex to analyze (let alone prove secure), and adding band-aid fixes is only going to make it worse.
They should switch to an existing well-studied protocol like the one used by TextSecure. They need to bring in a real cryptographer to audit their design (and design process), and they need to make sure the programmers they’ve hired are qualified to write crypto code (most programmers are not).
If telegram wants, they can email me and I’ll offer as much advice as I can. I think their hearts are in the right place, they just goofed on the crypto.
Telegram has responded to this post. I sincerely appreciate the time they took to reply, and I apologize for making them sign up for a Tumblr account to do it.
Here’s a summary of their response and my replies.
Telegram notes in the contest FAQ that the scope of the contest will expand to active attacks if nobody wins this iteration. They should skip right to a contest that allows active attacks. The only reason not to is to avoid losing $200,000. It would not be more difficult for them to do. They would not need to set up any infrastructure for the contest except for a clear description of their threat model and criteria for an adequate attack demonstration.
Telegram justifies their use of SHA1 by noting that it has not been broken in practice (although there are theoretical attacks, so it is cryptographically broken), and that SHA1 is faster. That’s not a good reason to use SHA1, especially since according to benchmarks, SHA256 less than 1.5x slower. The bottleneck is almost always in the network, not the encryption.
Telegram explains their use of Mac and Encrypt on their FAQ. They do not give a reason to use their system over the standard encrypt-then-HMAC, which is well-understood by cryptographers and has been proven secure.
According to their Tumblr reply, the 64-bit factorization is part of a DoS-protection scheme. Presumably it’s just a proof of work and isn’t part of their crypto. However, factoring is a poor choice for a proof of work, since it’s vulnerable to lookup tables. A Bitcoin-style proof of work would have been better.
In the FAQ, Telegram attempts to argue for their system’s security under KPA, CPA, and CCA. Their arguments are not convincing. They mistake the security of the primitive (AES in IGE mode) for the security of their use of the primitive. Just because AES in IGE mode is secure against known plaintext attacks does not mean the whole construction is. You have to look at the bigger picture.
Telegram’s reason for creating their own protocol is as follows:
In order to achieve reliability on weak mobile connections as well as speed when dealing with large files (such as photos, large videos up to 1Gb and — coming soon — documents), MTProto uses an original approach.
In other words, they designed their own crypto for speed and reliability. I don’t see how encrypt-then-HMAC is incompatible with reliability. As for speed, encrypt-then-HMAC-SHA1 would be faster, and encrypt-then-HMAC-SHA256 would not be significantly slower. @zooko points out that the BLAKE2 hash function is as fast or faster than SHA1, and much more secure.
I stand by my belief that their protocol is not well-designed and should be scrapped. A good design should not need much explanation, and Telegram’s FAQ is still growing.