Our paper on TextSecure’s cryptographic protocol was published on ePrint over the weekend and it appears that people actually read it, which is something I, as one of the authors, am quite happy about.
It has also been posted to the ModernCrypto messaging mailing list, among others, which is run by Trevor Perrin, who authored the initial axolotl protocol, (a variant of) which is used in TextSecure. Moxie Marlinspike has also commented on our findings on said mailing list. I would love to reply to his remarks there, but my subscription request to the list is still waiting for approval, so I’ll just post my remarks here. (EDIT: Turns out there is no approval necessary, it’s just that my university’s mail server delayed the confirmation message by a few hours.)
First of all I’d like to point out that to me the main takeaway from the paper is that TextSecure’s protocol can actually be proven to fulfil its claims, albeit so far just in the Random Oracle Model and under the assumption that we have a better way to establish entity authenticity than TextSecure currently provides. For something that I use to keep my conversations private, this is a rather good result and, from my perspective, way stronger than just having to trust someone’s experience in building and breaking crypto implementations, however significant that experience may be.
While developing the proof, we found that we need to require authenticity of the communicating entities (see above). There, we stumbled upon the issue that communication partners do not need to show their knowledge of the private key that corresponds to the public key they present the fingerprint of. As Moxie has pointed out in his post to the ModernCrypto (MC) list, this situation is not uncommon and does not only affect TextSecure. However, at that point we also started looking into the question of which issues can arise from this, and this is how we found that an Unknown Key-Share (UKS) attack exists. The scenario we used to illustrate the issue is that of a prank. The reason, bluntly, is that we did not want to discuss politics or general societal issues in a technical paper. I’m pretty sure, however, that I can imagine scenarios where someone is coerced to behave like Bart does in our example and the outcome is less funny than having Nelson ruin your birthday party.
For these (obviously rare) occasions we thought a fix would be appropriate. The easiest fix would be to include addresses in the tag of every message. In his post to the MC list Moxie claims that “We were actually the ones to suggest that to them” (them being us).
I believe that this is not correct. While I’m glad that both Moxie and we agree that this is a possible fix, I believe that we were the first to suggest including addresses in the tag in this context. I just checked our email conversations with Moxie:
I contacted Moxie on the 8th of July 2014, after we thought we had understood the protocol correctly, to verify that this was indeed the case. Beyond this rather essential question, we discussed explicitly signing prekeys and the purpose of the signalling key, and I asked about the length-reduced 8-byte MAC. Moxie’s reply did not contain a suggestion to include any addresses or identifiers in the tag, and we did not receive further emails from Moxie or anyone else on the TextSecure team. On the 25th of July 2014 I sent Moxie a preliminary version of the paper containing a description of the issues we had found so far and the mitigations we proposed. In the “Mitigation of UKS” section, this version already contained the following sentences: “Intuitively, if both, P_a’s and P_e’s identity, are protected by the tag then the attacks above do not longer work. As identities we propose to use the respective parties’ phone numbers, as they represent a unique identifier within the system.”
That is why I think it was our suggestion in the first place; prove me wrong and we will gladly attribute it otherwise. But that is just an aside.
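As a rough illustration of what binding both identities into the tag could look like, here is a minimal Java sketch. It is not TextSecure's code: the class IdentityBoundTag, the length-prefixed field encoding, the order of the fields and the placeholder phone numbers are all assumptions made for this example. Only the general idea (both parties' phone numbers are covered by the 8-byte tag) comes from the text above.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;

// Hypothetical illustration, not TextSecure's actual implementation.
class IdentityBoundTag {

    // Length-prefix each field so that field boundaries are unambiguous
    // (an assumed encoding; any injective encoding would do).
    private static void updateField(Mac mac, byte[] field) {
        int n = field.length;
        mac.update(new byte[] {(byte) (n >>> 24), (byte) (n >>> 16), (byte) (n >>> 8), (byte) n});
        mac.update(field);
    }

    // HMAC-SHA256 over sender id, receiver id and ciphertext, truncated to 8 bytes.
    static byte[] tag(byte[] macKey, String sender, String receiver, byte[] ciphertext) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
            updateField(mac, sender.getBytes(StandardCharsets.UTF_8));
            updateField(mac, receiver.getBytes(StandardCharsets.UTF_8));
            updateField(mac, ciphertext);
            return Arrays.copyOf(mac.doFinal(), 8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The verifier recomputes the tag using the identities *it* believes in.
    static boolean verify(byte[] macKey, String sender, String receiver,
                          byte[] ciphertext, byte[] received) {
        return MessageDigest.isEqual(tag(macKey, sender, receiver, ciphertext), received);
    }

    public static void main(String[] args) {
        byte[] key = new byte[32]; // placeholder key, for the demo only
        byte[] ct = "party at five".getBytes(StandardCharsets.UTF_8);
        // The sender believes it is talking to ...0002.
        byte[] t = tag(key, "+15550000001", "+15550000002", ct);
        System.out.println(verify(key, "+15550000001", "+15550000002", ct, t));
        // A different party holding the same key checks against its own number.
        System.out.println(verify(key, "+15550000001", "+15550000003", ct, t));
    }
}
```

In the prank scenario the sender and the unintended recipient share the MAC key, so a tag over the ciphertext alone would verify. With the identities included, the recipient checks the tag against its own phone number and the forwarded message is rejected.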
However, this fix does not solve the underlying issue with not having properly established that both parties know all parts of the key they present as theirs. We thus propose two different authentication methods that actually require each party to show knowledge of the private key that corresponds to the public key it presents the fingerprint of. We need this property for the proof to hold.
Later in the thread on MC Brian Smith writes that “the TextSecure protocol might be close to something that can be formally proved, and that it is worth considering the role of formal proofs in the design and development of TextSecure”, which is rather straight to the point. Also, in my opinion, if something in the paper needs attacking, it is the proof. Seeing whether it holds is something every user of TextSecure will benefit from. Obviously, we think it does hold, which means that while we require a slightly different way for two parties to verify their respective keys, the remainder of the key exchange and messaging protocol is sound.
Nevertheless, I’ll address Moxie’s other remarks below:
Regarding the way HMAC is used in the key derivation: I just got back to Moxie on that. I also want to address this here, but I’d like to check back with the other authors first, so bear with me.
When pointing out what we considered an uncommon way of using HMAC, we were referring to the deriveSecrets method in crypto/kdf/HKDF.java, which creates an empty byte array salt that the extract method then uses as the key to a new mac instance (mac.init(new SecretKeySpec(salt, "HmacSHA256"));). The getMessageKeys method in crypto/ratchet/ChainKey.java calls deriveSecrets in this way when the initial session key for two parties is derived. We expected the single unknown (and uniformly distributed) value that exists here, which is the result of the DH key agreement between the two parties, to be used as the key to the HMAC. Intuitively, however, the outcome of doing this otherwise cannot be worse than the common practice of hashing a key to get an output of uniform length. In fact, Dodis et al. have shown that using an HMAC instead yields a better result. Krawczyk built on this and showed that the above HMAC, as utilized in the HKDF construction, which is also used in TextSecure, is suitable as a computational extractor under various assumptions, among them the idealized modelling of compression functions as a Random Oracle. We followed the same assumption in our proof, where we resort to the Random Oracle methodology.
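To make the key and message roles concrete, the following is a minimal sketch of the generic HKDF extract-then-expand construction as specified in RFC 5869. It is not a copy of TextSecure's HKDF.java. The class name HkdfSketch and the method deriveWithZeroSalt are hypothetical, and the salt is assumed to be 32 zero bytes, which is the RFC's default when no salt is given.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

// Hypothetical illustration of RFC 5869 HKDF with HMAC-SHA256.
class HkdfSketch {
    static final int HASH_LEN = 32; // SHA-256 output size in bytes

    static byte[] hmac(byte[] key, byte[]... parts) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            for (byte[] part : parts) {
                mac.update(part);
            }
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Extract: the (public) salt is the HMAC key, the secret input is the message.
    static byte[] extract(byte[] salt, byte[] inputKeyMaterial) {
        return hmac(salt, inputKeyMaterial);
    }

    // Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and cut to length.
    static byte[] expand(byte[] prk, byte[] info, int length) {
        if (length < 0 || length > 255 * HASH_LEN) {
            throw new IllegalArgumentException("invalid output length");
        }
        byte[] output = new byte[length];
        byte[] block = new byte[0];
        int written = 0;
        for (int i = 1; written < length; i++) {
            block = hmac(prk, block, info, new byte[] {(byte) i});
            int n = Math.min(block.length, length - written);
            System.arraycopy(block, 0, output, written, n);
            written += n;
        }
        return output;
    }

    // Assumed default: an all-zero salt of hash length, as in RFC 5869.
    static byte[] deriveWithZeroSalt(byte[] inputKeyMaterial, byte[] info, int length) {
        byte[] salt = new byte[HASH_LEN];
        return expand(extract(salt, inputKeyMaterial), info, length);
    }

    public static void main(String[] args) {
        byte[] dhResult = new byte[32]; // stand-in for a DH shared secret
        byte[] keys = deriveWithZeroSalt(dhResult, new byte[0], 64);
        System.out.println(keys.length);
    }
}
```

In the extract step the known salt is the HMAC key and the unknown DH result is the message, which is the arrangement the paragraph above calls uncommon at first sight. From the expand step onwards, the pseudorandom key produced by extract is used as the HMAC key in the usual way.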
Regarding the security of truncated SHA2-256:
Moxie further writes: “Since this is extremely common practice, is approved by NIST, and is approved in the HMAC RFC, I think we’re in good shape.”
In NIST’s Special Publication 800-56A (Table 1), we find the recommendation for Minimum MacLen (for use in key confirmation) to be at least 80 bits. At least in the first message two parties exchange per session in TextSecure, one of the uses of the tag is indeed key confirmation.
Moxie then states that he’d “like to hear more about were your see an practical attack”. We have at no point suggested that we have a practical attack on the shortened MAC. However, the short length of 64 bits is something we consider noteworthy, also given the NIST recommendation of 80 bits cited above.
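To put the two lengths side by side, here is a small Java sketch that truncates an HMAC-SHA256 output to 8 and to 10 bytes and compares the chance that a single blind guess at the tag is accepted. The class TruncatedMac and its methods are hypothetical names for this illustration, and the calculation assumes the MAC output behaves like a uniformly random string. It says nothing about a practical attack.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Hypothetical illustration of MAC truncation, not TextSecure's code.
class TruncatedMac {

    // HMAC-SHA256 over the message, keeping only the first tagBytes bytes.
    static byte[] tag(byte[] key, byte[] message, int tagBytes) {
        if (tagBytes < 1 || tagBytes > 32) {
            throw new IllegalArgumentException("tag length must be 1..32 bytes");
        }
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return Arrays.copyOf(mac.doFinal(message), tagBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Chance that one random guess matches a tag of the given length,
    // assuming the tag is indistinguishable from uniform random bits.
    static double blindGuessProbability(int tagBytes) {
        return Math.pow(2.0, -8.0 * tagBytes);
    }

    public static void main(String[] args) {
        byte[] key = new byte[32]; // placeholder key, for the demo only
        byte[] msg = new byte[] {1, 2, 3};
        System.out.println(tag(key, msg, 8).length * 8 + " bit tag");
        System.out.println(tag(key, msg, 10).length * 8 + " bit tag");
        // How much more likely a blind guess succeeds against 64 bits than 80 bits.
        System.out.println(blindGuessProbability(8) / blindGuessProbability(10));
    }
}
```

Under that uniformity assumption, a single guess succeeds with probability 2^-64 against the 8-byte tag and 2^-80 against a 10-byte tag, a factor of 2^16.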