That's a fairly big problem I hadn't read about before, and the write-up is great. But I don't think the proposed solutions are realistic.
[...] all those web sites must use exactly the same format for
authentication challenges.
[...] the key cannot be used for any automated purpose [...]
[...] all applications which use the key for signing must include
such a context string, and context strings must be chosen to avoid conflicts.
[...] better than either of the above is to use a different key for each purpose.
That's shifting protocol problems to standardization and human caution. It may work, but I'm skeptical.
Maybe a change in the cryptographic protocol can fix the problem? For example, instead of the server sending a nonce N and the client returning S(N), the client picks its own random value R and returns R, S(R) and S(R^N), and the server verifies that the two signed components, when XOR'd, result in the same value it sent. This way the client only signs random data but can still authenticate.
This is not supposed to be a final solution, and is likely to have flaws, but I think it's a better direction to pursue.
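To make it concrete, here's a rough sketch of the exchange (the Ed25519 key and the Python library are just illustration choices of mine, not part of the proposal):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    # Server: pick a nonce N and send it to the client.
    N = os.urandom(16)

    # Client: pick a random R, sign R and R^N separately, send all of it back.
    client_key = Ed25519PrivateKey.generate()
    R = os.urandom(16)
    RN = xor(R, N)
    sig_R, sig_RN = client_key.sign(R), client_key.sign(RN)

    # Server: both signatures must verify, and the two signed values must XOR to N.
    pub = client_key.public_key()
    pub.verify(sig_R, R)            # raises InvalidSignature on failure
    pub.verify(sig_RN, RN)
    assert xor(R, RN) == N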
Consider the original protocol: server sends a t-bit nonce N, client signs it.
Suppose that Eve can passively observe Alice's authentications to Bob's server, but she can't actively MitM the connection. Her best generic attack is a birthday attack:
1. Eve observes 2^b authentications and remembers each (N, S(N)) pair.
2. Eve attempts to authenticate as Alice. If Bob sends an N she has seen before, she's in. If he sends a fresh N, she aborts and tries again.
Eve will succeed in step 2 after roughly 2^(t-b) attempts.
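Here's a toy simulation with deliberately small parameters, just to show the shape of the attack (the numbers and the stand-in "signature" are mine):

    import random

    t, b = 16, 6                     # toy sizes; real t would be much larger

    # Step 1: Eve records (N, S(N)) pairs from 2^b of Alice's authentications.
    seen = {}
    for _ in range(2 ** b):
        N = random.getrandbits(t)
        seen[N] = "S(%d)" % N        # stand-in for Alice's signature over N

    # Step 2: Eve retries until Bob happens to send a nonce she already has.
    attempts = 0
    while True:
        attempts += 1
        N = random.getrandbits(t)
        if N in seen:
            break                    # replay seen[N] and she's in
    print(attempts)                  # on the order of 2^(t-b) = 2^10 here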
Now consider the amended protocol. In each authentication, Eve observes two signatures: S(R) and S(R^N). Each is a signature over what is essentially a random t-bit number. This allows Eve to improve her attack:
1. Eve observes 2^b authentications. She stores two (X, S(X)) pairs per authentication: one for X = R and one for X = R^N. The size of the set is 2^(b+1).
2. Eve attempts to authenticate as Alice. Bob sends N. For each X in her set of observations, Eve calculates X' = X^N. If X' is also in the set, Eve wins. If not, she aborts and tries again.
Notice that Eve is now colliding N with the set of (X, X') pairs. Since this set grows with the square of Alice's authentication attempts, Eve will succeed after roughly 2^(t-2(b+1)) attempts.
This attack is plausible for sufficiently small t (say, 64) and sufficiently large b (say, 16-20). While b will be very small in web login scenarios, large b may be plausible in automated authentication scenarios.
This can obviously be defeated by choosing a sufficiently large t.
EDIT: (I'm not 100% certain about all the math. Corrections are welcome.)
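Same toy simulation, adapted to the amended protocol (again with tiny parameters of my own choosing):

    import random

    t, b = 24, 8                     # toy sizes

    # Step 1: per observed authentication, Eve keeps both signed values R and R^N.
    observed = set()
    for _ in range(2 ** b):
        N, R = random.getrandbits(t), random.getrandbits(t)
        observed.add(R)
        observed.add(R ^ N)          # 2^(b+1) signed values in total

    # Step 2: for Bob's fresh N, look for any observed pair (X, X^N).
    attempts = 0
    while True:
        attempts += 1
        N = random.getrandbits(t)
        if any((X ^ N) in observed for X in observed):
            break                    # replay S(X) and S(X^N) for that pair
    print(attempts)                  # roughly 2^(t - 2(b+1)) here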
That's a very cool attack. However, the client can sign R and R^N together, which I think defeats it. I didn't describe it as such because I was afraid "S(R+R^N)" may cause some confusion, and I ran some numbers and found the security level acceptable for 128-bit nonces and a few billion logins.
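Something like this, with "together" meaning one signature over the concatenation (my reading of it):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    key = Ed25519PrivateKey.generate()
    N = os.urandom(16)               # server nonce
    R = os.urandom(16)               # client randomness
    msg = R + xor(R, N)              # "S(R + R^N)": both halves under one signature
    sig = key.sign(msg)

    # Server: split the message, check the halves XOR back to N, verify the signature.
    key.public_key().verify(sig, msg)
    assert xor(msg[:16], msg[16:]) == N

Since the two halves are bound into a single signed message, Eve can no longer mix values observed in different sessions.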
And keep in mind this is not a complete authentication protocol. I'm just proposing a small change to fix the signing-not-random-nonce problem, so I assume the rest of the protocol context is kept.
Barring stupid mistakes on my part, if this is vulnerable to impersonation, so is the article's.
If Alice doesn't validate the identity of the server, that's absolutely a legitimate attack. But building a secure channel is out of scope and assumed solved here. The problem is how can Alice authenticate to Bob without exposing herself to signing arbitrary documents.
To be clear, Alice is intentionally authenticating to Eve in this scenario. So a secure channel does not impede the attack. For example, suppose Eve is Facebook and Bob is Google. Alice logs into Facebook, but what she doesn't know is that Facebook is passing her signatures directly through to Google. Now Mark Zuckerberg can like YouTube videos on Alice's behalf.
To prevent this, Alice must also sign the domain she's authenticating to. The article mentioned this, and maybe you had meant it to be implicit in your protocol, in which case: my bad.
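Concretely, that would just mean prefixing the signed message with the relying party's identity, e.g. (the exact framing with a NUL separator is my own):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    key = Ed25519PrivateKey.generate()
    N, R = os.urandom(16), os.urandom(16)

    # Bind the target domain into the signed message, so a signature produced
    # while "logging into Facebook" is useless when relayed to Google.
    msg = b"google.com" + b"\0" + R + xor(R, N)
    sig = key.sign(msg)

    # Google checks that the embedded domain is its own before accepting anything.
    key.public_key().verify(sig, msg)
    assert msg.split(b"\0", 1)[0] == b"google.com"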
Well, yeah, but as BoppreH said, that's still present in the original article's design. What BoppreH's solution doesn't allow (which is the problem the article raises) is the tricking of Alice into signing arbitrary data.
Imagine that the "check" protocol has a similar requirement: that the client xor the check with their own random value. Now we have the same problem again, as your signed message looks just like a valid check. You can make the attack increasingly unlikely by mutating your protocol in ways that you don't expect any other protocol also does, but this seems like security by obscurity unless done as part of an agreed standard which other protocol authors follow.
If the "check" protocol works like this, yes, we have a problem. But I think there are two very distinct types of "signatures" here: binding statements and proofs of possession. Because the client nonce was explicitly added so we don't sign intelligible data during proof of possession, I don't see why the "check" protocol, which is a binding statement, would have a similar requirement.
I agree completely. But it's an improvement, which shows how bad the current state is, and strictly better than relying on just standardization or caution.
(By the way, if you still think it's broken, I'm happy to improve it. "Wrestling with a pig"...)
Sure, as I said, any change to the protocol that "seems unlikely" to collide with any other protocol makes an attack less likely. But I think an easier way to accomplish that, compared to your proposal, would be to add a context string prefix. E.g. require the signed message starts with "RFC 123456 authentication protocol\0" or something. The article has a link to Adam Langley suggesting this for some uses of signatures in TLS. In comparison, the client-chosen XOR introduces a construct that seems like it could have subtle cryptographic implications, so seems riskier.
Of course, the context string is also vulnerable to, say, a colliding protocol that happens to ignore leading strings (which is not unheard of). So ideally you specify that "this key shall only sign messages that begin with a context string", and then you're pretty solid. This is effectively saying "the type of things signed by this key are arbitrarily-typed messages prefixed with a NUL-terminated human-readable string identifying their type", which satisfies my "only sign unambiguously-typed data" rule. :)
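For example, something along these lines (the context string here is a placeholder, and the wrapper just makes the "only sign messages that begin with it" rule explicit):

    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    CONTEXT = b"example authentication protocol v1\0"   # placeholder context string

    def sign_challenge(key, challenge: bytes) -> bytes:
        # This key only ever signs messages that start with the context string.
        return key.sign(CONTEXT + challenge)

    def verify_challenge(pub, sig: bytes, challenge: bytes) -> bool:
        try:
            pub.verify(sig, CONTEXT + challenge)
            return True
        except InvalidSignature:
            return False

    key = Ed25519PrivateKey.generate()
    challenge = os.urandom(32)
    assert verify_challenge(key.public_key(), sign_challenge(key, challenge), challenge)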
(If you're referring to my edit earlier: I had previously thought I saw an attack that wasn't there, and had noticed other people claiming brokenness, so I assumed they were pointing it out. But then I realized the problem I imagined wasn't there. Sorry about that.)
That's a nice alternative. Though generating random data and XOR'ing is as well-known and as primitive as you can get, and none of it is ever reused, so I have no preference for one over the other.
(if you are thinking "why not just sign the hash of the nonce without bothering with a key?", that's exactly how we sign documents, because they are long. You could sign the hash of the hash to avoid this conflict, but that's a hack)
This was the reason that made me reject both protobuf and thrift for my own project. I ended up developing my own protocol, which has to guarantee a normalized stream: there is only one way of encoding a given set of data, and any other way causes an error in the parser.
It can be embedded into streams, but it has a "strict" flag that forces the parser to throw an error if unexpected data is found in the stream. Optional tags not specified in the schema simply cannot be there, and all the tags must be encoded in a specific order.
I'm still looking for a simple protocol that provides a normalized representation that is always the same for the same set of data; I hate developing my own things and prefer to steal ready-made ones :-)
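To be concrete about the property I'm after, here's a toy example (not my actual protocol): fields sorted by tag, no unknown tags, so a given set of data has exactly one valid byte representation.

    SCHEMA = {1: "user", 2: "expires"}                 # the only tags allowed

    def encode(fields: dict) -> bytes:
        out = b""
        for tag in sorted(fields):                     # canonical order: by tag
            if tag not in SCHEMA:
                raise ValueError("unknown tag %d" % tag)
            value = fields[tag]
            out += bytes([tag]) + len(value).to_bytes(2, "big") + value
        return out

    def decode(data: bytes) -> dict:
        fields, i, last_tag = {}, 0, 0
        while i < len(data):
            tag = data[i]
            if tag not in SCHEMA or tag <= last_tag:   # strict: membership and order
                raise ValueError("non-canonical stream")
            length = int.from_bytes(data[i + 1:i + 3], "big")
            fields[tag] = data[i + 3:i + 3 + length]
            i += 3 + length
            last_tag = tag
        return fields

    blob = encode({2: b"2038-01-19", 1: b"alice"})
    assert encode(decode(blob)) == blob                # round trip is byte-identical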
You may also want to consider cbor: http://tools.ietf.org/html/rfc7049 It doesn't require canonicalization, but it has a suggested format for canonicalization, and libraries I've been using have made it reasonably possible to force field ordering.
Though this may be something of a tangent, since as kentonv says, canonicalization is only part of the dance; it all depends on what other actors do as well. :)
While requiring canonicalization may make the attack harder, I don't think it eliminates the problem entirely. You can define a canonical form of protobufs by simply saying that fields must appear in numeric order and unknown fields are not allowed. But, there's no reason you couldn't have a canonical protobuf that also looks like an ASCII message, you'd just have to think harder when constructing it (something I didn't care to do for this blog post :) ).
One problem with this article (that does not invalidate the main point that you should not reuse keys for different protocols) is using SSH authentication as an example.
In SSH, what gets signed in publickey (and hostkey) authentication is not controlled by the server (or the client). Instead of some random value provided by the other side, the signed data includes a session id that is derived from the result of the first DH key exchange and is thus unpredictable to either side of the connection.
> the signed data includes a session id that is derived from the result of the first DH key exchange and is thus unpredictable to either side of the connection
The unpredictability of DH output depends on the group parameters and public key validation performed by the peer. If the group contains small subgroups, Eve can send an element of small order and predict the DH output with high probability.
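For a classic safe-prime group p = 2q + 1 this validation is cheap; a toy sketch with tiny numbers (not taken from any RFC):

    p, q, g = 23, 11, 4            # 23 = 2*11 + 1; g = 4 generates the order-q subgroup

    def valid_peer_value(y: int) -> bool:
        if not (1 < y < p - 1):    # rejects 0, 1 and p-1 (order-1 and order-2 elements)
            return False
        return pow(y, q, p) == 1   # y must lie in the prime-order subgroup

    assert not valid_peer_value(1)          # would force the shared secret to 1
    assert not valid_peer_value(p - 1)      # order-2 element: secret is 1 or p-1
    assert not valid_peer_value(5)          # outside the order-q subgroup
    assert valid_peer_value(pow(g, 7, p))   # honest public values pass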
1) The DH group is negotiated by both peers from a fixed list.
2) The output of the key exchange is hashed together with various values that are not all controlled by the same side of the connection.
Also, after reading the relevant part of RFC 4253, I've found that the session id is conceptually an output of the key exchange, but in the DH case it's derived from the contents of the key exchange messages and not from the result itself (not that it makes much of a difference).
While having the message be server-specified makes an attack easier, it is not necessary. Ultimately, all you need is two protocols that can be mistaken for each other. SSH's session ID appears as the first few bytes of the signed message, so all you need is to find a protocol which happens to ignore those bytes or otherwise lets the attacker specify them. This is not unusual -- SSH itself, IIRC, will ignore leading "comment" lines at the start of a connection (though that is not in a context that gets signed; the point is that ignoring a prefix is not uncommon).
Yes, there's complete symmetry between signing and encrypting, thus your proposal is exactly equal to the article's.
The solution is much simpler. Do not sign text that you didn't create. You create a random timestamped text, encrypt it, and send both to the server.
Or use the Diffie-Hellman key exchange, which as a bonus isn't vulnerable to man-in-the-middle attacks (which both the article's algorithms and the one I just wrote are).
A timestamp doesn't prevent replay attacks, it just time-limits them. Also, requiring time sync tends to be problematic -- many people have incorrect clocks. Finally, just because your protocol uses a timestamp at some position doesn't mean some other protocol won't interpret those bits in some other way. Removing server-controlled bits makes an attack harder but not necessarily impossible.
I assumed the "time is now" was meant to prevent replays. But yes, it's great if you can establish a secure channel first with a session key generated based on randomness provided by both client and server and then, once the session is up, send a signature of the session key, signed by your long-term key. No timestamp needed, then (I think). (Insert obligatory reminder not to roll your own crypto.)
But this is beside the point. You still have the problem that you are signing a thing, and if the signing key isn't restricted to signing only this type of thing, there is a risk that the thing you are signing could have a different meaning when interpreted as a different type. Even if the client fully-controls the content of the thing to be signed, this is still possible. Perhaps you can dismiss it as "unlikely", but I'm not sure I'm comfortable doing so.
Any scheme that protects against MITM is used to establish the baseline encrypted channel. Then what follows is used to authenticate the client, right?
The message you generate will only work for the server you generate it for, and likewise, you must ensure you are securely delivering the message to the correct server. It's often useful to treat these as two distinct problems solved at different layers in the stack (1) establish secure channel to shared public server, (2) authenticate.
Channel security isn't the important thing: once you've securely delivered the authentication to the server, can that server use it to authenticate somewhere else?