That's a fairly big problem I hadn't read about before, and the write-up is great. But I don't think the proposed solutions are realistic.
[...] all those web sites must use exactly the same format for
authentication challenges.
[...] the key cannot be used for any automated purpose [...]
[...] all applications which use the key for signing must include
such a context string, and context strings must be chosen to avoid conflicts.
[...] better than either of the above is to use a different key for each purpose.
That's shifting protocol problems to standardization and human caution. It may work, but I'm skeptical.
Maybe a change in the cryptographic protocol can fix the problem? For example, instead of the server sending a nonce N and the client returning S(N), the client picks its own random value R and returns R, S(R) and S(R^N), and the server verifies that the two signed components, when XOR'd, result in the same value it sent. This way the client only signs random data but can still authenticate.
This is not supposed to be a final solution, and is likely to have flaws, but I think it's a better direction to pursue.
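To make it concrete, here's a rough sketch of the exchange (the Ed25519 key and the Python library are just illustration choices of mine, not part of the proposal):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    # Server: pick a nonce N and send it to the client.
    N = os.urandom(16)

    # Client: pick a random R, sign R and R^N separately, send all of it back.
    client_key = Ed25519PrivateKey.generate()
    R = os.urandom(16)
    RN = xor(R, N)
    sig_R, sig_RN = client_key.sign(R), client_key.sign(RN)

    # Server: both signatures must verify, and the two signed values must XOR to N.
    pub = client_key.public_key()
    pub.verify(sig_R, R)            # raises InvalidSignature on failure
    pub.verify(sig_RN, RN)
    assert xor(R, RN) == N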
Consider the original protocol: server sends a t-bit nonce N, client signs it.
Suppose that Eve can passively observe Alice's authentications to Bob's server, but she can't actively MitM the connection. Her best generic attack is a birthday attack:
1. Eve observes 2^b authentications and remembers each (N, S(N)) pair.
2. Eve attempts to authenticate as Alice. If Bob sends an N she has seen before, she's in. If he sends a fresh N, she aborts and tries again.
Eve will succeed in step 2 after roughly 2^(t-b) attempts.
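Here's a toy simulation with deliberately small parameters, just to show the shape of the attack (the numbers and the stand-in "signature" are mine):

    import random

    t, b = 16, 6                     # toy sizes; real t would be much larger

    # Step 1: Eve records (N, S(N)) pairs from 2^b of Alice's authentications.
    seen = {}
    for _ in range(2 ** b):
        N = random.getrandbits(t)
        seen[N] = "S(%d)" % N        # stand-in for Alice's signature over N

    # Step 2: Eve retries until Bob happens to send a nonce she already has.
    attempts = 0
    while True:
        attempts += 1
        N = random.getrandbits(t)
        if N in seen:
            break                    # replay seen[N] and she's in
    print(attempts)                  # on the order of 2^(t-b) = 2^10 here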
Now consider the amended protocol. In each authentication, Eve observes two signatures: S(R) and S(R^N). Each is a signature over what is essentially a random t-bit number. This allows Eve to improve her attack:
1. Eve observes 2^b authentications. She stores two (X, S(X)) pairs per authentication: one for X = R and one for X = R^N. The size of the set is 2^(b+1).
2. Eve attempts to authenticate as Alice. Bob sends N. For each X in her set of observations, Eve calculates X' = X^N. If X' is also in the set, Eve wins. If not, she aborts and tries again.
Notice that Eve is now colliding N with the set of (X, X') pairs. Since this set grows with the square of Alice's authentication attempts, Eve will succeed after roughly 2^(t-2(b+1)) attempts.
This attack is plausible for sufficiently small t (say, 64) and sufficiently large b (say, 16-20). While b will be very small in web login scenarios, large b may be plausible in automated authentication scenarios.
This can obviously be defeated by choosing a sufficiently large t.
EDIT: (I'm not 100% certain about all the math. Corrections are welcome.)
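Same toy simulation, adapted to the amended protocol (again with tiny parameters of my own choosing):

    import random

    t, b = 24, 8                     # toy sizes

    # Step 1: per observed authentication, Eve keeps both signed values R and R^N.
    observed = set()
    for _ in range(2 ** b):
        N, R = random.getrandbits(t), random.getrandbits(t)
        observed.add(R)
        observed.add(R ^ N)          # 2^(b+1) signed values in total

    # Step 2: for Bob's fresh N, look for any observed pair (X, X^N).
    attempts = 0
    while True:
        attempts += 1
        N = random.getrandbits(t)
        if any((X ^ N) in observed for X in observed):
            break                    # replay S(X) and S(X^N) for that pair
    print(attempts)                  # roughly 2^(t - 2(b+1)) here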
That's a very cool attack. However, the client can sign R and R^N together, which I think defeats it. I didn't describe it as such because I was afraid "S(R+R^N)" may cause some confusion, and I ran some numbers and found the security level acceptable for 128-bit nonces and a few billion logins.
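Something like this, with "together" meaning one signature over the concatenation (my reading of it):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    key = Ed25519PrivateKey.generate()
    N = os.urandom(16)               # server nonce
    R = os.urandom(16)               # client randomness
    msg = R + xor(R, N)              # "S(R + R^N)": both halves under one signature
    sig = key.sign(msg)

    # Server: split the message, check the halves XOR back to N, verify the signature.
    key.public_key().verify(sig, msg)
    assert xor(msg[:16], msg[16:]) == N

Since the two halves are bound into a single signed message, Eve can no longer mix values observed in different sessions.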
And keep in mind this is not a complete authentication protocol. I'm just proposing a small change to fix the signing-not-random-nonce problem, so I assume the rest of the protocol context is kept.
Barring stupid mistakes on my part, if this is vulnerable to impersonation, so is the article's.
If Alice doesn't validate the identity of the server, that's absolutely a legitimate attack. But building a secure channel is out of scope and assumed solved here. The problem is how can Alice authenticate to Bob without exposing herself to signing arbitrary documents.
To be clear, Alice is intentionally authenticating to Eve in this scenario. So a secure channel does not impede the attack. For example, suppose Eve is Facebook and Bob is Google. Alice logs into Facebook, but what she doesn't know is that Facebook is passing her signatures directly through to Google. Now Mark Zuckerberg can like YouTube videos on Alice's behalf.
To prevent this, Alice must also sign the domain she's authenticating to. The article mentioned this, and maybe you had meant it to be implicit in your protocol, in which case: my bad.
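Concretely, that would just mean prefixing the signed message with the relying party's identity, e.g. (the exact framing with a NUL separator is my own):

    import os
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    key = Ed25519PrivateKey.generate()
    N, R = os.urandom(16), os.urandom(16)

    # Bind the target domain into the signed message, so a signature produced
    # while "logging into Facebook" is useless when relayed to Google.
    msg = b"google.com" + b"\0" + R + xor(R, N)
    sig = key.sign(msg)

    # Google checks that the embedded domain is its own before accepting anything.
    key.public_key().verify(sig, msg)
    assert msg.split(b"\0", 1)[0] == b"google.com"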
Well, yeah, but as BoppreH said, that's still present in the original article's design. What BoppreH's solution doesn't allow (which is the problem the article raises) is the tricking of Alice into signing arbitrary data.
Imagine that the "check" protocol has a similar requirement: that the client xor the check with their own random value. Now we have the same problem again, as your signed message looks just like a valid check. You can make the attack increasingly unlikely by mutating your protocol in ways that you don't expect any other protocol also does, but this seems like security by obscurity unless done as part of an agreed standard which other protocol authors follow.
If the "check" protocol works like this, yes, we have a problem. But I think there are two very distinct types of "signatures" here: binding statements and proofs of possession. Because the client nonce was explicitly added so we don't sign intelligible data during proof of possession, I don't see why the "check" protocol, which is a binding statement, would have a similar requirement.
I agree completely. But it's an improvement, which shows how bad the current state is, and strictly better than relying on just standardization or caution.
(By the way, if you still think it's broken, I'm happy to improve it. "Wrestling with a pig"...)
Sure, as I said, any change to the protocol that "seems unlikely" to collide with any other protocol makes an attack less likely. But I think an easier way to accomplish that, compared to your proposal, would be to add a context string prefix. E.g. require the signed message starts with "RFC 123456 authentication protocol\0" or something. The article has a link to Adam Langley suggesting this for some uses of signatures in TLS. In comparison, the client-chosen XOR introduces a construct that seems like it could have subtle cryptographic implications, so seems riskier.
Of course, the context string is also vulnerable to, say, a colliding protocol that happens to ignore leading strings (which is not unheard of). So ideally you specify that "this key shall only sign messages that begin with a context string", and then you're pretty solid. This is effectively saying "the type of things signed by this key are arbitrarily-typed messages prefixed with a NUL-terminated human-readable string identifying their type", which satisfies my "only sign unambiguously-typed data" rule. :)
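For example, something along these lines (the context string here is a placeholder, and the wrapper just makes the "only sign messages that begin with it" rule explicit):

    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    CONTEXT = b"example authentication protocol v1\0"   # placeholder context string

    def sign_challenge(key, challenge: bytes) -> bytes:
        # This key only ever signs messages that start with the context string.
        return key.sign(CONTEXT + challenge)

    def verify_challenge(pub, sig: bytes, challenge: bytes) -> bool:
        try:
            pub.verify(sig, CONTEXT + challenge)
            return True
        except InvalidSignature:
            return False

    key = Ed25519PrivateKey.generate()
    challenge = os.urandom(32)
    assert verify_challenge(key.public_key(), sign_challenge(key, challenge), challenge)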
(If you're referring to my edit earlier: I had previously thought I saw an attack that wasn't there, and had noticed other people claiming brokenness, so I assumed they were pointing it out. But then I realized the problem I imagined wasn't there. Sorry about that.)
That's a nice alternative. Though generating random data and XOR'ing is as well-known and as primitive as you can get, and none of it is ever reused, so I have no preference for one over the other.
(if you are thinking "why not just sign the hash of the nonce without bothering with a key?", that's exactly how we sign documents, because they are long. You could sign the hash of the hash to avoid this conflict, but that's a hack)
This was the reason that made me reject both protobuf and thrift for my own project. I ended up developing my own protocol, which has to guarantee a normalized stream: there is only one way of encoding a given set of data, and any other way causes an error in the parser.
It can be embedded into streams, but it has a "strict" flag that forces the parser to throw an error if unexpected data is found in the stream. Optional tags not specified in the schema simply cannot be there, and all the tags must be encoded in a specific order.
I'm still looking for a simple protocol that provides a normalized representation that is always the same for the same set of data; I hate developing my own things and prefer to steal ready-made ones :-)
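To be concrete about the property I'm after, here's a toy example (not my actual protocol): fields sorted by tag, no unknown tags, so a given set of data has exactly one valid byte representation.

    SCHEMA = {1: "user", 2: "expires"}                 # the only tags allowed

    def encode(fields: dict) -> bytes:
        out = b""
        for tag in sorted(fields):                     # canonical order: by tag
            if tag not in SCHEMA:
                raise ValueError("unknown tag %d" % tag)
            value = fields[tag]
            out += bytes([tag]) + len(value).to_bytes(2, "big") + value
        return out

    def decode(data: bytes) -> dict:
        fields, i, last_tag = {}, 0, 0
        while i < len(data):
            tag = data[i]
            if tag not in SCHEMA or tag <= last_tag:   # strict: membership and order
                raise ValueError("non-canonical stream")
            length = int.from_bytes(data[i + 1:i + 3], "big")
            fields[tag] = data[i + 3:i + 3 + length]
            i += 3 + length
            last_tag = tag
        return fields

    blob = encode({2: b"2038-01-19", 1: b"alice"})
    assert encode(decode(blob)) == blob                # round trip is byte-identical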
You may also want to consider cbor: http://tools.ietf.org/html/rfc7049 It doesn't require canonicalization, but it has a suggested format for canonicalization, and libraries I've been using have made it reasonably possible to force field ordering.
Though this may be something of a tangent, since as kentonv says, canonicalization is only part of the dance; it all depends on what other actors do as well. :)
While requiring canonicalization may make the attack harder, I don't think it eliminates the problem entirely. You can define a canonical form of protobufs by simply saying that fields must appear in numeric order and unknown fields are not allowed. But, there's no reason you couldn't have a canonical protobuf that also looks like an ASCII message, you'd just have to think harder when constructing it (something I didn't care to do for this blog post :) ).
One problem with this article (that does not invalidate the main point that you should not reuse keys for different protocols) is using SSH authentication as an example.
In SSH, what gets signed in publickey (and hostkey) authentication is not controlled by the server (or the client). Instead of some random value provided by the other side, the signed data includes a session id that is derived from the result of the first DH key exchange and is thus unpredictable to either side of the connection.
> the signed data includes a session id that is derived from the result of the first DH key exchange and is thus unpredictable to either side of the connection
The unpredictability of DH output depends on the group parameters and public key validation performed by the peer. If the group contains small subgroups, Eve can send an element of small order and predict the DH output with high probability.
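For a classic safe-prime group p = 2q + 1 this validation is cheap; a toy sketch with tiny numbers (not taken from any RFC):

    p, q, g = 23, 11, 4            # 23 = 2*11 + 1; g = 4 generates the order-q subgroup

    def valid_peer_value(y: int) -> bool:
        if not (1 < y < p - 1):    # rejects 0, 1 and p-1 (order-1 and order-2 elements)
            return False
        return pow(y, q, p) == 1   # y must lie in the prime-order subgroup

    assert not valid_peer_value(1)          # would force the shared secret to 1
    assert not valid_peer_value(p - 1)      # order-2 element: secret is 1 or p-1
    assert not valid_peer_value(5)          # outside the order-q subgroup
    assert valid_peer_value(pow(g, 7, p))   # honest public values pass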
1) The DH group is negotiated by both peers from a fixed list.
2) The output of the key exchange is hashed together with various values that are not all controlled by the same side of the connection.
Also, after reading the relevant part of RFC 4253, I've found that the session id is conceptually an output of the key exchange, but in the DH case it's derived from the contents of the key exchange messages and not from the result itself (not that it makes much of a difference).
While having the message be server-specified makes an attack easier, it is not necessary. Ultimately, all you need is two protocols that can be mistaken for each other. SSH's session ID appears as the first few bytes of the signed message, so all you need is to find a protocol which happens to ignore those bytes or otherwise lets the attacker specify them. This is not unusual -- SSH itself, IIRC, will ignore leading "comment" lines at the start of a connection (though that is not in a context that gets signed; the point is that ignoring a prefix is not uncommon).
Yes, there's complete symmetry between signing and encrypting, thus your proposal is exactly equal to the article's.
The solution is much simpler. Do not sign text that you didn't create. You create a random timestamped text, encrypt it, and send both to the server.
Or use the Diffie-Hellman key exchange, which as a bonus isn't vulnerable to man-in-the-middle attacks (which both the article's algorithms and the one I just wrote are).
A timestamp doesn't prevent replay attacks, it just time-limits them. Also, requiring time sync tends to be problematic -- many people have incorrect clocks. Finally, just because your protocol uses a timestamp at some position doesn't mean some other protocol won't interpret those bits in some other way. Removing server-controlled bits makes an attack harder but not necessarily impossible.
I assumed the "time is now" was meant to prevent replays. But yes, it's great if you can establish a secure channel first with a session key generated based on randomness provided by both client and server and then, once the session is up, send a signature of the session key, signed by your long-term key. No timestamp needed, then (I think). (Insert obligatory reminder not to roll your own crypto.)
But this is beside the point. You still have the problem that you are signing a thing, and if the signing key isn't restricted to signing only this type of thing, there is a risk that the thing you are signing could have a different meaning when interpreted as a different type. Even if the client fully-controls the content of the thing to be signed, this is still possible. Perhaps you can dismiss it as "unlikely", but I'm not sure I'm comfortable doing so.
Any scheme that protects against MITM is used to establish the baseline encrypted channel. Then what follows is used to authenticate the client, right?
The message you generate will only work for the server you generate it for, and likewise, you must ensure you are securely delivering the message to the correct server. It's often useful to treat these as two distinct problems solved at different layers in the stack (1) establish secure channel to shared public server, (2) authenticate.
Channel security isn't the important thing: once you've securely delivered the authentication to the server, can that server use it to authenticate somewhere else?