
This is why I rejected both protobuf and Thrift for my own project. I ended up developing my own protocol, which guarantees a normalized stream: there is only one way of encoding a given set of data, and any other encoding causes an error in the parser.

Here it is (still under development): https://bitbucket.org/binarno/goingthere

It can be embedded into streams, but it has a "strict" flag that forces the parser to throw an error if unexpected data is found in the stream. Optional tags not specified in the schema simply cannot be there, and all tags must be encoded in a specific order.

I'm still looking for a simple protocol that provides a normalized representation, one that is always the same for the same set of data; I hate developing my own things and prefer to steal ready-made ones :-)



You may also want to consider CBOR (http://tools.ietf.org/html/rfc7049). It doesn't require canonicalization, but it has a suggested canonical format, and the libraries I've been using have made it reasonably practical to force field ordering.

Though this may be something of a tangent, since as kentonv says, canonicalization is only part of the dance; it all depends on what other actors do as well. :)
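
As a rough illustration of the canonical form RFC 7049 suggests (section 3.9: integers in their shortest encoding, map keys sorted by encoded length and then bytewise), here is a minimal Python sketch. The function names are made up for this example and do not come from any CBOR library, and it only handles non-negative integers, text strings, and maps.

```python
import struct

def encode_head(major, value):
    """Encode a CBOR major type plus argument using the shortest form."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2**8:
        return bytes([(major << 5) | 24, value])
    if value < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    if value < 2**64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", value)
    raise ValueError("value does not fit in 64 bits")

def encode_canonical(obj):
    """Hypothetical helper: canonical CBOR for a small subset of types."""
    if isinstance(obj, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(obj, int) and obj >= 0:
        return encode_head(0, obj)           # major type 0: unsigned int
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return encode_head(3, len(data)) + data  # major type 3: text string
    if isinstance(obj, dict):
        pairs = [(encode_canonical(k), encode_canonical(v))
                 for k, v in obj.items()]
        # Canonical key order: shorter encoded keys first, then bytewise.
        pairs.sort(key=lambda kv: (len(kv[0]), kv[0]))
        body = b"".join(k + v for k, v in pairs)
        return encode_head(5, len(pairs)) + body  # major type 5: map
    raise TypeError("type is outside this sketch")
```

With this, `encode_canonical({"b": 1, "a": 2})` and `encode_canonical({"a": 2, "b": 1})` produce the same bytes, which is the property the parent comment is asking for.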


While requiring canonicalization may make the attack harder, I don't think it eliminates the problem entirely. You can define a canonical form of protobufs by simply saying that fields must appear in numeric order and unknown fields are not allowed. But there's no reason you couldn't have a canonical protobuf that also looks like an ASCII message; you'd just have to think harder when constructing it (something I didn't care to do for this blog post :) ).
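
The canonical-form rule described above (fields in numeric order, no unknown fields) could be checked with something like the following Python sketch. It walks the protobuf wire format directly rather than using the protobuf library; `check_canonical` and `known_fields` are hypothetical names, and allowing the same field number to repeat is an assumption made here to accommodate repeated fields.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")

def check_canonical(buf, known_fields):
    """Raise ValueError unless top-level fields are known and in order."""
    pos = 0
    last_field = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x7
        if field not in known_fields:
            raise ValueError(f"unknown field {field}")
        if field < last_field:
            raise ValueError(f"field {field} is out of order")
        last_field = field
        if wire_type == 0:        # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:      # fixed 64-bit
            pos += 8
        elif wire_type == 2:      # length-delimited
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:      # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
    return True
```

For example, `check_canonical(b"\x08\x96\x01\x12\x02hi", {1, 2})` passes, while the same two fields in the opposite order raise `ValueError`. Passing this check says nothing about whether the same bytes also parse as some other format, which is the point being made here.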



