Sure, if you like ingesting 4GB records. There is nothing inherently safer in binary formats. It's easy to write parsers that handle properly formatted files; it's when you're dealing with corrupt or malformed files that everything gets complicated.
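To make the 4GB point concrete, here is a minimal sketch of a length-prefixed record reader that refuses oversized or truncated input. The framing (a 4-byte big-endian length followed by the payload), the `read_record` name, and the 1 MiB `MAX_RECORD_LEN` cap are all assumptions for illustration, not taken from any particular format. It also assumes a buffered stream whose `read(n)` returns fewer than `n` bytes only at end of input.

```python
import io
import struct

# Hypothetical cap; a bare 32-bit length prefix would otherwise allow ~4 GiB.
MAX_RECORD_LEN = 1 << 20


def read_record(stream, max_len=MAX_RECORD_LEN):
    """Read one record framed as a 4-byte big-endian length plus payload.

    Returns the payload bytes, or None on a clean end of stream.
    Raises ValueError on truncated or oversized input.
    """
    header = stream.read(4)
    if len(header) == 0:
        return None  # clean end of stream, no partial record
    if len(header) < 4:
        raise ValueError("truncated length prefix")

    (length,) = struct.unpack(">I", header)
    if length > max_len:
        # Check the declared length before allocating or reading anything.
        raise ValueError(f"record length {length} exceeds cap {max_len}")

    payload = stream.read(length)
    if len(payload) < length:
        raise ValueError("truncated payload")
    return payload
```

Most of the function is error handling; the happy path is three lines. That is roughly the point being made about malformed input.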
> There is nothing inherently safer in binary formats.
Sure there is. Barring a pathologically bad wire format design, they're easier to parse than an equivalent human-editable encoding.
Dropping the requirement that the format be human-editable also lets us:
- Avoid introducing character encoding — a huge problem space just on its own — into the list of things that all parsers must get right.
- Define non-malleable encodings; in other words, ensure that there exists only one valid encoding for any valid message, eliminating parser bugs that emerge around handling (or not) multiple different ways to encode the same thing.
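As a rough illustration of the non-malleability point, consider unsigned LEB128-style varints: a lenient decoder accepts both `00` and `80 00` as zero, so the same value has several encodings. The sketch below rejects the padded forms so each value has exactly one accepted encoding. The function name `decode_uvarint_strict` and the `max_bytes` cap are hypothetical choices, not part of any specific format's specification.

```python
def decode_uvarint_strict(data: bytes, max_bytes: int = 10):
    """Decode an unsigned LEB128-style integer from the start of `data`.

    Returns (value, bytes_consumed). Raises ValueError if the encoding is
    truncated, longer than max_bytes, or not the shortest possible form.
    """
    value = 0
    for i, byte in enumerate(data[:max_bytes]):
        # Low 7 bits carry payload, least significant group first.
        value |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:  # high bit clear: this is the last byte
            if byte == 0 and i > 0:
                # A trailing zero group adds nothing, e.g. b"\x80\x00" for 0.
                raise ValueError("non-minimal varint encoding")
            return value, i + 1
    raise ValueError("truncated or overlong varint")
```

With this check, `b"\xac\x02"` decodes to 300, while `b"\xac\x82\x00"` (the same value with a padding byte) is rejected instead of being silently accepted.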
> Define non-malleable encodings; in other words, ensure that there exists only one valid encoding for any valid message, eliminating parser bugs that emerge around handling (or not) multiple different ways to encode the same thing.
I've said similar things before. E.g. if you want a boolean, there's nothing simpler and less error-prone than a single bit. It represents exactly the values you need; nothing more and nothing less. If you didn't want to pack bits, you could take a whole byte and use the "0 is false, nonzero is true" convention, which is naturally usable in a lot of programming languages; that way there are 256 different values, but the set of inputs is still small and finite, with each one having a defined interpretation.
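A minimal sketch of the byte-sized boolean, with hypothetical function names. `decode_bool_lenient` follows the "0 is false, nonzero is true" convention, so every one of the 256 byte values has a defined meaning. `decode_bool_strict` is an alternative I'm adding for comparison, not something proposed above: it accepts only 0 and 1, which trades the total mapping for a single valid encoding per value.

```python
def decode_bool_lenient(byte: int) -> bool:
    """0 is False, any other byte value (1..255) is True."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a single byte")
    return byte != 0


def decode_bool_strict(byte: int) -> bool:
    """Accept exactly one encoding per value: 0 for False, 1 for True."""
    if byte == 0:
        return False
    if byte == 1:
        return True
    raise ValueError(f"invalid boolean byte: {byte!r}")
```

Either way the whole input space can be tested exhaustively in a loop over `range(256)`, which is not something you can say about a textual `true` / `True` / `TRUE` / `yes` / `1` parser.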