
JSON is one of the best things to ever happen to software development.


It really isn't. I'm almost tempted to say it's the opposite, though that would be overstating the case.

It's a format that still bears excessive decoration (what's the purpose of quotes around field names? what are all those commas for?) yet it's limited in the types of data structures that it's able to express (natively). I'm not particularly fond of Clojure specifically but a format like EDN would have been superior in just about every way.


I'd hardly call a few bytes per key and field excessive. Especially compared to something like XML.

The limited data structure complexity is also a pretty significant key to its success. More complex data types mean greater chances for JSON-handling libraries to lack compatibility.

The only substantial shortcomings of JSON I see are shortcomings associated with any textual serialization format. Optimizing for human readability in a use case that's 99.99% of the time not read by a human.


JSON looks good when compared to XML.

That's literally how low you have to go.

"The only substantial shortcomings of JSON I see are shortcomings associated with any textual serialization format. Optimizing for human readability in a use case that's not 99.99% of the time not read by a human."

There are a few other shortcomings that are reasonably substantial/significant, but yeah, that's the gist of the problem.


> JSON looks good when compared to XML.

JSON looks good compared to what we'd be using instead of JSON, which is nothing so nice and structured as XML. The competition to JSON is something infinitely more ad-hoc, probably without a distinct parser, such that a generic library to generate or consume it is impossible, and getting usable error messages is equally impossible.

> "The only substantial shortcomings of JSON I see are shortcomings associated with any textual serialization format. Optimizing for human readability in a use case that's not 99.99% of the time not read by a human."

I agree with this and disagree at the same time: Optimizing for human readability means optimizing for the weird case, the 0.01% (but it seems to be more often than that) of the time you need to go beyond the tools you have to fix something. Saying that's rare is true but inapt: Seatbelts are only used in rare cases, too.


If we didn't have JSON we'd have settled on something like MsgPack.


Your 0.01% case is perhaps true with a binary format, but by using JSON, that 0.01% case just became a whole lot larger. ;-)

It's like, "hey, we came up with a simple way to do it, but to make it easier to deal with this little edge condition that creates complexity, let's significantly up the complexity and number of edge conditions so they're endemic to the space, and then we're all good go".

The irony is, I invariably end up needing to use a computer to help me read JSON anyway.


Even if we agree with your made-up numbers, 0.01% of the time that it's read by a human costs orders of magnitude more than the other 99.99%.


If anything, 99.99% is probably underestimating it, especially for larger companies. If you send 10 million JSON documents per day, have 100 devs, and devs on average inspect one JSON document on the wire per week, you're looking at closer to 99.9999%. And if a binary format saves on average 10ms over JSON, those 10m documents represent slightly over a day of overhead.
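
To make that arithmetic explicit, here's a rough back-of-the-envelope sketch (every number is an assumption from the paragraph above, not a measurement):

  const docsPerDay = 10_000_000;         // JSON documents sent per day
  const devs = 100;
  const inspectionsPerWeek = devs * 1;   // each dev reads ~1 document on the wire per week

  const docsPerWeek = docsPerDay * 7;
  const shareReadByHumans = inspectionsPerWeek / docsPerWeek;
  console.log((1 - shareReadByHumans) * 100);  // ~99.99986% never read by a human

  const savingsPerDocMs = 10;            // assumed per-document saving of a binary format
  const overheadHoursPerDay = (docsPerDay * savingsPerDocMs) / 1000 / 3600;
  console.log(overheadHoursPerDay);      // ~27.8 hours of decode overhead per day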

If you've got good built-in tooling for payload visualization, then you might have minimal overhead to debug from a text-like format. Both protobufs and flatbuffers (not to mention BSON) have good tools that spit out JSON equivalents.
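
For example, here's a sketch of that kind of tooling using the protobufjs library (the .proto file and message type names are made up for illustration):

  import protobuf from "protobufjs";

  // Decode a binary protobuf payload and render it as plain JSON for a human to read.
  async function dumpAsJson(rawBytes: Uint8Array): Promise<void> {
    const root = await protobuf.load("telemetry.proto");  // hypothetical schema file
    const Metric = root.lookupType("telemetry.Metric");   // hypothetical message type
    const message = Metric.decode(rawBytes);
    console.log(JSON.stringify(Metric.toObject(message), null, 2));
  }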


Sure, if you make up numbers, you can argue that the sun is going to crash into the earth tomorrow and we're all going to die, so nothing in this conversation matters.

In reality however, there are some cases where protobufs, flatbuffers, or BSON are superior to JSON, but there are a lot of cases where they aren't. You'll have to weigh the pros and cons for each situation. And a lot of the time, there's not time to benchmark everything, so you kind of have to guess how the elements of the system are going to interact.

The one element that every system has is humans, so it's a fairly safe bet that humans will have to read whatever format you use.

I spend probably 15-45 minutes a day just in Postman, testing JSON calls. If something goes wrong, I'm inspecting requests/responses in Chrome. When we integrate a new team member, we don't have to have them install any tools--they're included in the browser they have installed. We don't have to write any schemas. When I start a new project, I don't have to install any libraries: they're included in my language(s). When we integrate with a partner company, we hand them sample requests/responses as text.

How many JSON documents per day do we have to send to get the payoffs you're claiming?

And my company is not unique: in fact, the stack I'm using is one of the most common stacks on the market.


There's a lot made of the format being human readable, but the actual bytes that fly over the wire aren't what you see on the screen in Postman. You're relying on a tool to extract and render them, and it turns out that regardless of how they're serialized, you can render them the same way.


In theory, sure.

In reality, now, JSON opens in everything from Chrome network inspector to Vim, and protobuffers/flatbuffers don't.


Wait, are you saying the tools you use for JSON don't work for non-JSON data? ;-)

Chrome also decompresses gzip and understands TCP/IP, but it doesn't handle LZMA or SS7.

Sure, thanks to our cult-like following of bad principles, we've made support for a pretty broken stack with lots of terrible consequences ubiquitous. I'd argue that's a bug, not a feature.


So let me get this straight: you think we should all start using tools that may or may not even exist, and if they do exist, would require us to install a bunch of new stuff, write a bunch of schemas, and retrain, all so that we can solve a problem which so far boils down to "if you don't I'm gonna call your tools broken and you cult-like"?

I make a solid income solving problems that people pay me to solve. Why should I abandon that and devote my life to achieving 9% size and 4% availability time increase[1] that no client has ever asked me for?

And to be clear, it's not that I don't care about performance. It's that if my automated tests notice an endpoint loading slowly or if a client complains about performance, I can almost always achieve order-of-magnitude performance gains by optimizing a SQL query or twiddling some cache variables, which almost never happens by switching serialization formats. I have used protobuffers in a few cases, where profiling indicated it as a solution, but this has not been the norm in my experience.

The first optimization is getting it to work, and the second optimization is whatever profiling tells you it is.

[1] https://auth0.com/blog/beating-json-performance-with-protobu...


Yeah, that's really not what I'm saying.

But I think you're right that as long as we stay this course there are going to be more problems, and so you'll be able to make more money solving problems that didn't need to exist.

You're absolutely right about the first optimization being to get it to work. You're just discounting the reality that you're making it far more difficult for that to happen.


I think that's rather missing the point, but it's also not really true, because that ratio is substantially smaller if you talk about all the different bits of hardware that are having to decode the JSON. The relative cost there is extraordinary, let alone the complexity cost.

You can make tools that present data in any format in a way that is easy for humans to digest. Letting a very small and trivial part of the problem drive what is a much larger problem space is pretty flawed.


Has anyone calculated the carbon foot print of parsing JSON?


I read and work with JSON all the time: logs, responses, code generated from JSON data. The format suffers from not being readable because of the quoting. Especially when what you're putting in there has quotes itself, the amount of escaping required is ridiculous.

{"time":"2020-07-22T10:59:14.95406-04:00","message":"{\"level\":\"debug\",\"module\":\"system\",\"time\":\"2020-07-22T10:59:14.953909-04:00\",\"message\":\"Running MetricCollector.Flush()\"}"}

this is a very moderate example of what I deal with daily, all because JSON includes quotes around fields.
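
A minimal sketch of how a line like that comes about: a structured log record is serialized to JSON and then stuffed into the "message" field of an outer JSON envelope, so every quote in the inner document has to be escaped (the field names mirror the example above; the logger itself is hypothetical):

  const inner = {
    level: "debug",
    module: "system",
    time: "2020-07-22T10:59:14.953909-04:00",
    message: "Running MetricCollector.Flush()",
  };

  const outer = {
    time: "2020-07-22T10:59:14.95406-04:00",
    message: JSON.stringify(inner),  // the inner JSON becomes an escaped string
  };

  console.log(JSON.stringify(outer));
  // {"time":"...","message":"{\"level\":\"debug\",\"module\":\"system\",...}"}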


It gets worse when you want to put it into a c-string and you need to escape the quotes and the slashes again.
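
Roughly what that looks like: serializing the already-escaped wire string one more time shows the form you'd have to paste into a C (or JS) string literal:

  // The wire payload from the example above, already containing escaped quotes.
  const wire = '{"message":"{\\"level\\":\\"debug\\"}"}';

  // Escaping it again, as a string literal requires, doubles every backslash.
  console.log(JSON.stringify(wire));
  // "{\"message\":\"{\\\"level\\\":\\\"debug\\\"}\"}"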


exactly, I am excited about new formats like Amazon's Ion though https://github.com/amzn/ion-js


They embedded a JSON string within the JSON itself.

I wonder if it would’ve been better for them to Base64 encode their message. Of course, this itself presents other problems.
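
A rough sketch of that Base64 alternative (assuming a Node.js environment for Buffer): the inner document travels as an opaque token, which avoids the escaping but gives up grepability and readability entirely:

  const inner = JSON.stringify({
    level: "debug",
    message: "Running MetricCollector.Flush()",
  });

  const outer = {
    time: "2020-07-22T10:59:14.95406-04:00",
    message: Buffer.from(inner, "utf8").toString("base64"),  // opaque, but nothing to escape
  };
  console.log(JSON.stringify(outer));

  // Reading it back now requires a decode step first.
  const decoded = JSON.parse(Buffer.from(outer.message, "base64").toString("utf8"));
  console.log(decoded.message);  // "Running MetricCollector.Flush()"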


Yeah, but that 0.01% is developers checking API output and whatnot, and being able to read the output without any tooling (or maybe just a JSON 'prettier' tool) is great.


Quoting, which allows any characters, and the limited set of types are probably among the main reasons it's so widely used and implemented. There are no doubt vastly superior formats tailored to specific languages and purposes, but I think it's hard to argue that JSON was not a huge net positive for the software industry as a whole.


Listing some specific perceived flaws with JSON isn’t counter to the claim that it’s one of the greatest things to happen in software development.


The lack of an integer type really sucks, but that's a failure inherited from JavaScript.
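
The classic illustration, since JavaScript parses every JSON number into an IEEE-754 double:

  const parsed = JSON.parse('{"id": 9007199254740993}');  // 2^53 + 1
  console.log(parsed.id);                 // 9007199254740992 -- silently off by one
  console.log(Number.MAX_SAFE_INTEGER);   // 9007199254740991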



