Go 1.3 Linker Improvements (cheney.net)
171 points by signa11 on May 22, 2014 | 27 comments


So on most real world applications, you shave perhaps a second or two off compilation and linking times.

That's great, but... why does Go still have such an abysmal regexp package that is an order of magnitude slower than C and C++?[1]

Why does encoding/json still rely on reflection, making it slow compared to other approaches?[2]

In other words, I guess I feel like there are areas with more bang for the buck. The changes in 1.3 are great (100% precise garbage collection!), but I'm hoping improvements to regexp and encoding/json are somewhere on the roadmap as well.

[1] http://benchmarksgame.alioth.debian.org/u64q/performance.php...

[2] https://journal.paul.querna.org/articles/2014/03/31/ffjson-f...


I grant that your questions are probably rhetorical, but I'll take a shot at them anyway.

While I was writing Rust's regex crate, I became very familiar with Russ Cox's work on implementing a regex engine with worst case O(mn) time complexity (m ~ regex and n ~ search text). As far as I know, he has two implementations of these ideas: RE2 (C++) and Go's regexp package. RE2 is extremely sophisticated. It includes a full NFA simulation (known as the "Pike VM") that can find submatches, and it also includes a DFA that avoids repeating work (but cannot find submatches). The DFA is very very fast (and I believe is multithreaded). The NFA simulation isn't as fast and is only used when necessary.

Go's regexp package only includes a full NFA simulation. There is no speedy DFA. My guess is that this is the primary reason why it is slow. (I will, however, note that Go 1.3 will include a reasonably large optimization which introduces a "one-pass" NFA simulation. It's used when it can be proven that the regex will never need to backtrack. Rob Pike posted some benchmarks on golang-dev a while back, and they showed a perf improvement in the regexp package. So they at least seem to be working on things that you care about.)

Speaking from a higher level, I'd say that the raw performance of regexes is probably not a huge priority. Needing wickedly fast regexes seems like a special case to me, in which case you might just use an existing library via cgo. (And indeed, the benchmarks game includes some Go entries that use the PCRE C library, which has more reasonable performance.)

For more details on how it all works, Russ has written an amazing series of articles on the topic: http://swtch.com/~rsc/regexp/
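As a concrete illustration of the linear-time guarantee (a sketch of my own, not taken from those articles), here is the classic pathological pattern that makes backtracking engines blow up but that Go's regexp handles quickly:

  package main

  import (
    "fmt"
    "regexp"
    "strings"
    "time"
  )

  func main() {
    // Pathological case from the swtch.com articles: n optional 'a's
    // followed by n literal 'a's, matched against a string of n 'a's.
    // A backtracking engine takes exponential time on this; Go's NFA
    // simulation stays roughly proportional to len(pattern)*len(input).
    n := 30
    re := regexp.MustCompile(strings.Repeat("a?", n) + strings.Repeat("a", n))
    input := strings.Repeat("a", n)

    start := time.Now()
    fmt.Println(re.MatchString(input), time.Since(start)) // true, well under a second
  }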

> Why does encoding/json still rely on reflection, making it slow compared to other approaches?

Using reflection makes encoding/decoding JSON extraordinarily convenient. I do the same with TOML[1], although admittedly, performance is less of an issue there. My suspicion is that the added complexity for when you need raw speed isn't common enough to be in the standard library. megajson[2] tries to fill this hole, but it's a code generation tool, so you may find that less than ideal.

[1] - https://github.com/BurntSushi/toml

[2] - https://github.com/benbjohnson/megajson
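To show what that convenience looks like, here's a minimal sketch (mine, not from the thread; the Config type and its fields are made up). The struct tags are all you write; the reflection-based decoder does the rest at runtime:

  package main

  import (
    "encoding/json"
    "fmt"
  )

  // Config is a made-up example type; the struct tags are what the
  // reflection-based decoder walks at runtime.
  type Config struct {
    Addr    string `json:"addr"`
    Workers int    `json:"workers"`
  }

  func main() {
    var c Config
    if err := json.Unmarshal([]byte(`{"addr": ":8080", "workers": 4}`), &c); err != nil {
      panic(err)
    }
    fmt.Printf("%+v\n", c) // {Addr::8080 Workers:4}
  }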


JSON is inherently slow to parse. Even with the code generation of megajson, you get at best a 2x speedup. The megajson package has microbenchmarks for all the individual parsing tasks, so you can get an idea of where things are slow. Here are the benchmarks from my MacBook Pro:

https://gist.github.com/benbjohnson/6380fab7cac7533115e5

JSON is slow because of (at least) two reasons:

1) Strings are unframed - You have to check every single byte to see if it's a double quote to end the string. Compare this to MessagePack or Protobufs where the encoding states the exact length of the string. Those are faster because you can simply memmove() it.

2) Numbers require base 10 conversion - Numbers in JSON are typically only a couple bytes long. So for every 1 - 10 bytes you have to do an atoi(). You may have to do that a couple hundred thousand times in a megabyte of JSON. Numbers are also unframed so they suffer from the same "check-every-byte" issue as strings.

I feel like the encoding/json package actually does a great job in balancing parsing speed and convenience.
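To make the framing point concrete, here's a rough Go sketch (mine, not from megajson; the 4-byte length prefix is just a stand-in for what MessagePack or Protobufs do) comparing the two kinds of string read:

  package main

  import (
    "encoding/binary"
    "fmt"
  )

  // readFramed assumes a made-up length-prefixed encoding: a 4-byte
  // big-endian length followed by that many bytes. One length read,
  // one copy, no per-byte inspection.
  func readFramed(buf []byte) (string, []byte) {
    n := binary.BigEndian.Uint32(buf[:4])
    return string(buf[4 : 4+n]), buf[4+n:]
  }

  // readJSONString scans byte by byte for the closing quote (escape
  // handling omitted for brevity), i.e. the "check-every-byte" cost
  // described above.
  func readJSONString(buf []byte) (string, []byte) {
    for i := 1; i < len(buf); i++ { // buf[0] is the opening '"'
      if buf[i] == '"' {
        return string(buf[1:i]), buf[i+1:]
      }
    }
    return "", nil
  }

  func main() {
    framed := append([]byte{0, 0, 0, 5}, "hello"...)
    s1, _ := readFramed(framed)
    s2, _ := readJSONString([]byte(`"hello"`))
    fmt.Println(s1, s2) // hello hello
  }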


3) JSON's object encoding is very fluffy and redundant. If one has an array of objects each of which all have the same set of keys (a very common use case), the keys must be spelled out each time, with double-quotes and other punctuation around them, for each usage.

There's a result from elementary complexity analysis which many have probably forgotten: the time complexity of an algorithm is lower-bounded by its space complexity. If an algorithm uses O(n^2) space, it must take at least O(n^2) time, because that's how long it takes to even look at O(n^2) space once. Similarly, by being so fluffy, JSON incurs fundamental speed penalties that cannot be recovered, simply by virtue of being so physically large.
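Spelled out (my paraphrase, in the usual unit-cost model where each step touches at most a constant number of memory cells):

  T(n) \ge c \cdot S(n) \text{ for some constant } c > 0,
  \quad\text{so}\quad S(n) = \Omega(n^2) \implies T(n) = \Omega(n^2).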

JSON is convenient and quite cross-platform. This carries it to a lot of places, quite justifiably. However, if speed is a concern, JSON is often a poor choice. It is literally impossible, no matter how clever you get, to write a JSON decoder that will be as fast as Protocol Buffers can be, because you can never overcome the fact that the JSON decoder has to look at more bytes. (No matter how cleverly you encode your JSON such that the result is a legal JSON file in the end, with key compression or whatever, other serialization methods can be just as clever, but then don't have JSON restrictions on how bytes are spent.)


  > the time complexity of an algorithm is lower-bounded by 
  > the space complexity
Just wanted to say that this is an awesome result that I hadn't considered before. Very cool.


It's the rare case of a theoretical result whose theoretical impact is fairly marginal but whose practical impact is enormous. When you set out to seriously optimize something, you will eventually hit a point where this starts to become a serious consideration. Serialization is a particularly vivid and easy-to-see example, but sooner or later you'll hit it in any serious optimization effort.


For the benefit of others, here is burntsushi's regex crate for Rust (compile time regexes included): http://static.rust-lang.org/doc/master/regex/index.html


And at present it is NFA-only, no DFA fast path.

But among NFA implementations, the compile-time regular expression engine is unbeatable.


> So on most real world applications, you shave perhaps a second or two off compilation and linking times.

It's important to note that these linker changes are just a single step as part of a long-term plan to translate the Go compilers from C to Go. Once the compiler is written in Go it will be much easier to improve both compilation speed and the performance of the generated code.

The plan: https://docs.google.com/a/google.com/document/d/1P3BLR31VA8c...


That's an important point that I didn't take into consideration. Thanks!


No idea about the regexp performance (it's been adequate for my uses). For encoding/json it's probably slow because nobody at Google gives two shits about JSON; they use protocol buffers for everything. Someone from outside Google would have to contribute encoding/json performance improvements if they need them.


That's not true. We do care about JSON encoding/decoding performance, and encoding/json is pretty decent. It's just not as blazing fast as some other implementations. Our focus has been on providing a convenient API that's suitable for most uses, and that's what we have achieved IMO.

There are alternatives for people who really need the extra speed, as burntsushi mentioned.


Go uses a different regex algorithm that doesn't fall over in corner cases. See:

http://swtch.com/~rsc/regexp/regexp1.html

And Russ's comment here: https://groups.google.com/forum/m/#!topic/golang-nuts/6d1mae...

EDIT: I shouldn't answer questions I don't know the full answer to.


The comment does not accurately reflect the state of the benchmarks today. (I do not know whether it accurately reflects the state of the benchmarks at the time of posting.) While some of the implementations are backtracking, the #2 slot is currently held by a C++ implementation that uses RE2:

https://code.google.com/p/re2/

which also provides the linear guarantee.

Edit: Apparently I forgot that RE2 was also written by Russ Cox, so he should know well that it provides better performance with the same guarantee.


> [...] I guess I feel like there are areas with more bang for the buck [other than the linker].

Well, for a roughly five-year-old language, the linker and the compiler are presumably more important than the standard library's regexp engine or json package.


It's open source, so if these things are a priority for you then you should definitely consider improving them yourself.


Regular expressions are used less frequently in Go than in many popular languages (esp. dynamic ones). Often the "strings" package suffices. I almost never import "regexp". So yea. As people have said, I don't think it's a high priority.


> Often the "strings" package suffices.

Suffices for whom and for what work? Maybe it suffices for you. This is not a dynamic vs static language issue -- C and C++ users use regexes all the time. It's a matter of business needs (e.g. what you need your program to do).


True, but there's also another factor: in Go it's easy to write string manipulation functions that are fast, while in most dynamic languages some regular expressions would run faster than your hand-written routine.
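As a hedged illustration (my own example, not from the thread): a hand-rolled check with the strings package standing in for a regex like ^/images/[^/]+\.png$ (both the path shape and the function name are made up):

  package main

  import (
    "fmt"
    "strings"
  )

  // hasImagePath reports whether s looks like "/images/<name>.png",
  // using only the strings package.
  func hasImagePath(s string) bool {
    if !strings.HasPrefix(s, "/images/") {
      return false
    }
    rest := strings.TrimPrefix(s, "/images/")
    return strings.HasSuffix(rest, ".png") && !strings.Contains(rest, "/")
  }

  func main() {
    fmt.Println(hasImagePath("/images/logo.png")) // true
    fmt.Println(hasImagePath("/static/logo.png")) // false
  }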


Perhaps, but I'm not 100% sure about that without some profiling.

For one, Go tends to offer string stuff implemented in Go, whereas dynamic languages usually delegate string operations to plain old C, which can be faster (especially when Unicode is added into the picture).



I think it's a matter of the priorities of the language developers. <I've removed my thoughts on regex here because others have posted with more knowledge on the subject> As for runtime reflection, it's a very handy tool which allows for quick development iteration. The page you link generates more efficient Go code, but it comes at the price of being more brittle. I've used the standard library json package for things like reading a config file, where the time difference isn't noticeable. One size doesn't fit all, and if your application really needs to parse lots of json quickly, then a more rigid approach may be in order.


> So on most real world applications, you shave perhaps a second or two off compilation and linking times.

Which might be a significant part of the overall build time, if it takes 5-8 seconds to build.


In my experience, it's not hard to have the same basic compile-run-test-fix cycle with Go that you get with interpreted languages. Shaving a second off that may not save me a lot of literal clock time, but it makes Go that much more pleasant to use, and therefore that much less likely that I'll take unfortunate shortcuts. I haven't gotten up to a truly large Go code base (the whole shebang right now is only about 25,000 lines), but the full save-test cycle is still comparable to what I get with a short Python script.

(Though as I like to point out occasionally, if you sample a lot of languages in the world, real-world C and C++ programs have exceptionally long compile times due to ancient design decisions. Most other compiled languages are actually quite fast to compile. Go is striving to be fast even so, but when benchmarked against something other than C or C++ it seems less spectacular.)


> (Though as I like to point out occasionally, if you sample a lot of languages in the world, real-world C and C++ programs have exceptionally long compile times due to ancient design decisions. Most other compiled languages are actually quite fast to compile. Go is striving to be fast even so, but when benchmarked against something other than C or C++ it seems less spectacular.)

Nowadays, especially with unity builds, a lot of the reason that C and C++ (especially C) are slow to compile is actually that optimization and codegen take a while. Most languages use JITs, so their compile steps are much faster. With the 6g/8g compilers, Go doesn't really do the same kind of optimization that e.g. GCC and LLVM do (SSA conversion, GVN, SCCP, LICM, aggressive algebraic simplifications, etc.), so much of its build speed comes from that.

In my experience, there's no free lunch: fast compiles come at the expense of runtime performance, and runtime performance comes at the expense of fast compilation.


I guess that the more expensive optimization passes the Go compiler introduces, the more it makes sense to ensure that those passes run only once (i.e. when the source is compiled) and not many times (i.e. when the whole binary is "linked", which in the Go toolchain up to 1.2.x includes a translation from pseudo-instructions to actual instructions).

TBH I don't know how many of those optimizations are currently doable before emitting pseudo-instructions, but I guess that moving the actual code generation into liblink makes sense nevertheless.


It's a 30% reduction in link times, which translates into a couple of seconds for many projects. However, Go was built so developers could build those million-line apps.



