Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Shaving 40% Off Google’s B-Tree Implementation with Go Generics (2022) (scylladb.com)
114 points by 0xedb on Sept 20, 2023 | hide | past | favorite | 47 comments


Hey, Google go b-tree implementation author here. A few important things:

- this implementation was done by me while I worked at Google and needed a good ordered tree. It is not and never was supported by Google, just open sourced by the company and written on company time.

- it now supports generics! Actually, iirc it did almost at the time that this article came out. In go 1.18 and higher, the original API uses a specialization of the generic underneath.


Discussed at the time:

Making Faster B-Trees with Go Generics - https://news.ycombinator.com/item?id=31182645 - April 2022 (44 comments)


I got similar improvements when porting hashicorp/golang-lru[1]

  name         old time/op    new time/op    delta
  2Q_Rand-16     1.22µs ±10%    0.52µs ± 8%   -57.69%  (p=0.000 n=20+19)
  2Q_Freq-16     1.03µs ±10%    0.44µs ±12%   -57.42%  (p=0.000 n=18+20)
  ARC_Rand-16    1.51µs ± 9%    0.70µs ±15%   -53.63%  (p=0.000 n=20+20)
  ARC_Freq-16    1.22µs ±12%    0.54µs ± 7%   -56.20%  (p=0.000 n=20+20)
  LRU_Rand-16     401ns ± 9%     215ns ± 6%   -46.46%  (p=0.000 n=20+20)
  LRU_Freq-16     376ns ±13%     194ns ± 7%   -48.51%  (p=0.000 n=19+20)
[1]: https://github.com/hashicorp/golang-lru/pull/111


This may be confusing to those familiar with Google's libraries. The baseline is the Go BTree, which I personally never heard of until just now, not the C++ absl::btree_set. The benchmarks aren't directly comparable, but the C++ version also comes with good microbenchmark coverage.

https://github.com/google/btree

https://github.com/abseil/abseil-cpp/blob/master/absl/contai...


This feels like such a non-news. They've pretty much created a language which is deliberately dumbed down to be 20 years behind other languages, then they are proud that they could speed it up with a feature that was fairly well-known almost 20 years ago.

Also the obsession with not providing tools for the programmer to use the stack, and rely on compiler internals to not heap-allocate everything.


> This feels like such a non-news. They've pretty much created a language which is deliberately dumbed down to be 20 years behind other languages, then they are proud that they could speed it up with a feature that was fairly well-known almost 20 years ago.

If you want to hate on Go, do it properly: https://en.wikipedia.org/wiki/Generic_programming says

> Generic programming is a style of computer programming in which algorithms are written in terms of data types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973,[1][2] permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplicate code.

> Generics was introduced to the main-stream programming with Ada in 1977 and then with templates in C++ it became part of the repertoire of professional library design. The techniques were further improved and parameterized types were introduced in the influential 1994 book Design Patterns.[3]

If you go with 1973, you get Go being behind the times by a nice round 50 years. If you go with the latest date mentioned, 1994, you still get almost 30 years.

The ML family of languages lives on in eg Haskell and OCaml these days. But Ada was perhaps more mainstream at the time, and 1977 gives you a nice 46 years to roast Go with.

But in any case, while it is tempting to mock Go and its inventors, we should applaud them when they do recognize their mistakes and fix them. Better having generics late than never!

(Btw, from what I can tell by reading their earlier justifications, it seems that the problem was that they only really knew the mess that C++ templates are, and thought all generics were like that. If the choice is between C++ style templates and no generics at all, I can at least understand where they were coming from with their reluctance.

However, given that they must have heard of Java, I'm not quite sure why they thought generics must inevitably be so complicated.

In the end, the Go people wisely got Phil Wadler and friends, who are more at home in the functional programming community, to help them out with the design.)


I didn't do it properly on purpose, because that wouldn't have been fair towards Go. Ultimately, many things which aren't even common yet today have already existed in the 70s.

I have picked 20 years specifically since it was around 2003-2005 when Java and C# got generics. Obviously, they were also more than 20 years late but I compared it to mainstream languages people are more likely to know about or more likely to compare it with.

I wanted to be fair and compare Go to some other mainstream languages which are big today, not niche ones.


It’s so frustrating that every post on go brings out the same tired tropes. It just ruins any sensible discussion on any changes. Why do you do it? Are you trying to troll and actively make HN a worse place because that is for me what you are doing.

If you don’t like go don’t read the posts it’s not hard. Just stop.


Aren’t you doing the same by replying to their comment?


How so?


Because I genuinely don’t know, what do you mean when you say it’s 20 years behind other languages? Or rather, what features is it lacking that would make it up-to-date? And what languages are 20 years ahead of it?


People are upset that Go boringly models actual computers that you can buy and incorporates pretty much no exciting computer science concepts. You get structs, slices, and maps. This basically means, you never get to show off your esoteric computer science knowledge while making whatever you set out to make; all you get at the end of your journey is a working computer program that stands on the merits of what it does and not how it was implemented. This is very dissatisfying to many. They might not have anything in mind to make today, but do enjoy solving a good mystery, and there are many obscure programming languages that scratch that itch. Consider "monad tutorials". Nobody has ever needed to figure out what a monad is for their work, and they certainly aren't used to implement physical computers that you can buy at the computer store, but the high level of abstraction really pleases people. It's so abstract that it's difficult to reason about, and feels more like mathematics than making an automaton do your bidding (which is basically all programming is).

Like, structs, maps, and slices are things you can explain to someone in 5 minutes with no software engineering or computer science background. They map to real-life concepts that people use every day. Do you have an ordered todo list? Ever use a dictionary? Ever look at a driver's license, where every license has the same fields but different information? Congratulations, you now intuitively understand the core data structures in Go code.

There is also no way to golf down code. Want to map the elements of a slice into another slice? That's a for loop. Want to reduce over a slice? That's a for loop. People get super excited about giving each sort of operation like this its own name, map reduce foldr foldl zip... but like many things, the names hide what's actually going on, where as Go just makes you type what you're doing into your editor. This seems to annoy people greatly.

The "it's for dumb people" is riffing off a comment made when Go was first introduced; they were aiming for a language that would be easier for large teams to maintain. People new to the codebase could show up and start being productive. Fixes could be applied en masse across the entire codebase. So that's where that comes from. I guess we contrast it with C++, where one missed reference count increment here, or one stack variable returned from a function there, and you have an impossible to debug nightmare that is blowing up in production. Knowing not to make those mistakes makes you smart, and being new and having the system blow up in your face is a great learning exercise, right? Why should you get to come play with my system the day you graduate from college? You have to earn it! So I think that's where the hate comes from. If the junior engineers can start contributing, you're going to have to do something other than troll HN all day or you'll be rEpLaCeD.

A lot of criticisms of Go are right. You can absolutely type in the most dogshit broken programs imaginable. (Most of them start by overusing the keyword "interface", BTW.) There are lots and lots of traps for the newcomers and even the unwary ("invalid memory address or nil pointer dereference"! what's a nil interface value? why is every goroutine you started in that loop see the same value of the loop variable? your goroutine has no way of exiting, so after 100,000 web requests your app gets OOM killed.) Tons of stuff you have to learn to truly be fluent.

It's missing a lot of input from computer science. Structured concurrency? Software transactional memory? Algebraic data types? Classes? Metaclasses? And, it makes a lot of mistakes whose harm is well-documented. ("Nil in 2023!?") While Go tries to be a simple model of the computer, the simplicity is not free. There is no claim that abstractions are zero-cost. Garbage collection WILL use your CPU. Looking up an element in an array or slice has a bounds check added! That memory you got from the OS wasn't zeroed, so guess who set that variable you just declared to the zero value. You just used a resource and you didn't get to decide. That sort of thing can make a certain type of person furious.

There is also a lot of good. The standard library is actually useful. Using third-party modules is easy, and requires no registration with some intermediary to publish one. (Also fetching modules can't print advertisements. I have no idea if any Go developers drunkenly drove their motorcycle into some pedestrians and now need a job, whereas with Javascript, it tells me that every time I build my app! Frankly, my dear, I don't give a damn. I am in the middle of my workday!) You are only nagged about security vulnerabilities if you actually call into the vulnerable code. The speedy compiler outputs a binary that can run on pretty much any computer of the right OS and architecture. You can trivially cross-compile. Everything that is tedious about the mechanics of making something that people can use largely Go away with Go. You can open up your editor, type in a program, and send it to other people to use. That's really nifty.

At the end of the day, Go isn't the last programming language society will ever develop. But you can make a lot of stuff with it. If you've ever been to an art museum, you might notice that you rarely see an example of some famous artist's pencil. You see their work. That's how programming languages should be; get out of the way, and let your work speak for itself. A lot of people really don't want their work to be front and center, and those people really don't like Go.


This is an amazingly non-succinct write up of why I love programming in Go.

I think it, I type it, it works and it works fast, with minimal cognitive overload.



I would love to answer in detail but right now I cannot.

One really good example though is error handling - they've pretty much looked at C and decided that that was the best we can do, so now every Go program is full of nilchecking boilerplate.

The whole design of the language generally feels like as if it was made for unthinking drones.


> I would love to answer in detail but right now I cannot.

Sorry but this is unacceptable. You post a generic comment hating on a popular language and then when asked for the specifics you bail out because you can't answer in detail for whatever reason. On top of that you then insult the language authors.

That's classic troll behavior, another commenter here asked you to stop, I'm going to do the same and flag your comments in this thread because they're strictly speaking completely off topic.


Hi. Sorry, I couldn't reply yesterday properly due to personal reasons, I am very sorry about that. It's not a generic comment, it directly reacted to the article. The article was about providing a speedup not because they innovated something new, but because they've finally fixed a wart which should never have been in the language in the first place.

I've said it another comment but I'll directly include it here: "I have picked 20 years specifically since it was around 2003-2005 when Java and C# got generics. Obviously, they were also more than 20 years late but I compared it to mainstream languages people are more likely to know about or more likely to compare it with." Generics are a feature Go has lacked for a very long time, and I think it's a bit misleading to claim that including them is some revolutionary thing which the article suggests. ("There are many reasons to be excited about generics in Go" or "The future of Go generics is bright")

Also, I don't see how it is insulting the language authors if I point out (what I think is) a bad design decision they made. They are not horrible human beings or anything for so, I am not attacking them.

I'd like to disagree with the trolling and off-topic assessments. I would understand if it was a tangential comment about some other problems of Go (like nilchecking for example), but I replied directly to the point the article was addressing. I see the argument that it could potentially invite hostile behaviour, but I'd like to believe in the civility of the forum.

Finally, to expand a bit on the "what features Go needs" question of GP, for example it doesn't support most unicode characters deliberately (only letters and digits), proper exceptions, unions, sum types, enums, and the language doesn't care about correctness at all, only about the "common use case" (see the nanotime/time.Time issue for example, they implemented it the worst possible way)


Seems to me they've rightly figured that exceptions were a mistake and C had the right overall idea about error handling all along. Go essentially took C's error handling, such at is is, and improved it one step further, taking it from ~20% complete to ~80%. In contrast, e.g. Rust is probably at ~95%.

---

It does endlessly fascinate me how 20 years after the definitive wisdom against exceptions appeared [1], programmers still keep insisting that seeing all code paths laid out bare on the page is somehow worse than keeping them implicit and hidden.

[1] https://www.joelonsoftware.com/2003/10/13/13/

"I think the reason programmers in C/C++/Java style languages have been attracted to exceptions is simply because the syntax does not have a concise way to call a function that returns multiple values, so it’s hard to write a function that either produces a return value or returns an error."

Effortlessly predicting the direction of future languages like Go/Rust/Zig decades ahead of their time. Excellent taste.


> now every Go program is full of nilchecking boilerplate

The try/throw/catch model is painful and Go opted for errors as values. It makes code more verbose but also much more predictable. Pick your pain.


Makes developers need to at least think about error handling compared to say python. However I find Java’s error handling to be superior, but Java coding generally requires an IDE.


> I would love to answer in detail but right now I cannot.

No problem, we can wait. When will you be able?


> This feels like such a non-news. They've pretty much created a language which is deliberately dumbed down to be 20 years behind other languages, then they are proud that they could speed it up with a feature that was fairly well-known almost 20 years ago.

Sounds like the average iPhone reveal to me.


Every post about Go becomes a pretty even split about why Go is terrible/behind the times/frustrating/obsolete, and why Go is productive/simple/awesome.

Maybe just let people like things?

Interestingly most posts about Rust also become about Go as well.


No, my preference is the One True Preference!


Nobody here talks about PHP. Should we be?



> 1 point


> 20 years behind other languages

What a ridiculous attitude


This feels like such a non-comment.

Which popular GC'd language provide tools to use stack?


C# does... you can stackalloc.

I don't see the incompatibility -stack isn't managed by the GC, so the stack isn't affected by the GC's operations, doesn't need to be pinned or anything

Why can't GC'd languages provide access to the stack?


If you care so deeply about memory access that you want control over stack allocation, you probably care enough that you don't want a GC. What's your use case where the GC doesn't murder you, but you care about better cache locality?


Isn't more explicit control of memory allocation useful precisely in the places where the default behavior would murder you? I don't see the incompatibility. Many python and java programs have portions that are native code for performance reasons (though that's not necessarily just about GC).

What's relatively rare is a program where the whole thing is absolutely performance critical rather than just some parts.


Objects on the stack don't participate in the GC (technically they're GC roots). Half of the reason you want to be able to put variables on the stack is because the stack variables are managed by lifetime and therefore the GC doesn't have to collect them.


Not sure if Julia counts as popular, but Julia.


Common Lisp does in the form of DYNAMIC-EXTENT declaration. Some implementations honor that declaration (it's optional, as it's only an optimization in any case).

Now whether Common Lisp is truly a popular language these days ...


"There are many reasons to be excited about generics in Go."

There were also many reasons to be excited about generics in C++, back in 1998 lol.


And C++ was already 25 years behind the times back then, because ML had generics in 1973.


C++ compilers already had experimental template support in 1992, e.g. Turbo C++ for Windows 3.1.


The headline is dismal. Now writers can’t even be bothered to write what the improvement is? 40% of ______? “…Just trust us, 40% of something is ‘shaved’ …it’s good we promise.”


(2022)


Added above. Thanks!


Another way of seeing it is that "40% of the code became imaginary", in the sense that you have to figure out what the macro is expanding to.

I'm not against generics. I was just wondering myself in a boring afternoon :)


The article is about performance or CPU time, not code size. Per the first paragraph, "In this blog post I’m going to show how, using the generics, we got a 40% performance gain in an already well optimized package, the Google B-Tree implementation."


I’m reasonably sure that running it through godbolt or using a coverage tool would show the (reachable) part of the program is smaller.


> Another way of seeing it is that "40% of the code became imaginary", in the sense that you have to figure out what the macro is expanding to.

The compiler does lots of internal 'expansion' steps in all kinds of different internal formats to spit out machine code at the end. We typically wonder what all those intermediate steps are.

Why should we suddenly care, just because one of those internal formats can be interpreted as being compatible with an earlier version of the Go language? (I don't say, that the internal format _is_ the Go language, because I doubt their internal representation looks like the bytes you have in your .go file.)


In this case I don't think there's really any mystery or hidden code, though.

Functions that operate on collections (or set-like structures) are what generics are most commonly used for, so there shouldn't be much surprise.


Hm but the same argument can be made for anything the compiler does, no? Like I have no idea of how the compiler handles the `Interface`-based approach either.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: