
> HashMap implements Extend, so just h0.extend(h1) and you're done, the people who made your HashMap type are much better equipped to optimize this common operation.

Are you sure? I'm not very used to reading the Rust stdlib, but this seems to be the implementation of the default HashMap extend [1]. It just calls self.base.extend; self.base seems to be hashbrown::hash_map, and this is the source for its extend [2]. In other words, it does exactly the same thing: it just iterates through the hash map and inserts the elements.

Maybe I'm misreading something going through the online docs, or Rust does the "random seed" thing that Abseil does, but blindly assuming something doesn't happen "because Rust" is a bit silly.

[1]: https://doc.rust-lang.org/src/std/collections/hash/map.rs.ht...

[2]: https://docs.rs/hashbrown/latest/src/hashbrown/map.rs.html#4...


Yes, HashMap will by default be randomly seeded in Rust, but also the code you linked intelligently reserves capacity. If h0 is empty, it reserves enough space for all of h1, and if it isn't then it reserves enough extra space for half of h1, which turns out to be a good compromise.
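
Roughly, that heuristic amounts to something like this (my paraphrase of the hashbrown source linked above, not a verbatim copy):

    use std::collections::HashMap;
    use std::hash::Hash;

    // Paraphrase of the reservation heuristic in hashbrown's Extend impl:
    // if the map is empty, make room for everything the iterator promises
    // (its lower size hint); otherwise reserve roughly half of that as a
    // compromise between over- and under-allocation.
    fn extend_with_reserve<K: Hash + Eq, V>(
        map: &mut HashMap<K, V>,
        iter: impl IntoIterator<Item = (K, V)>,
    ) {
        let iter = iter.into_iter();
        let reserve = if map.is_empty() {
            iter.size_hint().0
        } else {
            (iter.size_hint().0 + 1) / 2
        };
        map.reserve(reserve);
        for (k, v) in iter {
            map.insert(k, v);
        }
    }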

Note that in the worst case we ate a single unneeded growth, while in the best case we avoided N - 1 growths, where N may be quite large.


First of all, as khuey pointed out, the current implementation accumulates values. extend() replaces values instead. It wouldn't achieve the same functionality.
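
For illustration, an accumulating merge with an up-front reserve (so preallocation and accumulation are combined) would look more like this; just a sketch with made-up key/value types, not the benchmark code:

    use std::collections::HashMap;

    // Accumulating merge: values for duplicate keys are summed rather than
    // replaced, and capacity is reserved up front to limit rehashing.
    fn merge_accumulate(dst: &mut HashMap<u64, u64>, src: &HashMap<u64, u64>) {
        dst.reserve(src.len());
        for (&k, &v) in src {
            *dst.entry(k).or_insert(0) += v;
        }
    }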

I tried extend() anyway. It didn't work well. Based on your description, extend() implements a variation of preallocation (i.e. Solution II). However, because it doesn't always reserve enough space to hold the merged hash table, clustering still happens depending on N. I have updated the Rust implementation (with the help of an LLM, as I am not a good Rust programmer). You can try it yourself with "ht-merge-rust 1 -e -n14m", or point out if I made mistakes.

> HashMap will by default be randomly seeded in Rust

Yes, so it is with Abseil. The default Rust hash functions, SipHash in the standard library and foldhash in hashbrown, are ~3x slower than simple hash functions on a pure insertion load. When performance matters, we will use faster hash functions, at least for small keys, and will then need a solution from my post.

> In a new enough C++ in theory you might find the same functionality supported, but Quality of Implementation tends to be pretty frightful.

This is not necessarily so. The Rust libraries are a port of Abseil, a C++ library. Boost is as fast as Rust. Languages/libraries should learn from each other, not fight each other.


> First of all, as khuey pointed out, the current implementation accumulates values. extend() replaces values instead. It wouldn't achieve the same functionality.

Ah! Yes, I apologise. I missed the + in += and I'm not used to a hash table that default-initializes unseen entries (as C++ hash tables all tend to, because the language's native container types behave that way), so I wasn't looking for it.

SipHash will be noticeably slower, no question about it, and so if you need to, and know what you're paying, you can replace the hash, including with integer_hasher, which gives you what you'd likely know from many C++ stdlib implementations: an identity function presented as a hash.
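
For example, a bare-bones identity hasher plugged into std's HashMap might look like this (my own sketch; in practice you'd reach for a crate like integer_hasher or rustc-hash rather than writing it yourself):

    use std::collections::HashMap;
    use std::hash::{BuildHasherDefault, Hasher};

    // An "identity" hasher for integer keys, much like what several C++
    // stdlib implementations ship as std::hash<int>. Only the integer
    // write methods do anything meaningful here.
    #[derive(Default)]
    struct IdentityHasher(u64);

    impl Hasher for IdentityHasher {
        fn finish(&self) -> u64 {
            self.0
        }
        fn write(&mut self, _bytes: &[u8]) {
            unimplemented!("only intended for integer keys");
        }
        fn write_u64(&mut self, n: u64) {
            self.0 = n;
        }
    }

    type FastMap<V> = HashMap<u64, V, BuildHasherDefault<IdentityHasher>>;

    fn main() {
        let mut m: FastMap<&str> = FastMap::default();
        m.insert(42, "answer");
        assert_eq!(m.get(&42), Some(&"answer"));
    }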

> This is not necessary. The rust libraries are a port of Abseil, a C++ library.

More specifically, HashBrown is a port [edited: actually a re-implementation, I think - design->Rust, not C++->Rust] of Abseil's Swiss Tables, and these days Rust's HashMap (and HashSet, of course) use HashBrown. But that's not what I was getting at here.

I was thinking about analogues of Extend (because, as I wrote above, I didn't notice that you're accumulating, not overwriting). Modern C++ has this kind of feature in std::ranges::to, however it doesn't quite have Extend, and as I said the QoI is poor: there are often trivial optimisations that Rust performs where the equivalent C++ means the same thing but isn't optimised.

I am interested in a quite different benchmark for hash tables: rather than merging, I'm interested in very small hash tables. Clearly for two items it will be faster to just try them both, and clearly for a million items trying them all is awful, so I measure a VecMap type (same API as a hash table, but actually just a growable array of unordered key->value pairs, searched linearly) against HashMap and other implementations of this API.

For N=25 VecMap is still competitive, but even at N=5, if we use a very fast hash (such as that identity function) instead of SipHash, we can beat VecMap for most operations. I suspect this sort of benchmark would fare very differently on older hardware (faster memory relative to ALU operations), and the direction of travel is likely to stay the same for the foreseeable future. In 1975, if you have six key->value pairs you don't want a hash table because it's too slow, but in 2025 you probably do.
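
The VecMap I mean is essentially just this (a sketch, not the actual benchmark code):

    // A "map" that is just a growable array of key->value pairs, searched
    // linearly. No hashing at all; for small N the linear scan is cheap and
    // very cache-friendly.
    struct VecMap<K, V> {
        entries: Vec<(K, V)>,
    }

    impl<K: PartialEq, V> VecMap<K, V> {
        fn new() -> Self {
            VecMap { entries: Vec::new() }
        }

        fn insert(&mut self, key: K, value: V) {
            // Overwrite an existing key, otherwise append at the end.
            for (k, v) in &mut self.entries {
                if *k == key {
                    *v = value;
                    return;
                }
            }
            self.entries.push((key, value));
        }

        fn get(&self, key: &K) -> Option<&V> {
            self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
        }
    }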


The tricky part is really point 2 there; that can be harder than it looks (e.g. even simple file I/O can end up hitting network drives). Async I/O can really shine here, though it's not exactly trivial to design async cancellation either.
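
To illustrate what I mean, here is a minimal sketch of the race-against-a-cancel-signal pattern, assuming tokio (my own example, not from the article). Note that dropping the read future does not necessarily abort the underlying OS-level read, which is part of why this is not trivial:

    use tokio::select;
    use tokio::sync::oneshot;

    // Race a file read against a cancellation signal. If the caller fires
    // the oneshot, we stop waiting, but the OS-level read may still be in
    // flight in the background (e.g. hung on a network drive).
    async fn read_with_cancel(
        path: &str,
        cancel: oneshot::Receiver<()>,
    ) -> Option<Vec<u8>> {
        select! {
            res = tokio::fs::read(path) => res.ok(),
            _ = cancel => None,
        }
    }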


Feel free to read the article before commenting.


I’ve read it, and I found nothing to justify that piece of code. Can you please explain?


The while loop surrounds the whole thread, which does multiple tasks. The conditional is there to surround some work completing in a reasonable time. That's how I understood it, at least.


It does not seem so clear to me. If so, it could be stated with more pseudocode. There's also the eventual need for multiple exit points…


Never heard of this, I’m really interested in digging into this paper. Thank you both for the tip!


No, but if you phrase it like "there are multiple correct answers to the question 'I have a list of integers, write me a computer program that sorts it'", that is obviously true. There's an enormous variety of different computer programs you can write that sort a list.


Is the protocol inherently inferior in situations like that, or is this because we've spent decades optimizing for TCP and building into kernels and hardware? If we imagine a future where QUIC gets that kind of support, will it still be a downgrade?


There is no performance disadvantage at the normal speeds of most implementations. With a good QUIC implementation and a good network stack you can drive ~100 Gb/s per core on a regular processor from userspace, with a 1500-byte MTU and no segmentation offload, if you use an unencrypted QUIC configuration. If you use encryption, then you will bottleneck on the encryption/decryption bandwidth of ~20-50 Gb/s, depending on your processor.

On the Linux kernel [1], for some benchmark they average ~24 Gb/s for unencrypted TCP from kernel space with 1500-byte MTU using segmentation offload. For encrypted transport, they average ~11 Gb/s. Even using 9000-byte MTU for unencrypted TCP they only average ~39 Gb/s. So there is no inherent disadvantage when considering implementations of this performance level.

And yes, that is a link to a Linux kernel QUIC vs Linux kernel TCP comparison. And yes, the Linux kernel QUIC implementation is only driving ~5 Gb/s which is 20x slower than what I stated is possible for a QUIC implementation above. Every QUIC implementation in the wild is dreadfully slow compared to what you could actually achieve with a proper implementation.

Theoretically, TCP has a small fundamental advantage from not having multiple streams, which might allow it maybe a ~2x performance advantage when comparing perfectly optimal implementations. But then you are comparing per-core control-plane throughput at a 1500-byte MTU of, by my estimation, ~300 Gb/s for QUIC vs ~600 Gb/s for TCP, at which point both are probably bottlenecking on your per-core memory bandwidth anyway.

[1] https://lwn.net/ml/all/cover.1751743914.git.lucien.xin@gmail...


No, this is very much not the same. The Raku version is like writing this in Python:

    def fibonacci():
        a, b = 0, 1

        while True:
            yield a
            a, b = b, a+b
And taking the 40th element. It's not comparable at all to the benchmark, which deliberately uses an extremely slow method of calculating Fibonacci numbers. For this version, it's so fast that the time is dominated by the time needed to start up and tear down the interpreter.


Your argument is that it's not ok to think of all Jewish people as a monolithic group, and that therefore his statement, where he considered all Arabs as a monolithic group, is "legitimate"? Seriously?

Just like it's not ok to see all Jews as part of the same murderous conspiracy, it's not ok to see all Arabs as part of one either.


You can absolutely save data like that, it's just that it's a terrible idea. There are obvious portability issues: little-endian vs. big-endian, 32-bit vs. 64-bit, struct padding, etc.

Essentially, this system works great if you know the exact hardware and compiler toolchain, and you never expect to upgrade it with anything that might break the memory layout. Obviously this does not hold for Word: it was written originally in a 32-bit world and now we live in a 64-bit one, MSVC has been upgraded many times, etc. There's also the address-space concern: if you embed your pointers, are you SURE that you're always going to be able to load them at the same place in the address space?

The overhead of deserialization is very small with a properly written file format, it's nowhere near worth the sacrifice in portability. This is not why Word is slow.
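
To make the layout point concrete, here is a toy example (my own, nothing to do with Word's actual data structures). The compiler's padding and the pointer width are target-specific, so dumping the raw bytes bakes those decisions into the file format:

    use std::mem::{align_of, size_of};

    // On x86-64 the compiler inserts 7 bytes of padding after `flag` so that
    // `value` is 8-byte aligned, and `next` is 8 bytes; on a 32-bit target
    // both differ. A raw memory dump of this struct is therefore not a
    // portable file format.
    #[repr(C)]
    struct Record {
        flag: u8,
        value: u64,
        next: *const Record,
    }

    fn main() {
        println!(
            "size = {}, align = {}",
            size_of::<Record>(),
            align_of::<Record>()
        );
    }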


Andrew Kelley (the author of Zig) has a nice talk about programming without pointers, allowing ultra-fast serialization/deserialization [0]; a rough sketch of the idea follows after the links.

And then you have things like Cap'n Proto if you want to control your memory layout. [1]

But for "productivity" files, you are essentially right. Portability and simplicity of the format are probably what matter.

[0]: https://www.hytradboi.com/2025/05c72e39-c07e-41bc-ac40-85e83...

[1]: https://capnproto.org/
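
To give a flavour of the "no pointers" idea from the talk (my own toy sketch, not Andrew Kelley's code): nodes refer to each other by index into a flat array, so the whole structure can be written out and read back without any pointer fixups.

    // Nodes refer to each other by index into one flat Vec instead of by
    // pointer, so the structure is position-independent and can be
    // serialized by just writing the Vec out.
    #[derive(Clone, Copy)]
    struct NodeId(u32);

    struct Node {
        value: i64,
        next: Option<NodeId>, // an index, not a pointer
    }

    struct List {
        nodes: Vec<Node>, // the only allocation
        head: Option<NodeId>,
    }

    impl List {
        fn new() -> Self {
            List { nodes: Vec::new(), head: None }
        }

        fn push_front(&mut self, value: i64) {
            let id = NodeId(self.nodes.len() as u32);
            self.nodes.push(Node { value, next: self.head });
            self.head = Some(id);
        }
    }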


That is true; Cap'n Proto and FlatBuffers are excellent realizations of this basic concept. But that's a very different thing from what the commenter is describing Word doing in the 90s: just memory-mapping the internal data structures and being done with it.


Smalltalk is something like that.


It's only a terrible idea because our tools are terrible.

That's exactly the point!

(For example, if Rust could detect a version change, it could rewrite the data into a compatible format, etc.)


At which point you're not just memory-mapping the file. And if the new version changes the size of the object, it doesn't pack into the same place in memory, so you have to repack before saving. Even serializing with versioning is very hard; memory mapping is much worse. Several other comments indicate that I am not the only one with bad experiences here.


And because they're new, we have plenty of homegrown internet retailers for competition. Personally, I avoid Amazon in favor of the others if at all possible; it seems like there's a really significant risk that Amazon is going to wipe them out.


For me it's problematic, though. Sometimes I want some small gadget, and I just can't find it elsewhere. The other day I wanted a female-to-female connector for network cables, and I couldn't find it at my Swedish go-to place for tech stuff, Kjell&Co. I instantly found lots of alternatives on Amazon.


https://www.ebay.ie/sch/i.html?_nkw=network+coupler

eBay might have some of the same problems as Amazon (fly-by-night retailers from China, etc.), but at least it doesn't pretend otherwise.

Or from Denmark, but delivery is probably expensive. (Within-EU delivery costs are something I'd like to see the EU improve, to allow smaller businesses to compete internationally with Amazon etc.)

https://www.computersalg.dk/i/2335209/startech-com-cat5e-rj4...


Yeah, I do this on occasion as well, but I always try the Swedish retailers first, and only go to Amazon when I can't find whatever I'm looking for. It's pretty rare that I have to go to Amazon.

It's rough, because you know that it will probably be cheaper and delivered faster from Amazon; at the moment it is probably the most consumer-friendly place to buy. But you know that once the honeymoon is over and all the Swedish retailers are gone, it's just going to devolve into garbage. Support your non-Amazon retailers!

