That's a cool find. I wonder if LLVM also does the reverse operation, where it pattern-matches handwritten CAS loops and transforms them into native ARM64 instructions.
That's a very good question. A proper compiler engineer would know, but I will do my best to find something and report back.
Edit: I could not find any pass with pattern matching to replace CAS loops. The closest thing I could find is this pass: https://github.com/llvm/llvm-project/blob/06fb26c3a4ede66755... I reckon one could write a similar pass to recognize CAS idioms, but its usefulness would probably be rather limited and not worth the effort/risks.
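(For readers who haven't written one: this is roughly the sort of handwritten CAS loop being discussed - a fetch-and-add built from compare-and-swap with C11 atomics. A minimal sketch of my own with made-up names, not code from the article; the question is whether a pass could collapse the whole loop into a single atomic RMW, e.g. an ARM64 LSE ldadd.)

    #include <stdatomic.h>

    /* Hand-rolled fetch-and-add via a CAS retry loop (illustrative sketch).
       A hypothetical idiom-recognition pass would have to prove this is
       equivalent to atomic_fetch_add and emit one RMW instruction instead. */
    int fetch_add_cas(_Atomic int *p, int n) {
        int old = atomic_load_explicit(p, memory_order_relaxed);
        while (!atomic_compare_exchange_weak_explicit(
                   p, &old, old + n,
                   memory_order_seq_cst, memory_order_relaxed)) {
            /* on failure, 'old' is reloaded with the current value; retry */
        }
        return old;
    }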
The term of art for this technique is "idiom recognition" and it's proper ancient, like, APL compilers did have some idiom recognition 50+ years ago.
An example you'll see in, say, a modern C compiler: if you write the obvious loop to calculate how many bits are set in an int, the actual machine code on a brand new CPU should be a single population count instruction. C provides neither intrinsics (like Rust) nor a dedicated "popcount" feature, so you can't write that directly, but it's obviously what you want here, and yup, an optimising C compiler will do that.
However, LLVM is dealing with an IR generated by other compiler folk so I think it probably has less use for idiom recognition. Clang would do the recognition and lower to the same LLVM IR as Rust does for its intrinsic population count core::intrinsics::ctpop so the LLVM backend doesn't need to spot this. I might be wrong, but I think that's how it works.
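(To make that concrete, here's the classic shape of the loop - my own sketch, nothing from the article. At least LLVM's loop idiom recognition is known to match this `x &= x - 1` form and turn it into a ctpop, which then becomes a single population-count instruction on targets that have one; I believe recent GCC does something similar.)

    /* The classic (Kernighan) bit-counting loop: clears the lowest set bit
       each iteration. Optimising compilers can recognize this idiom and
       replace the whole loop with a population-count instruction. */
    unsigned bits_set(unsigned x) {
        unsigned count = 0;
        while (x) {
            x &= x - 1;  /* drop the lowest set bit */
            count++;
        }
        return count;
    }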
> An example you'll see in, say, a modern C compiler: if you write the obvious loop to calculate how many bits are set in an int, the actual machine code on a brand new CPU should be a single population count instruction. C provides neither intrinsics (like Rust) nor a dedicated "popcount" feature, so you can't write that directly, but it's obviously what you want here, and yup, an optimising C compiler will do that.
C compilers definitely have intrinsics for this, for GCC for instance it is `__builtin_popcount`.
And apparently there's even standard language support for it since C23: it's `stdc_count_ones` [1], and in C++ you have `std::popcount` [2].
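(For the curious, the C23 spelling looks roughly like this - a sketch assuming a toolchain new enough to ship <stdbit.h>, which is still far from universal.)

    #include <stdbit.h>  /* C23 bit utilities */
    #include <stdio.h>

    int main(void) {
        unsigned int x = 0xF0F0u;
        /* stdc_count_ones is a type-generic macro over the unsigned types */
        printf("%u\n", (unsigned) stdc_count_ones(x));  /* prints 8 */
        return 0;
    }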
The existence of platform-specific hacks is not interesting. In reality what happens is that software which has at any point cared about being portable doesn't use them.
But yes, stdc_count_ones is indeed the intrinsic you'd want here (it arrived only a few years after I stopped writing C), so thanks for mentioning that.
std::popcount is C++, but it's also kinda miserable that it took until C++ 20, and even then they only landed the unsigned integer types. That's despite C++ 20 also insisting that signed integers have two's complement representation, so the signed integers do in fact have these desirable properties; you just can't use them.
> In reality what happens is that software which has at any point cared about being portable doesn't use them.
I don't think this generalization is actually true. Fast portable software compiles conditionally based on the target platform, picking the fast platform-specific intrinsic, and falls back to a slow but guaranteed portable software implementation. This pattern is widespread in numerical linear algebra, media codecs, data compressors, encryption, graphics, etc.
Maybe we are just quibbling over semantics but the compiler intrinsic here is '__builtin_popcount'. 'stdc_count_ones' is a standard library element that presumably will be implemented using the intrinsic.
And FWIW, all major C/C++ compilers have had an intrinsic for this for a long time. In Clang it even has the same name; in Visual Studio it's something like just '_popcount'. So it has long been easy to roll your own macro that works everywhere.
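Something like this, for instance - a sketch from memory, so treat the MSVC spelling as an assumption (I believe it's `__popcnt` in <intrin.h>, with the caveat that it emits the POPCNT instruction unconditionally, so truly old CPUs would need a runtime check):

    #include <stdint.h>

    #if defined(_MSC_VER)
    #  include <intrin.h>
       /* assumes the CPU supports POPCNT; otherwise pick the fallback at runtime */
    #  define POPCOUNT32(x) ((int)__popcnt((unsigned int)(x)))
    #elif defined(__GNUC__) || defined(__clang__)
    #  define POPCOUNT32(x) __builtin_popcount((unsigned int)(x))
    #else
       /* portable fallback: classic SWAR bit count */
       static inline int popcount32_fallback(uint32_t x) {
           x = x - ((x >> 1) & 0x55555555u);
           x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
           x = (x + (x >> 4)) & 0x0F0F0F0Fu;
           return (int)((x * 0x01010101u) >> 24);
       }
    #  define POPCOUNT32(x) popcount32_fallback((uint32_t)(x))
    #endif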
Yes, just semantics. But I don't think I can agree that because you could have ensured this works portably people actually did. That's not been my experience.
Yesterday I watched that "Sea of Thieves" C++ 14 to C++ 20 upgrade story on YouTube; that feels much more like what I've seen - code that shouldn't have worked but did, kept alive by people whose priority is a working game.
(amomax is the atomic fetch-max instruction. lr and sc are load-reserved and store-conditional instructions; sc is like a regular store except it only succeeds if the address was not modified since the previous lr that accessed it. IOW the assembly is basically one-to-one with the C source.)
CodeWeavers, which I am very lucky to be part of, is also an (almost) 30-year-old small company! We make Wine, Proton, and CrossOver; you might've heard of them. And I think we are hiring as well, please have a look!
True but memory requirements grow with sequence length. For recurrent models the memory requirement is constant. This is why I qualified with "low memory".
I (kind of) know multiple languages and they all have different word orders. I find it interesting that my brain is able to switch from expecting information to be received in one order to another. Each word order feels normal to me in its respective language, mix them and they will feel weird. It's like my brain is able to process information in different orders, but there are feature flags to enable them based on the language used.
I suspect that as well, however, 3.5-turbo-instruct has been noted by other people to do much better at generating legal chess moves than the other models. https://github.com/adamkarvonen/chess_gpt_eval gave models "5 illegal moves before forced resignation of the round" and 3.5 had very few illegal moves, while 4 lost most games due to illegal moves.
Okay, I can't believe I am going to defend performance reviews (I hate them with a passion), but I actually disagree with the author's main point. The same accomplishments can be colored good or bad, but that in itself isn't wrong. You could've moved a mountain with a teaspoon, but that's pointless if you don't work for a mountain-moving company. i.e. performance isn't just what you have done, but also whether that aligns with the goals of your employer.
(Of course there's the problem where the capitalistic system forces people to work and do things that aren't necessarily aligned with their personal goals and values, just to have a roof over their head and food on their table. But that's a whole different story.)
(And then there's also the problem where people will abuse the review system for their own benefits...)
You could work for a mountain moving company and your boss could still find issues with how the mountain you moved wasn’t the right height, or had rocks that didn’t quite match the destination.
I've had managers like this. It's a sort of uncontrolled OCD if I'm being honest, particularly painful as well if you know it's a kind, deeply well-intentioned person... unable to control their OCD. People with OCD applied to a technical field can be brilliant superstars, as is the case here. Of course I said, "maybe better if we part ways". And as I left, I got grilled on, let's say by way of example, how the company NAS worked. Then it became me giving a lesson in NAS 101 while simultaneously explaining how the setup was kosher, because he didn't know jack about NAS setup but had deep anxiety that there was a problem. It's like asking a surgeon "I don't trust you, how do I know you're a good surgeon" - "you need to explain why you're a good surgeon but also explain surgery on the whole to me at the same time, because I don't know it"... after that I cried all the way to a better job with full trust.
Now you all think it's bad in tech; now try medicine. They use "360 reviews" and your "boss" is a bureaucratic admin. It's all politics. If you throw a rock, you'll hit ten people who use reviews as weapons. Telling people off face to face is frowned upon. Bottling up the rage for the day the review comes, that is the "safe" method to blow off grievances. At which point the narrative of their grievance becomes hyperbolic. Medicine itself is saturated with sociopaths, type A personalities, OCD, so there's a lot of people with a lot that they get mad about.
If the boss wants you to be the fall guy, there's nothing you can do. If another department decides they don't like you, or they make you the fall guy, your boss will weigh the politics of the situation, and if you're not politically as important as that department, they'll gladly sacrifice you as a pawn towards their goal.
I once got written up due to my deep, seething hatred of women. The person writing it (now a cancer doctor), you see, did not claim any word or action against women, for I had committed none. They had seen me make eye contact with a male coworker a single time and thought, "why was he not making eye contact with the FEMALE coworker", and went on a long, long rant built upon this event regarding my hatred of women, perhaps due to an innate psychic ability to read my thoughts, or at least that's how it read. In medicine, that writeup is then held against you. I'm a Republican now, in part because of this type of culture and the power people can wield playing games such as this.
I had the pleasure of reverse-engineering win32 SRWLOCKs, and based on the author's description of nsync, it is very close to how SRWLOCK works internally. I'm kind of surprised how much faster nsync is compared to SRWLOCK.
The post doesn't include any benchmarks for the uncontended case, where I've found SRWLock is also very fast; I'm curious why that case wasn't covered.
At least for what I use locks for, the uncontended case is like 10000x more common. I actually don't think I have any super heavy contention such as the case shown in the post, as this is simply something to be avoided - no matter how good the mutex is, heavy contention won't play well with the memory system.
This is pretty dumb on Debian's part. First of all, I don't understand why they insist crate dependencies must be pulled from their repository. They are just source code, not built binaries. AFAIK no other distro does this; what they do instead is download crates from crates.io (`cargo vendor` is a command that does this automatically) and build against that. Arch does this, Gentoo does this, NixOS does this, why does Debian have to be different?
Secondly, even if they have to use crates from their repository, I don't understand what's so hard about just having multiple versions of the same crate. That would solve the problem too.
This is just all-around weird what Debian is doing.
(Full disclosure, I am the one who introduced the first piece of Rust code into bcachefs-tools)
Debian has a Social Contract [1] as well as guidelines (the DFSG) [2] regarding the commitment to only distribute free and open source software that all package maintainers must adhere to. This means that package maintainers must check the licenses of source code and documentation files, clear up any ambiguities by talking to upstream, and (as a last resort) even excise code and/or documentation from Debian's copy of the codebase if it doesn't meet the requirements of the DFSG.
In practice, this means that Debian has to make its own copy of the source code available from a Debian-controlled repository, to ensure that no (accidental or otherwise) change to an upstream source archive can cause non-DFSG compliant Debian source or binary packages to be distributed.
But it's not guaranteed. The Debian way provides a method of allocating responsibility. So if anything does go wrong they can point to a responsible party, the package maintainer. By providing tarball source you're trying to deflect responsibility for some of the code. You could build those tarballs on a different machine/different OS and any issues wouldn't technically be your problem because "it's just deps".
> Arch does this, Gentoo does this, NixOS does this, why does Debian have to be different?
I say this as someone who ran Gentoo for years and daily drives Arch today.
Because sometimes you don't want entire swaths of your server being rebuilt/tinkered with on a regular basis under the hood. "Move fast, break everything" is great in dev/test land, or a prod environment where the entire fleet is just containers treated like cattle, but contrary to what the SREs of the valley would believe, there's a whole ecosystem of 'stuff' out there that will never be containerized, where servers are still treated like pets, or rather, at least "cherished mules", that just do their job 24/7 and get the occasional required security updates/patches and then go right back to operating the same way they did last year.
> AFAIK no other distro does this; what they do instead is download crates from crates.io (`cargo vendor` is a command that does this automatically) and build against that.
AFAIK, most traditional distributions do that, not just Debian. They consider it important that software can be rebuilt, even in the far future, with nothing more than a copy of the distribution's binary and source packages. Doing anything which depends on network access during a build of the software is verboten (and AFAIK the automated build hosts block the network to enforce that requirement).
Keep also in mind that these distributions are from before our current hyper-connected time; it was common for a computer to be offline most of the time, and only dial up to the Internet when necessary. You can still download full CD or DVD sets containing all of the Debian binaries and source code, and these should be enough to rebuild anything from that distribution, even on an air-gapped computer.
> Secondly, even if they have to use crates from their repository, I don't understand what's so hard about just having multiple versions of the same crate. That would solve the problem too.
That is often done for C libraries; for instance, Debian stable has both libncurses5 and libncurses6 packages. But it's a lot of work, since for technical reasons, each version has to be an independent package with a separate name, and at least for Debian, each new package has to be reviewed by the small ftpmaster team before being added to the distribution. I don't know whether there's anything Rust-specific that makes this harder (for C libraries, the filenames within the library packages are different, and the -dev packages with the headers conflict with each other so only one can be installed at a time).
There's also the issue that having multiple versions means maintaining multiple versions (applying security fixes and so on).
> There's also the issue that having multiple versions means maintaining multiple versions (applying security fixes and so on).
This is the most important part. Debian LTS maintains packages for 5 years. Canonical takes Debian sources, and offers to maintain their LTS for 10 years. Red Hat also promises 10 years of support. They don't want anything in the core part of their stable branches that they can't promise to maintain for the next 5-10 years, when they have no assurance that upstream will even exist that long.
If you want to move fast and break things, that's also fine. Just build and distribute your own .deb or .rpm. No need to bother distro maintainers who are already doing so much thankless work.
No arguments there, I'm more talking about what that means for the "build and distribute your own .deb or .rpm" part that follows. Why are the only options "do work for maintainers" or "provide a prebuilt package for the distro", what happened to "nobody said this needed to be done yet"?
No problem, if upstream doesn't want their software packaged for Distro X, nobody needs to do anything.
The thing about Linux filesystems, though, is that they consist of two parts: the kernel patch and the userspace tooling. Bcachefs is already in the kernel, so it's a bit awkward to leave out bcachefs-tools. Which is probably why it got packaged in the first place. Stable distros generally don't want loose ends flailing about in such a critical part of their system. If nobody wants to maintain bcachefs-tools for Debian, Debian will probably remove bcachefs entirely from their kernel as well.
I’d consider the issue to be the opposite. Why does every programming language now have a package manager and all of the infrastructure around package management rather than rely on the OS package manager? As a user I have to deal with apt, ports, pkg, opkg, ipkg, yum, flatpak, snap, docker, cpan, ctan, gems, pip, go modules, cargo, npm, swift packages, etc., etc., which all have different opinions of how and where to package files.
On packaged operating systems (Debian, FreeBSD) - you have the system’s package manager to deal with (apt, pkg respectively). I can have an offline snapshot of _all_ packages that can be mirrored from one place.
IMHO, every programming language having its own package system is the weird thing.
If you are a developer you almost always eventually need some dependencies that don’t ship with the os package manager, and once some of your dependencies are upstream source, you very quickly find that some dependencies of the sources you download rely on features from newer versions of libraries. If you have multiple clients, you may also need to support both old and new versions of the same dependencies depending on who the work is for. Package managers for a Linux distribution have incompatible goals to these (except maybe nix)
We want to make our software available to any system without every library maintainer being a packaging expert in every system.
The user experience is much better when working within these packaging systems.
You can control versions of software independent of the machine (or what distros ship).
Or in other words, the needs of software development and software distribution are different. You can squint and see similarities, but they fill different roles.
So every user has to be an expert in every package manager instead? Makes sense. Make life easy for the developer and pass the pain on to thousands of users. 20 years ago you may or may not support RPM and DEB and for everyone else a tarball with a make file that respected PREFIX was enough. (Obviously a tarball doesn’t support dependencies.)
Because OS packaging stuff sucks. It adds an enormous barrier to sharing and publishing stuff.
Imagine that I make a simple OS-agnostic library in some programming language and want to publish it to allow others to use it. Do I need to package for every possible distro? That's a lot of work, and might still not cover everyone. And consider that I might not even use Linux!
A programming language will never become successful if that is what it takes to build up a community.
Moreover, in the case of Rust, distros are not even forced to build using crates.io. The downside, however, is that they have to package every single dependency version required, which, due to the ease of publishing and updating them, have become quite numerous and change more often than distros would like.
The funny thing is that in the C/C++ world it's common to reimplement functionality due to the difficulty of using some dependencies. The result is not really different from vendoring dependencies, except for the reduced testing of those components, and yet this is completely acceptable to distros compared to vendoring. It makes no sense!
1. Because Windows/macOS/iOS/Android don't have a built-in package manager at the same granularity of individual libraries, but modern programming languages still want to have first-class support for all these OSes, not just smugly tell users their OS is inferior.
2. Because most Linux distros can only handle very primitive updates based on simple file overwrites, and keep calling everything impossible to secure if it can't be split and patched within the limitations of a C-oriented dynamic linker.
3. Because Linux distros have a very wide spread of library versions they support, and they often make arbitrary choices on which versions and which features are allowed, which is a burden for programmers who can't simply pick a library and use it, and need to deal with extra compatibility matrix of outdated buggy versions and disabled features.
From the developer's perspective with lang-specific packages:
• Use 1 or 2 languages in the project, and only need to deal with a couple of package repositories, which give the exact deps they want, and it works the same on every OS, including cross-compilation to mobile.
From the developer's perspective of using OS package managers:
• Different names of packages on each Linux distro, installed differently with different commands. There's no way to specify deps in a universal way. Each distro has a range of LTS/stable/testing flavors, each with a different version of the library. Debian ships versions so old and useless it's worse than not having them at all, plus bugs reintroduced by removal of vendored patches.
• macOS users may not have any package manager, may have an obscure one, and even if they have the popular Homebrew, there's no guarantee they have the libs you want installed and kept up-to-date. pkg-config will give you temporary paths to a precise library version, and unless you work around that, your binary will break when the lib is updated.
• Windows users are screwed. There are several fragmented package managers, which almost nobody has installed. They have few packages, and there's a lot of fiddly work required to make anything build and install properly.
• Supporting mobile platforms means cross-compilation, and you can't use your OS's package manager.
OS-level packaging suuuuuuuucks. When people say that dependency management in C and C++ is a nightmare, they mean the OS-level package managers are a nightmare.
>AFAIK no other distro does this; what they do instead is download crates from crates.io (`cargo vendor` is a command that does this automatically) and build against that.
Note your examples are all bleeding edge / rolling distributions. Debian and the non-bleeding-edge distributions go a different route and focus on reproducibility and security, among other things.
With the "get from crates.io" route, if someone compromises/hijacks a crate upstream, you're in trouble, immediately. By requiring vendoring of sources, that requires at least some level of manual actions by maintainers to get that compromised source in to the debian repositories to then be distributed out to the users. As you get in towards distributions like RHEL, they get even more cautious on this front.
debian is substantially older than all of those distros, and you named three that happen to, in my view, have been designed specifically in reaction against the debian style of maintenance (creating a harmonious, stable set of packages that's less vulnerable to upstream changes), so it's strange to say that debian is the odd one out
keeping a debian-hosted copy of all source used to build packages seems like a very reasonable defensive move to minimize external infrastructure dependencies and is key in the reproducible builds process
there's definitely a conflict with the modern style of volatile language-specific package management, and I don't think debian's approach is ideal, but there's a reason so many people use debian as a base system
also it seems like the idea of maintaining stable branches of software has fallen out of vogue in general
Part of a distro's job is to audit all the packages (even if just to a minimal extent), and in many cases patch them for various reasons. This is much harder if the source is external and there are N copies of everything.
Because Debian and similar distros have a goal of maintaining all of the software that users are expected to use. And this means they commit to fixing security issues in every single piece of software they distribute.
A consequence of this is that they need to fix security issues in every library version that any application they distribute uses, including any statically-linked library. So, if they allow 30 applications to each have their own version of the same library, even Rust crates, and those versions all have a security issue, then the Debian team needs to find or write patches for 30 different pieces of code. If instead they make sure that all 30 of those applications use the same version of a Rust crate, then they only need to patch one version. Maybe it's not 30 times less work, but it's definitely 10 times less work. At the size of the Debian repos, this is an extremely significant difference.
Now, it could be that this commitment from Debian is foolish and should be done away with. I certainly don't think my OS vendor should be the one that maintains all the apps I use - I don't even understand the attraction of that. I do want the OS maintainer to handle the packaging of the base system components, and patch those as needed for every supported version and so on - and so I understand this requirement for them. And I would view bcachefs-tools as a base system component, so this requirement seems sane for it.
It's pretty dumb (your words if you don't like them) not to understand (your words if you don't like them) that, how, and why different distributions "have to be different". Debian is not Arch or Nix. A tractor is not a race car is not a submarine, even though they are all vehicles.
This seems like a very basic concept for anyone purporting to do any sort of engineering or designing in any field.
If it were me, I would not be so eager to advertise my failure to understand such basics, let alone presume to call anyone else pretty dumb.