Worth noting that AMD also does OSS well in their NIC space. Onload and associated tools are on GitHub. Solarflare had the foresight long ago (>10 years) to be open source, and that stewardship continued through acquisitions. [1]
As an OSS contributor, having that software available (via tarball from their website) allowed me to make public Docker tooling and other projects [2]. I would be less inclined to do so with proprietary binary bundles.
It wasn't "open collaboration" until recently -- though they were great on support emails! New GitHub projects have been popping up from them even in the last few months! [3]
And thus I can independently and openly work on issues [4] and integrate their software with other tools [5] -- I did both just yesterday.
This all echoes the recurring Supabase thread about the value chain of OSS in companies [6]; there are a lot of reasons to do it.
However, the AMD CEO recently confirmed that Strix Point (laptop Zen 5) will be released later this year (it is currently being tested by AMD partners like Microsoft) and that it will include an NPU much faster than the ones in Hawk Point (Ryzen 8040 series) and Phoenix (Ryzen 7040 series).
As Strix Point is currently in testing at external partners, they must have at least preliminary versions of the software for it.
It's a 4x4 configuration of the separately available Xilinx/AMD AIE-ML accelerator, which is also available on the Xilinx Alveo platform IIRC.
VLIW vector units optimized for AI (hence the -ML suffix), with a runtime-reconfigurable network-on-chip that lets you optimize "streaming" data between individual cores.
The way "AMD IPU" device was implemented, which embeds the system in some new AMD CPUs, the previous drivers (which were Linux/RTOS only) didn't work.
I was actually in the middle of reverse engineering how it was exposed in Phoenix (7940hs) APUs to write custom driver for Linux based on the one shipped for windows.
How does it compare to CUDA in terms of versatility and compatibility? Most AI frameworks are built around CUDA (PyTorch, etc.), which is why Nvidia has a monopoly now. I fail to see AMD breaking through, but one can hope. Do you see a light at the end of the tunnel?
It's targeting the high-power-efficiency part of the spectrum, where Nvidia is much less present. There's some reasonably capable software support; in fact, the Linux driver linked here is significantly more capable than the "SDK" they have released so far for Windows.
Which makes sense, because under the Xilinx brand they have been selling accelerators and SoCs using AIE and AIE-ML cores for various use cases in the embedded world, and XRT lets you program those with pretty much plain C++.
For some AI tasks, they have provided high-level wrappers for models that use ONNX (IIRC), so those can be used immediately. This is not ROCm, and it has essentially none of the ROCm bullshit in terms of support. I seem to recall there's some work to integrate parts of ROCm (OpenCL, HIP) with XRT.
Definitely going to build some kernels with this and test drive it ASAP.
It is a matrix math accelerator. The 7000 series of CPUs had one that could perform 10 TOPS with models quantized to INT8, and the 8000 series can do 16 TOPS.
ONNX is a portable format for machine learning models.
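To make the ONNX part concrete, the flow the Ryzen AI docs describe (PyTorch model -> ONNX export -> ONNX Runtime) looks roughly like the sketch below. The "VitisAIExecutionProvider" name is my assumption about what the Ryzen AI SDK registers; without it, this simply runs on the CPU provider.

    # Rough sketch of the PyTorch -> ONNX -> ONNX Runtime flow described in the docs.
    # "VitisAIExecutionProvider" is assumed to be what the Ryzen AI SDK registers;
    # the fallback list means this runs on plain CPU when that provider is absent.
    import numpy as np
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
    torch.onnx.export(model, torch.randn(1, 64), "tiny_mlp.onnx",
                      input_names=["x"], output_names=["y"])

    wanted = ["VitisAIExecutionProvider", "CPUExecutionProvider"]
    providers = [p for p in wanted if p in ort.get_available_providers()]
    sess = ort.InferenceSession("tiny_mlp.onnx", providers=providers)
    out = sess.run(None, {"x": np.random.rand(1, 64).astype(np.float32)})[0]
    print(providers, out.shape)  # e.g. ['CPUExecutionProvider'] (1, 10)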
The developer documentation is there. I skimmed through it. It doesn't explain what "Ryzen AI" does, unless you happen to already know what it does.
Consider:
> Ryzen AI software lets developers take machine learning models trained in PyTorch or TensorFlow and deploy them on laptops powered by Ryzen AI processors using ONNX Runtime.
Realise that the negation is equally true: "[not] Ryzen AI software lets developers take machine learning models trained in PyTorch or TensorFlow and deploy them on laptops [not] powered by Ryzen AI processors using [other] runtimes", because pretty much everyone can run this stuff without Ryzen AI, including AMD customers.
And then conclude that they haven't explained what Ryzen AI does. I'm sure they've done something (that something being a matrix math accelerator makes a lot of sense, thanks), but they are assuming the reader already has a lot of context. The changelog gets very excited about ONNX which is how I figured that part out.
Is your argument “if I remove the context and then negate the details, then the sentence is meaningless…”?
Sure, I agree with that.
There is plenty of information in the statement:
> Ryzen AI software lets developers take machine learning models […] and deploy them on laptops powered by Ryzen AI processors […].
If I have a passing understanding of computers, I know everything I need to decide if this is interesting to me or not. It is for machine learning developers. It helps them deploy their models to a specific AI processor.
The sentence is meaningful both ways. The argument is that if [X] and [not X] are both true, then X doesn't rule out any possible universes. And if it doesn't rule out any possible universes, it doesn't tell us anything about what they have done, because universes where they didn't do anything are still valid.
That is a handy test of whether a sentence is marketing fluff or whether it contains data.
> If I have a passing understanding of computers, I know everything I need to decide if this is interesting to me or not. It is for machine learning developers. It helps them deploy their models to a specific AI processor.
I can already deploy models to my processor though, and it is an AMD processor that doesn't support Ryzen AI. I've used it to develop AI models too, since I have an AMD GPU and can't use that to develop AI models because it crashes the driver when I do. So while you can deploy models to a Ryzen AI processor, you can also deploy models to non-Ryzen AI processors. So, logically and practically, they haven't told us anything.
The issue is that, with the limited understanding you've taken on for your argument, you didn't realise that all computers can already do what they promised Ryzen AI processors can do.
Compare that to eightysixfour, who articulates what the thing actually does - lots of TOPS, and a 60% improvement in TOPS over the previous generation (10 to 16). His sentence contains information, and I doubt my CPU can match that TOPS performance. And I wasn't developing an INT8 model, which is probably also relevant.
Furthermore, in 10 years or so models will be much bigger and more power hungry, possibly with different architectures. So there will be models that we won't be able to deploy on many CPUs even if they are powered by Ryzen AI. So not only is the documentation on Ryzen AI not telling me what it can do, even what the documentation promises can't logically be done. Which isn't a problem, because that is more of a silly "gotcha!" style observation, but it really highlights how little useful information is in the docs' intro material and marketing copy when it comes to identifying what this thing does.
Second, if someone says "AI processor", the context makes it quite obvious that it is a dedicated AI accelerator, like a Coral TPU. Especially since the first sentence on the page defines it.
> AMD Ryzen™ AI Software enables developers to take full advantage of AMD XDNA™ architecture integrated in select AMD Ryzen AI processors.
Third,
> you didn't realise that all computers can already do what they promised Ryzen AI processors can do
In this text they didn’t promise Ryzen AI processors could do anything, they promised the Ryzen AI software would let you deploy models to Ryzen AI processors.
> The issue is with the limited understanding you've taken on for your argument, you didn't realise that all computers can already do what they promised Ryzen AI processors can do.
I have deployed machine learning models to the edge in production on accelerators and CPUs. I use Llamas on my home rig, both CPU and CUDA. I am quite aware of what they can do, which is why I already knew the performance of the Ryzen AI processor. You are the one whose limited understanding kept you from recognizing that “Ryzen AI processor” is a distinct thing from a CPU or GPU.
Your complaint is the equivalent of complaining that the home page for CUDA’s documentation doesn’t explain what a GPU is and why it is different from a CPU, which can do the same computations.
You also seem moderately confused about the use cases for these relatively low performance, but very low power ML accelerators. The first thing that shipped on Ryzen’s AI accelerator is a model from Microsoft that makes it look like you are making eye contact with a camera. Not a giant LLM. They don’t compete with GPUs.
> Second, if someone says “AI processor” the context is quite obvious that it is a dedicated AI accelerator, like a Coral TPU. Especially since the first sentence on the page defines it.
The first sentence doesn't define anything; the first sentence is "AMD Ryzen™ AI Software enables developers to take full advantage of AMD XDNA™ architecture integrated in select AMD Ryzen AI processors. Ryzen AI software intelligently optimizes AI tasks and workloads, freeing-up CPU and GPU resources, providing optimal performance at lower power."
Which is at least better than meaningless, but it still doesn't explain what Ryzen AI is going to do. An architecture isn't an end goal. Intelligently optimizing AI tasks might be, but frankly AMD has been so far behind in this space that I don't believe the vapid copy without an actual "we implement the such-and-such algorithm" or "the optimisation we refer to is ...". They're providing no evidence that they've optimised anything, and their track record inspires little confidence. In my experience their AI software can't do BLAS while my GPU is running an X server, so "optimised" could just mean that I will be turning my computer off by choice after I buy one. Which would be a welcome improvement, I admit, but they aren't exactly spelling out what the win is going to be.
But communicating that there is a part of a processor that processes tasks is not informative. They might not even process "AI" for all I know; the AI I want to run on my GPU right now isn't represented as an ONNX model. Not a problem, but it speaks to how low in information the headline copy is. "AI" isn't a real term, it is an umbrella. None of the AIs for the games I know of can be offloaded to Ryzen AI, for example. So there is some fairly specific set of things that Ryzen AI does which they aren't telling me. You are, but you obviously weren't involved in writing the documentation (or, I suppose, if you were, the evidence is that your editor murdered your work).
Now if you already know what the chip does then sure, you don't need any of the copy or documentation. But there has to be an on ramp for people who don't know and they didn't include anything useful.
> ... they promised the Ryzen AI software would let you deploy models to Ryzen AI processors.
You can't name a processor that models can't be deployed to. They're just big, slightly nonlinear operations. Every processor can do that. The sentence in the docs is literally information-free.
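To make that concrete, here's a toy illustration (random weights, nothing Ryzen-specific): a "model" reduced to its essentials is matrix multiplies plus a mild nonlinearity, and any general-purpose processor can execute it.

    # A toy "model": matrix multiplies plus a slight nonlinearity.
    # Any general-purpose processor can run this; weights are random, illustration only.
    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((64, 128))
    W2 = rng.standard_normal((128, 10))

    def forward(x):
        hidden = np.maximum(x @ W1, 0.0)  # linear layer + ReLU
        return hidden @ W2                # output layer

    print(forward(rng.standard_normal((1, 64))).shape)  # (1, 10)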
> I have deployed machine learning models to the edge in production on accelerators and CPUs. I use Llamas on my home rig, both CPU and CUDA. I am quite aware of what they can do, which is why I already knew the performance of the Ryzen AI processor. You are the one whose limited understanding kept you from recognizing that “Ryzen AI processor” is a distinct thing from a CPU or GPU.
And you know what is interesting about this long list? None of it involves reference to the marketing copy or documentation that I'm talking about. Because as far as a cursory glance goes, nothing explains what "Ryzen AI" does.
Just to circle back, you proffered a link to that documentation as a description of what these chips will do. Now I suspect you're about to argue that someone who isn't already in the field shouldn't bother with it. That says about as much as I have been saying about the quality of the documentation: it isn't very good. im3w1l back at the thread root had a very valid question.
> Your complaint is the equivalent of complaining that the home page for CUDA’s documentation doesn’t explain what a GPU is and why it is different from a CPU, which can do the same computations.
The CUDA homepage (https://developer.nvidia.com/cuda-toolkit), first paragraph, says "The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library."
10/10, I can tell what CUDA does. Does Ryzen AI do all this stuff? It allows me to deploy models, and I'd need all of that to deploy a model. I doubt it does. But the docs don't say what it does in the intro, so... I dunno. It is probably in there somewhere. They defined ROCm as an umbrella term that included ROCm software once in their ROCm docs, so anything is possible. Being unable to explain what it is they are doing is a real pattern with AMD in the AI space.
For comparison, Google's Edge TPU (found in the Coral USB accelerator for example) will do 4 INT8 TOPS [0], an Nvidia T4 will do 130 [1], and an A100 or A6000 will do 620 [2]. Fully utilized it could be expected to be radically faster and more efficient than CPU but still of course much slower than workstation/server hardware for these operations.
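Lining up those published INT8 figures with the NPU numbers quoted above gives a rough sense of scale (back-of-the-envelope only):

    # Back-of-the-envelope comparison of the INT8 TOPS figures quoted in this thread.
    tops = {
        "Ryzen 7040 NPU": 10,
        "Ryzen 8040 NPU": 16,
        "Coral Edge TPU": 4,
        "Nvidia T4": 130,
        "Nvidia A100/A6000": 620,
    }
    baseline = tops["Ryzen 8040 NPU"]
    for name, t in sorted(tops.items(), key=lambda kv: kv[1]):
        print(f"{name:18s} {t:4d} TOPS  ({t / baseline:5.2f}x the 8040 NPU)")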
I have a 7900 XT, and all my attempts at getting ROCm to work on different Linuxes seem to have failed at different points. Does anyone have a pointer to a clear explanation on how to get it to work?
> I have a 7900 XT, and all my attempts at getting ROCm to work on different Linuxes seem to have failed at different points.
Debian formed a ROCm Team a while ago and we are working hard on getting this to work out of the box, as in: apt-get install and you're done.
For various reasons, our initial focus was on RDNA2 and CDNA2 and earlier, mainly because we are building a CI network in which we test all of our packages (and their reverse dependencies) on actual cards, and we had to bootstrap the network infrastructure first. CI is central to the Debian ecosystem, but none of our official infrastructure can deal with the requirement of specific hardware attached to a machine -- that was never a factor, until now.
AMD has generously supported us with hardware donations, among which are RDNA3 cards. These are already in our possession, and it's just a question of person-time until these are integrated into our CI. We are already spec'ing the systems, though.
As of right now, the most recent versions are probably those in Debian 'testing' (our next release in development). But eventually, this shouldn't matter.
As with all other Debian development, we target the 'unstable' distribution for packaging work (similar to git HEAD). Packages eventually migrate from 'unstable' to the 'testing' distribution once they pass certain checks. And when Debian is ready for a new release, 'testing' will basically be tagged as '13 (trixie)'.
Ubuntu periodically syncs their packages from Debian 'unstable', so eventually they will have the same packages that Debian has.
Either way, we realize that running bleeding-edge distributions is probably not for everyone, so we intend to provide backports for both Debian and Ubuntu. Hence why I said earlier that eventually, it shouldn't matter. With "eventually" being sometime in the next few months.
(Side note: our own PyTorch does not yet have ROCm support; we're still working on a few dependencies.)
> (Side note: our own PyTorch does not yet have ROCm support; we're still working on a few dependencies.)
Can I clarify something. If we install a stable release of Ubuntu now, we wouldn't be able to get PyTorch to work with a 7900 XT/XTX? At least not until you've worked your magic and that's made its way to Ubuntu in a few months?
You wouldn't be able to use the PyTorch as built and provided by Ubuntu, as in 'apt-get install python3-torch' [1], as that version currently only has CPU support.
You could still use a version provided by a third party, for example directly from pytorch.org, or a containerized version provided by ROCm upstream [2].
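For anyone going the third-party route, a quick sanity check that the ROCm build of PyTorch is actually the one installed, and that it sees the card, looks like this (the index URL in the comment is illustrative; check pytorch.org for the current ROCm version):

    # Assumes a ROCm wheel from pytorch.org, installed with something like
    #   pip install torch --index-url https://download.pytorch.org/whl/rocm5.7
    # (illustrative URL; pick the ROCm version pytorch.org currently lists).
    import torch

    print(torch.__version__)          # ROCm builds usually carry a "+rocmX.Y" suffix
    print(torch.version.hip)          # None on CPU-only/CUDA builds, a version string on ROCm
    print(torch.cuda.is_available())  # ROCm devices are exposed through the torch.cuda API
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))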
I would love the answer to this question. I’ve tried with Debian and Fedora and gotten to different stages of the journey, but not successfully in either case.
I've been working in this space for ~10 years at this point and have been attempting ROCm since the Vega days (pretty much since initial releases of ROCm ~six years ago).
"Hello world" in torch? On a new install/card/config/even upgrade with a bit of fiddling I can get there (I have it more-or-less down at this point).
Actually using something in a real world use case? Hours, days, or give up. Often the third scenario where I end up going back to CUDA because I just need to get something done. Messing around with different driver versions, docker containers, ROCm versions, obscure environment variables, weird random cherry picks/patches and hacks from spelunking a bunch of GH repos/discussions/PRs/blog posts/etc. Feels like it never ends.
I talk about it quite a bit and I keep buying AMD hardware and putting time into this because I'm rooting for AMD. However, consider the CEO of Nvidia saying 30% of their cost/spend is on software. He gets it.
I'm pretty convinced at this point that AMD just doesn't have software in their DNA, with even die-hard "Team Red" Windows gaming users frequently complaining about the quality of drivers. Drivers. For gaming. On Windows. Enter >80% Nvidia market share in that market and >90% market share in ML.
They just don't get it and I'm pretty sure they never will. I expect Intel, Apple, and even new entrants in the field will be the ones to offer viable alternatives to the Nvidia/CUDA monopoly.
For OpenGL and Vulkan drivers on Linux, AMD sort of fell into a good software culture by accident by hiring a bunch of good people that brought the Linux (kernel) / Xorg / Mesa culture with them. I am pretty happy with the results there!
Not contradicting you about the other areas, although I do have some hope because things (well... mostly hardware) have already improved in the GPU department under Lisa Su.
Su has been CEO of AMD for nearly 10 years. Does not contribute to my confidence.
Of course, with the AI gold rush they're paying a lot of lip service to ROCm (pumping the stock price?), but a couple of examples:
1) They just added official ROCm support for the top-end consumer card they released over a year ago. Meanwhile in CUDA land, Nvidia back-ported support to CUDA 11.8 well before Ada Lovelace/Hopper even launched. It worked with everything out there on day of launch. Even the newest CUDA 12 and their drivers support anything with Nvidia stamped on it from the past 6-7 years. Literally everything.
2) Just check their repos, Docker Hub, etc. It's embarrassing for them. They used Python 3.10 for ROCm 5.7 because it's basically the bare-minimum/standard for any broader AI/ML project. Oh neat, they released ROCm 6 to support their year-old ~$1000 card! Python 3.9? Ubuntu 20.04? What the hell? PyTorch "hello world" is usable, but almost nothing else even runs.
It's a shame, because the $-to-raster-performance ratio is a lot better for AMD. Yet their market share is still so low. You would imagine their execs have to understand the reasons why the market isn't buying AMD? But they still appear to be neglecting the software side. Still no indication of an ML-based upscaler to compete with DLSS, for example. And it's not like they lack the funds to throw at this problem.
If I were on an AMD hardware team I would be (mentally) screaming at the software side of the house daily. I can't imagine how frustrating it must be to design and deliver such capable hardware only to have the software render it practically unusable.
AMD has a "ML based" upscaler called FSR. What exactly do you mean by "ML based" though? Note that since DLSS 2.0 Nvidia no longer uses a per-game trained neural network.
"However, the results were closer than this might suggest, with both upscaling modes often doing a decent job and delivering very similar levels of image quality. It was just the final 5% where DLSS 2 was that bit better."
To provide the full context from your own link, the sentence before says:
> And the end result? Nvidia’s DLSS 2 comprehensively beat FSR 2, coming out on top in almost every comparison and generally exhibiting less ghosting and other visual artefacts while preserving more detail.
Keep in mind that there's no objective way to determine how much worse FSR is. If FSR has slight shimmering and DLSS does not, is it "slightly worse" or "much worse"? I think that comes down to one's tolerance for shimmering artefacts, which may vary depending on whether you're playing for 5 minutes in order to compare, or whether you're playing for hours/days. The social sentiment on Reddit r/AMD and elsewhere seems to be that FSR is noticeably worse, and that aligns with my personal experience.
Do you want to use ROCm directly, or with something like PyTorch?
If PyTorch, keep in mind that the PyTorch+ROCm build is mostly self-contained, and you don't need to install every roc* package on your system for it to work. This confused me in the beginning, at least.
I would like to use PyTorch. If you got it working, could you please share some steps? I feel like I am losing my mind, and the overall quality of the software and docs seems really poor.
The pieces all have exciting versions-must-roughly-match properties. The official distribution idea is: pick exactly the recommended OS, install the binary package from AMD, and it'll be fairly solid.
I totally ignore that recommendation and sometimes have a happily working system, and sometimes it's a wreck. In particular, the upstream Linux driver (what you have anyway) and the DKMS-built one are sometimes rather divergent. The LLVM in ROCm has different behaviour from upstream LLVM too.
I'm cautiously optimistic that Debian's packaging will end up as turnkey just-works. Part of that is persuading AMD developers that breaking ABI has consequences, which it kind of doesn't under the install-all-of-ROCm-together model.
Right now, I use Debian with the same kernel version that the ROCm-recommended Ubuntu ships with, and build the DKMS driver source intended for Ubuntu on it. That may be the wrong thing to do, but it has also been totally solid, in YMMV fashion.
I have read many reports of people giving up on it, claiming that it is terribly buggy and has been for a while. The question is, under what circumstances did AMD test it and what does AMD consider "working". Probably it works on some specific setups under certain use conditions.
More than likely they are testing it on specific server setups with a dedicated GPU handling compute workloads. Think supercomputers and maybe cloud provider setups. That'd make sense if someone in AMD ranked the opportunities by $ value and started working down the list fixing bugs.
My anecdotal evidence with an AMD GPU is that there are serious problems when you use the card to run graphics and throw a few compute tasks at it: X11 will reliably crash and the kernel graphics driver enters some sort of corrupt state. I think that use case was beyond their design considerations to start with.
There is a hint that servers are the target, because they supported Linux first with ROCm and had a very tight support matrix of which OSes they expected people to use. Unfortunately, this strategy generates a lot of bad publicity, and I'd guess it has trouble gaining traction because people can't experiment cheaply with AMD hardware. They're progressing, though, and even with a dose of cynicism about their wherewithal, there is a lot in the pipe that has the potential to take on CUDA. Even APUs are promising.
And that's the problem. Seriously, it's not that hard: just make some goddamn software that you can build and use more or less anywhere. Why is it that I can relatively easily build Mesa with the Vulkan/OpenGL drivers for AMD cards, but their "ROCm" is an atrocious mess?
Honestly, it just stinks of incompetence or, if one were otherwise inclined in one's beliefs, disdain for their customers.
If AMD had hired Japanese engineers to make this, they would have had to close up shop, as they would all have had to perform seppuku.
ROCm hasn't been the problem in my experience. It's things like PyTorch on ROCm. It uses HIPIFY to convert CUDA code to HIP rather than having native support.
It "works", but it's an obvious second-class citizen, and the developers don't seem to take bug reports / PRs seriously.
I have the same card; the above works with everything I've thrown at it so far. I haven't even installed the amdgpu-pro drivers, btw; I only have RADV (which Steam installed by default).
I have a similar problem where it worked through Stable Diffusion / PyTorch (worked as in, didn't crash), but the resulting images are just full of garbage. I wish there was some sort of test tool or hello world that could be used to diagnose this stuff, but I didn't find anything.
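For what it's worth, in the absence of an official test tool, a minimal sanity check I'd reach for is comparing GPU results against the CPU rather than just checking that nothing crashes; a sketch:

    # Tiny ROCm/HIP sanity check: run a matmul on the GPU and compare with the CPU result,
    # since "didn't crash" and "computed the right numbers" are different failure modes.
    import torch

    assert torch.cuda.is_available(), "PyTorch sees no GPU (ROCm exposes it via torch.cuda)"
    a = torch.randn(1024, 1024)
    b = torch.randn(1024, 1024)

    ref = a @ b                          # CPU reference
    gpu = (a.cuda() @ b.cuda()).cpu()    # same computation on the GPU

    max_err = (ref - gpu).abs().max().item()
    print(f"max abs error vs CPU: {max_err:.2e}")
    print("OK" if max_err < 1e-2 else "MISMATCH: GPU results look wrong")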
Worth noting that the 7900 XT was released over a year ago, though. Having to wait half a hardware generation to actually use your hardware isn't ideal.
For folks who develop on Linux, do you have Linux as your primary OS, or do you dual-boot with Windows?
I'm thinking of starting to develop on Linux, but I'd love to know what the current strategy is for developing on or for Linux.
What I'm trying to ask is whether WSL2 and all the work Microsoft is doing with the Linux operating system has any bigger purpose beyond Docker and containers.
Linux has been my primary OS for more than a decade now - spanning web/mobile development, embedded Linux, desktop applications and microcontroller programming.
Honestly, the Linux ecosystem is more than mature enough - if not better - for most of the things you need as a developer, IMO.
The edge cases are usually when you need:
1) A specific proprietary toolchain/IDE for some microcontroller.
2) Windows-specific C++/.NET projects that don't work well in VS Code.
The shell and desktop environment (for me, KDE Plasma + the Yakuake drop-down terminal + the Dolphin file manager + KDE Connect) offer a far superior development experience to anything I found on Windows.
I have been told that Ubuntu-based distros are a bit finicky these days. So, here's the obligatory: "I use Arch btw."
I don't personally use either anymore, but my anecdotal experience helping people get into Linux is that Arch has been a lot less maintenance than Pop!_OS. Less experienced users just seem to wind up unknowingly getting themselves into a lot more trouble on Ubuntu variants, and before you know it there's some unbelievably convoluted situation where apt is locked up trying to resolve an impossible conflict. I understand that, more than likely, if I knew how the problem was created, I could untangle it quicker. But in my experience of being dumped onto a machine and trying to debug it, I absolutely hate how horribly tangled up apt can get.
Of course you can hose your system on Arch pretty easily, but it's not like it's difficult to hose your system with Debian or Ubuntu variants. The sneakier thing is that on Debian/Ubuntu variants, there will often be tools, be they third-party repositories or even utilities included with the OS, that will make certain bespoke changes to your system for the sake of e.g. installing proprietary drivers, and the combination of those changes will leave you with a snowflake system that may or may not explode when you go to do something complicated (like upgrading major versions). Whereas with Arch, _a lot_ of the things you can do that will hose the system will _feel_ like things that will hose the system, and a lot of the scary-looking problems are surprisingly easy to resolve.
Arch shouldn't be any better since obviously, you still need e.g. proprietary drivers or non-standard packages, right? But in my experience, using the AUR to grab those things yields a very good success rate with a low chance of impacting system stability. It's not _zero_ - you can definitely run into problems - but the AUR is generally more reliable for me than third-party dpkg repos have been, and this seems to still be true today.
I wrote a huge wall of text below that nobody will read because it's too long and meandering, but I really think it's time to push for immutable Linux. Arch is oddly stable for what it is, and in my opinion it's a testament to the power of simplicity. But, SteamOS and Fedora Silverblue are starting to show that using modern Linux technologies, you can bring the benefits of immutable operating system design to the Linux desktop today. There's obviously much work to be done, but in my mind the tipping point is either here or coming very soon.
> But, SteamOS and Fedora Silverblue are starting to show that using modern Linux technologies, you can bring the benefits of immutable operating system design to the Linux desktop today.
Amen. You hit the nail on the head when it comes to AUR vs. PPAs. SteamOS was also going to bring in support for Nix packages, so I definitely want to see how immutable desktops + Nix packages end up changing the Linux desktop.
Rolling releases are always going to have more problems than stable, tested releases. This is simply a mathematical and statistical fact.
For a newly switching user, choosing a stable Ubuntu-based distro is the best choice. It has the best software support, the best informational support (distro articles, googling problems), and the best quality (because of its popularity).
I have been using Mint since 2014 and I am still on the same install from 9 years ago. I update software packages whenever. With the big OS updates that come twice a year, I wait an extra 2 months just in case and then update to the new version. 9 years and counting.
And now imagine an Arch user and how many times something would break during 9 years of operation... It is very telling that, when discussing this topic, Arch users use arguments like "Arch doesn't really break that much" without truly realizing what others hear and notice. "THAT much", Karl!..
When dealing with Arch vs. Ubuntu, it is more about "mostly unmodified upstream software" vs. "Canonical-customized software", as opposed to "stay on the bleeding edge" vs. "be stuck with year-old packages".
I have been using the same Arch install on my desktop for the last 7 or so years. So far, the only time it broke for me was during the whole PulseAudio -> PipeWire jump, and half of that is probably due to my PulseAudio tweaks. There's always Manjaro (I use this on my laptop) or other semi-rolling-release distros too. You are still subscribed to the same repositories, but your updates lag by a month or two and are released to you only after they are happy with their tests.
> Rolling releases are always going to have more problems than stable, tested releases. This is simply a mathematical and statistical fact.
lol.
I feel bad responding with just "lol", but I hope you do actually see how this is quite silly. There is indeed no simple mathematical or statistical fact that says running out of date software with downstream custom patches is more reliable than running the latest version from the developer. If anything, it can cause a lot of difficult-to-detect stability issues that may not impact all users or all use cases.
Not only that, but a big problem with "stable" distros is that most people don't want to use e.g. OBS from 2 years ago, so they need some way to run the latest software. Flatpak or Snap? Sure, but then how do you use e.g. OBS plugins? Suddenly, you are back at "OK, maybe I need a PPA" at which point you need to do things that will inevitably make your system less stable because you are now running somewhat of a "snowflake" configuration that gets more unique every time you add a new PPA or non-trivial modification. Whereas in Arch, you just install it from the package manager, or at worst, AUR, which due to the vastly simpler package management system, is a lot less prone to breakage.
Let me summarize:
- I don't agree that rolling release distros are inherently less stable overall. Stable distros are "tested" but blood, sweat and tears can only go so far. For some really compelling evidence, please ask the Linux kernel folks how their 'LTS' project went. As it turns out, maintaining LTS software is non-trivial in and of itself, and it introduces new problems that did not exist originally.
- Even if rolling release distros were inherently less stable, the reduced need to rely on third-party repositories, packages and even out-of-package-manager installations due to the more up to date packages would offset a lot of that, considering a large part of the problem with Debian/Ubuntu is also just simply that installations get borked too easily.
- Even if that weren't true, Arch's much simpler package management has less of a tendency to get tangled up in impossible-to-resolve dependency issues. Part of this is due to the nature of the AUR vs third-party PPAs, and part of it is just that it's literally much simpler overall, so there's less to go wrong. (Arch packages often err on the side of being less modular, which has its downsides but it certainly simplifies many things.)
- Even if that weren't true, my general experience running mainline Linux and bleeding-edge packages for years as my primary operating environment suggests it's actually not very common to be hit with rolling-release breakages. Usually people who release software do not just blindly release broken shit. Yes, sure, regressions happen, this is a fact of life, but that doesn't mean you're better off using old versions of things. There's definitely a balance; you don't get to have the benefits of the newest fixes and never hit a regression. In my opinion, frontloading the pain of rather occasional regressions is well worth getting the benefits of the new versions sooner, especially if rolling back is easy. And that, among other reasons, is exactly why immutable Linux is the future.
P.S.: Yes, "that much" is a perfectly valid thing to say. I have spent many, many, many hours debugging Debian and Ubuntu issues, so the problem isn't that those distros never break and Arch does. The problem is that all distros break, and Debian and Ubuntu installs, despite being stable distros, mysteriously seem to have the most trouble, especially during upgrades. And zero isn't an option. Windows installations break too. Sometimes, especially recently, every Windows installation breaks at once.
Using Stable doesn't mean that you are not getting new releases of consumer software. It only means more vetting and the fact that fundamental changes (like switching from X to Wayland) are not going to happen all of a sudden. Read Ubuntu's or Mint's "What's new" posts to understand what kind of changes we are talking about.
There is a reason why Release Candidates exist. There is a reason why testing exists and why, despite that, there will always be things not working for the first time, that would require bug-fixing releases.
And yes, stable releases being more stable than bleeding edge is a mathematical/statistical fact. You can prove it yourself with the magic of a spreadsheet:
- Make a graph that grows X1% on average. This indicates the quality of the software (it is getting better, after all; otherwise there would be no point in updating). But that is the average growth. In practice it is random; daily it can grow or decline by X2, pretty severely as well, but on average it grows.
- You can play with X1 and X2 numbers.
- Calculate percentages that the update will be better than the previous state for both cases: (1) update 365 times, (2) update 2 times
- Also take into account that a drop in quality is not equal in strength to growth by the same number. There is a good reason why people are risk-averse. A drop in quality means all kinds of trouble and should be counted as negatively impacting the following X3 (play with this number) days.
If you do that, you will mathematically prove me right. :-)
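For what it's worth, here is a rough Monte Carlo version of that spreadsheet exercise; X1, X2 and X3 are the made-up knobs described above, and the numbers it prints depend entirely on how you set them:

    # Rough Monte Carlo version of the spreadsheet argument above.
    # Quality drifts up X1% per day on average with daily noise X2%,
    # and any regression is counted as hurting for the following X3 days.
    import random

    X1, X2, X3 = 0.05, 1.0, 7     # avg daily gain %, daily noise %, days a regression hurts
    DAYS, TRIALS = 365, 2000

    def pain(update_days):
        """Average accumulated 'regression pain' for a given update schedule."""
        total = 0.0
        for _ in range(TRIALS):
            quality = 100.0
            last_taken = quality                 # quality at the last update you installed
            for day in range(1, DAYS + 1):
                quality *= 1 + (X1 + random.gauss(0, X2)) / 100
                if day in update_days:
                    if quality < last_taken:     # this update was a regression
                        total += (last_taken - quality) * X3
                    last_taken = quality
        return total / TRIALS

    rolling = pain(set(range(1, DAYS + 1)))      # update every day
    stable = pain({182, 365})                    # update twice a year
    print(f"rolling pain ~{rolling:.2f}, twice-a-year pain ~{stable:.2f}")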
> P.S.: Yes, "that much" is a perfectly valid thing to say.
I don't think they want 0 breaks, but one thing those stable distros can do is isolate you from breaking changes. Think semantic versioning, but for package lists. Especially if there are packages that don't play well with each other; IIRC GTK2 vs. GTK3 was one such thing?
There's a reason a lot of people still use Debian etc. on servers. It is that they just chose a configuration that works, and they don't care for/about any breaking updates, as long as they get stable security patches etc.
And I'd like to share an anecdote - I installed Pop!_OS on a Samsung Galaxy Book3 Ultra (a combination of weird names) 2 days ago and I'm pleasantly surprised with everything that worked out of the box. The NVIDIA driver works, and switching between Intel and NVIDIA works out of the box with zero issues. They even have a nifty command line tool, system76-power.
There are some rough edges. For example, the notebook's speakers do not work, but audio works via bluetooth or connected to a TV. Apparently, Linux Kernel 6.8 should solve that, and I could install it via PPA, but... I'm too afraid to do it. I got old, my primary job is no longer as a dev, and I don't want to spend all that time fixing a Linux system. The Arch Wiki for NVIDIA is incredibly detailed, but Pop!_OS... solves that out of the box.
> I don't want to spend all that time fixing a Linux system. The Arch Wiki for NVIDIA is incredibly detailed, but Pop!_OS... solves that out of the box.
I feel you. When I got a new laptop, even I couldn't be bothered with the Arch install process, so I went with Manjaro - just so I wouldn't have to worry about hardware compatibility and would still get recent enough kernels.
Yeah the Arch line was a meme/joke but I do think there are better options than Ubuntu/Gnome these days. Manjaro for one. Linux Mint used to be good too. Haven't tried it recently though.
The main reason I wasn't recommending Ubuntu is because of its recent annoyances. I recently recommended Ubuntu to a Linux newbie, but after seeing him deal with having to add PPAs for some of the toolchains he needed, deal with Snap headaches for Steam, and see how GNOME takes away certain basic features like being able to edit the path in its open dialogs, it seemed like there are better options.
I see very little value in developing via WSL unless you work in an organization that requires software that only runs natively on Windows (which is becoming a rarity). Personally I only have a Windows install for some games - everything else I do is on Linux, both for work and home.
Everything about Windows for software development is worse in my experience. From basics like text rendering to more complex things like managing toolchains and project dependencies. I've done a bit of work porting software from MacOS and Linux to Windows in order to support Windows users (because there are still many) but holy shit, it's a nightmare.
I don't know yet if WSL is a death rattle for software development on Windows, but it feels like one.
And about WSL being a death rattle... I think it's the other way around. I have students who wouldn't touch Linux because they are already overwhelmed by learning all the complexities in web development, but WSL gives them a good playground. They can use all the available Ubuntu libraries, heck, they can even use quickemu inside Ubuntu inside Windows.
I think it's more of an embrace/extend/extinguish thing.
I use it every day, it's Windows-only, and I haven't found an alternative as easy and "free" (it requires an Nvidia card, but there is no subscription, unlike, say, Krisp.ai).
I've used Windows for a long time and recently switched to Linux as I feel Microsoft has lost its way with Windows. The Linux DEs have become pretty awesome over the years and, IMHO, they have surpassed Windows. That said, I still need some things that are Windows only and keep a Windows VM running on a remote system for the times I need it.
I bit the bullet and transitioned to Linux full time after Windows kept nuking my boot sectors every time it had to do an update, and I haven't looked back at all. Even gaming on Linux is continually getting better, with more and more competitive MP games with anti-cheat supporting Linux now thanks to the Steam Deck.
Unless you're working on something that is heavily Windows-specific (legacy .NET Framework, Win32, Visual C++, etc.) that requires Visual Studio and Windows as a runtime, developing on Linux is a much nicer experience.
I used to use Linux as my primary OS and sometimes dual-boot into Windows. However, I have since transitioned to using Windows directly with WSL2. It is seamless overall.
There are a few edge cases; e.g., it was a little annoying to set up LUKS encryption and it's a lot easier to just use BitLocker. There are other minor edge cases with permissions on volumes that are mounted from Windows into Linux.
Further, there are some nuances around networking, especially if you are running docker in WSL2.
Once you get over these hiccups, it's quite comfortable.
May I know what kind of development you do specifically for Linux from Windows? Honestly, all the personal projects that I do can easily be done on Windows, since web dev is agnostic. I do run the apps on a Linux machine, which is taken care of by Docker images.
I played with WSL2 and it's interesting to see that Windows can directly access the WSL2 Ubuntu directory.
I'm sure it has limitations; for example, once I tried an LLM model and WSL2 lacked support for amdgpu, so it did not have any GPU acceleration.
Linux desktop has come a long way. KDE, Gnome and newer shells that layer on top like Pop Shell are really a joy to use. I very rarely use Windows anymore except for gaming where there are still some compatibility issues. Steam has done great work here too. Battery life on a laptop is still an area where you might prefer to go the WSL route.
I am a very long-time user of desktop Linux as my only OS. I think the main challenge with using Linux as a desktop OS is just the desktop part of it. It's not grotesquely horrible, but if I wanted someone to think that it was, I'd know exactly what things to tell them to do to experience that. (It goes something like, buy an NVIDIA GeForce GTX 10xx series card and then try to run KDE Wayland on it.)
I'm always up for braindumping my thoughts, so here goes.
Hardware:
If you want to run Linux as a primary desktop OS, please think about your hardware choices. There's a surprising amount of tradeoffs, and while the Linux desktop is generally a mess, I'd say this roughly defines people who wind up loving it versus hating it. Even in ways you could never realize, it will just impact your entire experience.
- GPUs: If you go with AMD, you have to be aware that choosing the latest cards may not be the best choice. The software stack is rarely ever fully ready for a launch, and if you are not tracking the latest stable kernel it will take even longer to get kernel driver updates. That said, I strongly recommend AMD in general as AMD cards have been very reliable for me and have good drivers across the board once they're ready. NVIDIA has the advantage that you can pretty much get a card at launch and run it on Linux, one of the notable advantages of them staying out of mainline; on the other hand, if you want to keep up with the latest Linux kernels, you may find yourself unable to get the NVIDIA driver working sometimes, as it does not always work on mainline. Further, NVIDIA's drivers still have some issues on Wayland, although it's not nearly as bad as it used to be, though for compatibility reasons I would still not strongly recommend it.
- NICs: Intel NICs generally work pretty well. They're not always perfect, but they almost always have good mainline Linux support, and reliability issues are definitely an exception case. WiFi is always a bit more persnickety, but things have gotten a lot better. The particular on-board WiFi used by my current motherboard is actually not generally recommended (a MediaTek MT7922), but I've been tracking mainline Linux and the only issues I've run into were unrelated. (There was an annoying issue with 6.6.5; it was fixed by 6.6.6 a few days later.)
- Audio: Believe it or not, Linux audio has gotten very compelling lately, after decades of being probably the funniest joke about Linux. Be skeptical if you must, but Pipewire + Wireplumber is probably about as good as Linux audio has been, and has compelling pros and cons when compared even with commercial OSes. Bluetooth will probably work out of the box, and getting low latency "Pro" audio routing features without breaking your desktop audio no longer requires a bunch of specialized knowledge; nominally, it's a dropdown in your audio settings. That having been said... You may face some issues, especially with cutting-edge motherboards and laptops. In general, Linux does not adequately support laptops that need a lot of software DSP in the driver to sound good. In addition, some of the more advanced audio codec setups may not work correctly out of the box. This is NOT the case for the majority of desktop motherboards, but it is worth googling to see if you can find someone reporting that it works.
Software:
There are a lot of heated debates about what the best distro and desktop environments are. Normally, I'd recommend sticking with a stalwart like Debian, but in this changing environment I'm really not sure what I'd personally recommend. I run NixOS, which I think is a fantastic operating system for people who know they need it, but like Linux itself, I would not recommend it to anyone unless they're relatively sure it solves a problem they actually have. I pretty much knew immediately I wanted it when I heard about it.
But even when not considering NixOS, I strongly would recommend trying to find an "immutable" Linux distro that will work for you. This is because immutable Linux has proven to be a much better model for desktop usage that is far more robust and easier to reason about. A good example would be Fedora Silverblue. There was a nice looking project on here recently that may be of particular interest to developers, based on Silverblue, called Bluefin:
I can't necessarily personally recommend it, but the concept seems solid. Recommending a random recent project is a little risky, but it at least has the benefit that it's essentially just a spin of Silverblue that can be reverted back to plain old Silverblue safely as far as I know.
My personal choice for a desktop environment is none. I use SwayWM with a bunch of bespoke configuration. It's... ugly[1]. But it works, and for me, it works quite well.
But that is a horrible recommendation for someone who is trying to get into Linux by any measure. Instead, I recommend KDE. KDE has a fair amount of stability issues, but what I can say about KDE is that it's got a great balance of customization, features and polish for a desktop system, and I think that the current iteration of KDE is going to age well into the future. We'll see how that thought holds up, but I have my hopes.
If you are using an AMD card, I can also say that running KDE/Plasma Desktop in Wayland mode is a pretty good experience, if you can. It does come with some challenges and downsides, but there are some positives, especially if you like high DPI or high refresh rate screens, or you want to use OBS to capture or stream the screen, as it will be a lot more efficient using Pipewire on Wayland. At the very least, it's not a huge commitment since you can log out and jump back in with an X11 session through your display manager if anything goes too wrong.
P.S.: While I don't immediately recommend anyone to go and try NixOS, I can't say the same about Nix. Go and install Nix on your macOS or Linux machines! It's a great tool to have and is generally not obtrusive on a machine. Especially see the `nix develop` command. It drops you into a shell with the development tools used to build a package, so I can do, say, `nix develop nixpkgs#_86Box` to get all of the dependencies to build the 86Box emulator and just start building it with CMake.
It's a completely separate accelerator, though there's a possibility of maybe using it in combined fashion with HIP/ROCm OpenCL (I think there's an OpenCL compiler for AIE and AIE-ML, but I'm not sure how well it works yet).
Yeah, it's Xilinx AIE, stuff that was already supported on Linux - but it is connected differently on the 7040: the AIE module is attached directly to Infinity Fabric, masquerading as a PCIe device, rather than going through the existing PCIe driver, which assumed mediation by an on-FPGA CPU, etc. At least that's how it looks from what I've read in the XRT docs so far.
This driver provides both the necessary interface in place of an ARM core talking directly over AXI buses, and the glue logic to enable the use of XRT.
[1] https://web.archive.org/web/20180108154033/http://www.openon...
[2] https://github.com/neomantra/docker-onload
[3] https://github.com/Xilinx-CNS/sfptpd
[4] https://github.com/Xilinx-CNS/sfptpd/issues/6
[5] https://github.com/neomantra/nomad-onload
[6] https://news.ycombinator.com/item?id=39087837