Critical high-level stats such as errors should be scraped more frequently than every 30 seconds. It’s important to have scraping intervals at multiple time granularities: a small set of the most critical stats should be scraped closer to every 10s or 15s.
Prometheus has an unaddressed flaw [0], where the range given to rate functions must be at least 2x the scrape interval. This means that if you scrape at 30s intervals, your rate charts won’t reflect a change until a minute after.
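As a sketch of the multi-tier idea: Prometheus lets you override the global `scrape_interval` per job, so a small set of hot metrics can sit on a tighter loop (job names and targets here are made up):

```yaml
# Hypothetical prometheus.yml fragment: two scrape tiers.
global:
  scrape_interval: 30s        # default for everything

scrape_configs:
  - job_name: critical-stats  # errors and other top-level health signals
    scrape_interval: 10s
    static_configs:
      - targets: ["app:9100"]

  - job_name: everything-else
    static_configs:           # inherits the 30s global interval
      - targets: ["app:9101"]
```

With the 2x rule above, rate queries against the 10s job can use a 30s-1m range and still update quickly, while the 30s job needs at least a 1m range.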
"Scrape" intervals (and the plumbing through to analysis intervals) are chosen precisely because of the denoising function aggregation provides.
Most scaled analysis systems provide precise control over the type of aggregation used within the analyzed time slices. There are many possibilities, and different purposes for each.
High frequency events are often collected into distributions and the individual timestamps are thrown away.
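As a toy sketch of that last point (the bucket boundaries and names are invented, not any particular system's), high-frequency latency events can be folded into a histogram where only per-bucket counts survive the time slice:

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds; real systems tune these per metric.
BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, float("inf")]

def aggregate(samples):
    """Collapse (timestamp, latency) events into per-bucket counts.

    The individual timestamps are discarded; only the shape of the
    distribution within this time slice survives.
    """
    counts = [0] * len(BUCKETS)
    for _ts, latency in samples:
        counts[bisect_left(BUCKETS, latency)] += 1
    return dict(zip(BUCKETS, counts))

events = [(1700000000.0, 0.03), (1700000000.2, 0.03), (1700000001.9, 0.7)]
print(aggregate(events))  # two samples land in the 0.05s bucket, one in the 1.0s bucket
```

The trade-off is exactly the one described: you can no longer ask "when did the slow request happen?", only "how many slow requests were there in this interval?".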
Easy migration was promised but never delivered. Angular 2 was still full of boilerplate. “Migrating” an AngularJS project to Angular 2 is as much work as porting it to React or anything else.
So yes, people got burnt (when we were told that there would be a migration path), and I will never rely on another Google-backed UI framework.
Instability is one of the biggest but perhaps also the least understood downsides of NixOS, IMHO.
Contrary to the name, even the stable branch of NixOS can have problems while installing routine updates with `nixos-rebuild switch --upgrade`. In fairness, at least with NixOS you can normally roll back to a previous working configuration where you can try to fix or work around the problem if that does happen. It’s still painful if you have to do that, though.
Even if your routine updates all go smoothly, as you mentioned, each stable release is only supported for a very limited time window after the next one is out. NixOS doesn’t have any long-term support branch in the sense that some distros do. Again, you can overcome this to a degree by customising your configuration if you need specific versions of certain packages, but in doing so you’re moving back towards manually setting things up and resolving your own compatibility issues rather than having a distro with compatible packages you can install in whatever combination you want, which reduces the value of using a distro with a package repository in the first place.
To be clear, I’m a big fan of NixOS. I run it as my daily driver on a workstation where I do a lot of work on different projects for different clients. Its ability to have a clean, declarative description of what’s currently installed globally or for any given user or even when working in any given project directory for any given user is extremely valuable to me.
But it’s also fair to say that NixOS is not for everyone. It has been by far the least stable Linux distro I have ever used, in the sense of “If I turn my computer on and install the latest updates from the stable branch, will my computer still work afterwards?”. If you’re looking for a distro you can deploy and then maintain with little more than semi-automatic routine updates for a period of years then, at least for now, it is not the distro for you.
Very interesting to read this. I've never had breakage, but now I'm questioning whether this is the exception, not the rule.
On Ubuntu, every new version broke something; sometimes an update would make the computer boot to a blank screen. It was a terrible experience for early-days Linux users. This was many years ago, but it made me distrust most distros I tried. Except for NixOS.
I can only speak anecdotally, so it’s entirely possible that I’ve just been unlucky with this particular box, but I’ve seen a few quite serious issues going back over the past few years since I switched to NixOS as my primary OS.
Not so long ago there was some sort of problem with Hydra builds for a recent version of Node. That seemed to result in trying to build the whole thing locally on every update, taking a huge amount of time and then typically failing there as well.
I’ve seen things with Nvidia drivers vs Linux kernel versions as well. We did have a specific reason for choosing Nvidia for that particular workstation, but otherwise, I’d agree with popular advice to get AMD if you’re building a Linux box, just based on the frequency and severity of Nvidia driver issues we’ve seen here.
I’ve seen a few issues with Ubuntu upgrades over the years as well, and wouldn’t necessarily rate that much higher for stability. That’s always surprised me because IME Debian Stable is the gold standard — something I’ve trusted with our production servers for well over a decade now, from unattended upgrades to several major new releases, and barely seen a flicker of a hint of anything breaking in all that time. To be fair, I haven’t used Debian much on workstations, so I don’t know whether the kinds of issues I’ve experienced with NixOS and Ubuntu would have been more common if I had.
NixOS is mostly a rolling-release distro, like Arch, but it rolls a bit more slowly. You can opt into full rolling release with the "unstable" branch, which is very common. There's not a lot of benefit to "stable" IMO.
Er, no it isn't? Yes, unstable is rolling, but otherwise it has releases, like 25.11, which contain breaking changes. It cuts new releases quite quickly and drops old ones fast, but that doesn't make it a rolling distro.
This is why all the top/good password managers will alert you to: 1) password reuse between sites and 2) weak passwords. One can hope that users will listen to those suggestions. In an organization, you can enforce compliance.
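A minimal sketch of what such an audit does under the hood (the length threshold and names are invented for illustration; real managers use entropy estimates and breach-corpus lookups, not a bare length check):

```python
from collections import Counter

def audit(vault):
    """Flag reused and weak passwords in a {site: password} mapping."""
    reused = {pw for pw, n in Counter(vault.values()).items() if n > 1}
    findings = {}
    for site, pw in vault.items():
        issues = []
        if pw in reused:
            issues.append("reused")
        if len(pw) < 12:  # toy stand-in for a real strength estimator
            issues.append("weak")
        if issues:
            findings[site] = issues
    return findings

vault = {"bank": "hunter2", "email": "hunter2", "forum": "c7#kQ9xT!m2pWq4zL8d"}
print(audit(vault))  # "bank" and "email" are flagged; "forum" passes
```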
Maxed-out browser security settings and a good ad-blocker won’t protect you from getting phished or hacked any better than an encryption-audited, cloud-based zero-knowledge vault does, and with the latter a server compromise is irrelevant. All competent #1 cloud-based password managers are like that.
> All competent #1 cloud-based password managers are like that.
If you say so...
Sadly, a supply-chain attack could also make its way into the client you use to view your supposedly secure vault. Odds are they use npm, btw.
Phish-resistant MFA is worth mentioning. You and all your staff with access to critical credentials should have something like YubiKeys, so you can’t (as easily) get tricked into entering some TOTP (or email/SMS) code into a fraudulent website.
At least that raises the bar to “someone who can not only poison your DNS or MITM your network, but can also generate trusted TLS certs for the website domain they’re phishing for”.
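A toy model of why that works (this is not a real protocol implementation, and the HMAC here stands in for the hardware key’s signature): a TOTP code is just a number that works wherever it is relayed, while a WebAuthn-style assertion signs over the origin, so a response captured on a look-alike domain fails verification on the real site:

```python
import hashlib
import hmac

KEY = b"device-private-secret"  # stand-in for the YubiKey's key material

def sign_challenge(challenge: bytes, origin: str) -> bytes:
    # The authenticator mixes the origin it actually saw into what it signs.
    return hmac.new(KEY, challenge + origin.encode(), hashlib.sha256).digest()

def verify(challenge: bytes, origin: str, assertion: bytes) -> bool:
    return hmac.compare_digest(assertion, sign_challenge(challenge, origin))

challenge = b"server-nonce-123"
# User is tricked into authenticating on a look-alike domain:
phished = sign_challenge(challenge, "https://examp1e.com")
# The real site rejects the relayed assertion: the signed origin doesn't match.
assert not verify(challenge, "https://example.com", phished)
```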
And SMS should be retired completely for authentication, not merely deprecated as NIST did in SP 800-63B, with companies like banks assuming full liability for losses to others if they continue using this unacceptably insecure mechanism.
"The lobby group for Australian telcos has declared that SMS technology should no longer be considered a safe means of verifying the identity of an individual during a banking transaction."
You can only avoid rotation on passwords that are MFA-protected.
If you implement a password manager, you must mandate auto-fill only and actively discourage (via training) copy/paste of credentials to a web site. Train the users to view “auto-fill not working” as a red flag. (This doesn’t apply to non-website credentials). Mandate all passwords to be auto-generated. Mandate that the only manually-entered password is the one for the password manager. Of course, you must have MFA on the password manager entry.
This will allow your users to comply with frequent password rotations much more easily. Auto-fill requirement/culture is critical to reducing phishing success, especially for tired eyes.
[0] - https://github.com/prometheus/prometheus/issues/3746