Hacker News | klempner's comments

>HDDs typically have a BER (Bit Error Rate) of 1 in 10^15, meaning some incorrect data can be expected around every 100 TiB read. That used to be a lot, but now that is only 3 or 4 full drive reads on modern large-scale drives. Silent corruption is one of those problems you only notice after it has already done damage.

While the advice is sound, this number isn't the right number for this argument.

That 10^15 number is for UREs, which aren't going to cause silent data corruption -- simple naive RAID style mirroring/parity will easily recover from a known error of this sort without any filesystem layer checksumming. The rates for silent errors -- where the disk returns the wrong data, which is the case that actually benefits from checksumming -- are a couple of orders of magnitude lower.


RAID would only be able to recover if it KNEW the data was wrong.

Without a checksum, hardware RAID has no way to KNOW it needs to use the parity to correct the block.


This is pure theory. Shouldn't BER be counted per sector or something like that? We shouldn't treat all disk space as a single entity, IMO

Why would that make a difference unless some sectors have higher/lower error rates than others?

For a fixed bit error rate, making your typical error 100x bigger means it will happen 100x less often.

If the typical error is an entire sector, that's over 30 thousand bits. 1:1e15 BER could mean 1 corrupted bit every 100 terabytes or it could mean 1 corrupted sector every 4 exabytes. Or anything in between. If there's any more detailed spec for what that number means, I'd love to see it.
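
Back-of-the-envelope (a rough sketch in Python; the 4 KiB sector size is my assumption, since nothing in the spec says what the error unit actually is):

    # Two readings of a "1 error in 1e15 bits" spec, assuming 4 KiB (32768-bit) sectors.
    BITS_PER_ERROR = 1e15          # the quoted rate, inverted
    SECTOR_BITS = 4096 * 8

    # If errors are isolated single-bit flips:
    print(BITS_PER_ERROR / 8 / 1e12, "TB read per corrupted bit")          # 125 TB

    # If each error event wipes out a whole sector, the same bit rate means
    # one event per SECTOR_BITS times as many bits read:
    print(BITS_PER_ERROR * SECTOR_BITS / 8 / 1e18, "EB read per corrupted sector")  # ~4.1 EB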


This stat is also complete bullshit. If it were true, your scrubs of any 20+TB pool would get at least corrected errors quite frequently. But this is not the case.

The consumer grade drives are often given an even lower spec of 1 in 1e14. For a 20TB drive, that's more than one error every scrub, which does not happen. I don't know about you, but I would not consider a drive to be functional at all if reading it out in full would produce more than one error on average. Pretty much nothing said on that datasheet reflects reality.
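
Spelling out the arithmetic (a quick sketch, treating errors as independent single-bit flips, which is itself an assumption):

    # Expected errors in one full read of a 20 TB drive at the quoted 1-in-1e14 BER.
    DRIVE_BITS = 20e12 * 8          # 20 TB (decimal, as marketed) = 1.6e14 bits
    BITS_PER_ERROR = 1e14           # common consumer-drive spec

    print(DRIVE_BITS / BITS_PER_ERROR, "expected errors per full scrub")   # 1.6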


> This stat is also complete bullshit. If it were true, your scrubs of any 20+TB pool would get at least corrected errors quite frequently. But this is not the case.

I would expect the ZFS code is written with the expected BER in mind. If it reads something, computes the checksum and goes "uh oh", then it will probably first re-read the block/sector, see whether the result now checks out, possibly re-read it a third time, and if all is OK continue on without even bothering to log an obvious BER-related error. I would expect it only bothers to log or warn about something when it repeatedly reads the same data that breaks the checksum.
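
Purely to illustrate the shape of the logic I'm imagining -- this is not ZFS code, just a made-up helper:

    # Hypothetical "re-read before complaining" pattern; read_fn/verify_fn are
    # whatever the caller uses to fetch a block and check its checksum.
    def read_with_retries(read_fn, verify_fn, retries=3):
        for _ in range(retries):
            data = read_fn()
            if verify_fn(data):        # checksum passes now: transient flip, nothing logged
                return data
        # Same bad data every time: give up and let the caller log it and
        # repair from redundancy, rather than counting it as ordinary BER noise.
        raise IOError("persistent checksum mismatch")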

Caveat Reddit but https://www.reddit.com/r/zfs/comments/3gpkm9/statistics_on_r... has some useful info in it. The OP starts off with a similar premise that a BER of 10^-14 is rubbish but then people in charge of very large pools of drives wade in with real world experience to give more context.


That's some very old data. I'm curious as to how things have changed with all the new advancements like helium drives, HAMR, etc. From the stats Backblaze helpfully publish, I feel like the huge amount of variance between models far outweighs the importance of this specific stat in terms of considering failure risks.

I also thought that it's "URE", i.e. unrecoverable with all the correction mechanisms. I'm aware that drives use various ways to protect against bitrot internally.


The almost as interesting takeaway I have (which I am sure is in their internal postmortem) is that they presumably don't have any usage of glibc getaddrinfo clients in their release regression testing.

The actual algorithm (which is pretty sensible in the absence of delayed ack) is fundamentally a feature of the TCP stack, which in most cases lives in the kernel. To implement the direct equivalent in userspace against the sockets API would require an API to find out about unacked data and would be clumsy at best.

With that said, I'm pretty sure it is a feature of the TCP stack only because the TCP stack is the layer they were trying to solve this problem at, and it isn't clear at all that "unacked data" is particularly better than a timer -- and of course if you actually do want to implement application layer Nagle directly, delayed acks mean that application level acking is a lot less likely to require an extra packet.
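
To illustrate, a minimal sketch of the timer flavor in Python: kernel Nagle turned off with TCP_NODELAY, small writes coalesced for an arbitrary 10 ms rather than being keyed off unacked data (the class and the 10 ms window are invented for illustration, not from any real stack):

    import socket, threading

    class TimerNagle:
        """Coalesce small writes for up to `delay` seconds, then flush them in one send().
        A timer-based stand-in for Nagle; nothing here looks at unacked data."""
        def __init__(self, sock, delay=0.010):
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # kernel Nagle off
            self.sock, self.delay = sock, delay
            self.buf, self.lock, self.timer = bytearray(), threading.Lock(), None

        def write(self, data: bytes):
            with self.lock:
                self.buf += data
                if self.timer is None:            # first small write starts the clock
                    self.timer = threading.Timer(self.delay, self.flush)
                    self.timer.start()

        def flush(self):
            with self.lock:
                if self.buf:
                    self.sock.sendall(bytes(self.buf))   # one burst instead of many tiny packets
                    self.buf.clear()
                self.timer = None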


If your application needs that level of control, you probably want to use UDP and have something like QUIC over it.

BTW, hardware-based TCP offload engines exist... Don't think they are widely used nowadays though


Hardware TCP offloads usually deal with the happy fast path - no gaps or out of order inbound packets - and fall back to software when shit gets messy.

Widely used in low latency fields like trading


Speaking of Slashdot, some fairly frequent poster back around 2001/2002 had a signature that was something like

mv /bin/laden /dev/null

and then someone explained how that was broken: even if that succeeds, what you've done is to replace the device file /dev/null with the regular file that was previously at /bin/laden, and then whenever other things redirect their output to /dev/null they'll be overwriting this random file rather than having their output discarded immediately, which is moderately bad.

Your version will just fail (even assuming root) because mv won't let you replace a file with a directory.


Sure, in the middle of a magnitude 9 earthquake I'd rather be in the middle of a suburban golf course (as long as it is far from any coastal tsunami) than any building, but I don't spend the majority of my time outside.

Two issues: 1. If you're making this choice during an earthquake, "outside" is often not a grassy field but rather the fall zone for debris from whatever building you're exiting. 2. If the earthquake is big/strong enough that you're in any real danger of building level issues, the shaking will be strong enough that if you try to run for the outside you're very likely to just fall and injure yourself.


As someone who is super nearsighted, the smaller screen on a phone is great for reading, especially in contexts like bedtime reading where I want to have my glasses off.

I have read many hundreds of books this way.

The problem with a tablet is that most tablets, especially the sort that are good for seeing entire as-printed pages at once, are too big for me to keep the entire screen in focus without wearing glasses. (With that said, foldables improve things here: since the aspect ratio bottleneck is typically width, being able to double the width on the fly makes such things more readable.)


Same here! Not to mention having ebooks on my phone means I can read anywhere, anytime. I read more, not less, lol.


There's a weird fetishization of long uptimes. I suspect some of this dates from the bad old days when Windows would outright crash after 50 days of uptime.

In the modern era, a lightly (or at least stably) loaded system lasting for hundreds or even thousands of days without crashing or needing a reboot should be a baseline unremarkable expectation -- but that implies that you don't need security updates, which means the system needs to not be exposed to the internet.

On the other hand, every time you do a software update you put the system in a weird spot that is potentially subtly different from where it would be on a fresh reboot, unless you restart all of userspace (at which point you might as well just reboot).

And of course FreeBSD hasn't implemented kernel live patching -- but then, that isn't a "long uptime" solution anyway, the point of live patching is to keep the system running safely until your next maintenance window.


> There's a weird fetishization of long uptimes. I suspect some of this dates from the bad old days when Windows would outright crash after 50 days of uptime.

My recollection is that, usually, it crashed more often than that. The 50 days thing was IIRC only the time for it to be guaranteed to crash (due to some counter overflowing).

> In the modern era, a lightly (or at least stably) loaded system lasting for hundreds or even thousands of days without crashing or needing a reboot should be a baseline unremarkable expectation -- but that implies that you don't need security updates, which means the system needs to not be exposed to the internet.

Or that the part of the system which needs the security updates not be exposed to the Internet. Other than the TCP/IP stack, most of the kernel is not directly accessible from outside the system.

> On the other hand, every time you do a software update you put the system in a weird spot that is potentially subtly different from where it would be on a fresh reboot, unless you restart all of userspace (at which point you might as well just reboot).

You don't need a software update for that. Normal use of the system is enough to make it gradually diverge from its "clean" after-boot state. For instance, if you empty /tmp on boot, any temporary file is already a subtle difference from how it would be on a fresh reboot.

Personally, I consider having to reboot due to a security fix, or even a stability fix, to be a failure. It means that, while the system didn't fail (crash or be compromised), it was vulnerable to failure (crashing or being compromised). We should aim to do better than that.


> My recollection is that, usually, it crashed more often than that. The 50 days thing was IIRC only the time for it to be guaranteed to crash (due to some counter overflowing).

I had forgotten about this issue (I never got a Windows 9x box to survive more than a few days without crashing), and apparently it was a 32-bit millisecond counter that would overflow after 49.7 days:

https://www.cnet.com/culture/windows-may-crash-after-49-7-da...
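
For what it's worth, the 49.7 days falls straight out of a 32-bit millisecond tick counter:

    # A 32-bit millisecond uptime counter wraps after 2**32 ms.
    print(2**32 / 1000 / 86400, "days")   # ~49.7 days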


> unless you restart all of userspace (at which point you might as well just reboot).

I can't speak for FreeBSD, but on my OpenBSD system hosting ssh, smtp, http, dns, and chat (prosody) services, restarting userspace is nothing to sweat. Not because restarting a particular service is easier than on a Linux server (`rcctl restart foo` vs `systemctl restart foo`), but because there are far fewer background processes and you know what each of them does; the system is simpler and more transparent, inducing less fear about breaking or missing a service. Moreover, init(1) itself is rarely implicated by a patch, and everything else (rc) is non-resident shell scripts, whereas who knows whether you can avoid restarting any of the constellation of systemd's own services, especially given their many library dependencies.

If you're running pet servers rather than cattle, you may want to avoid a reboot if you can. Maybe a capacitor is about to die and you'd rather deal with it at some future inopportune moment rather than extending the present inopportune moment.


There is a lot of OT, safety, and security infrastructure that must be run on premises in large orgs and requires four to five nines of availability. Much of the underlying network, storage, and compute infra for these OT and SS solutions runs proprietary OSs based on a BSD OS. BSD OSs are chosen specifically for their performance, security, and stability. These solutions will often run for years without a reboot. If a patch is required to resolve a defect or vulnerability it generally does not require a reboot of the kernel, and even so these solutions usually have HA/clustering capabilities to allow for NDU (non-disruptive upgrades) and zero downtime of the IT infra solution.


It's from a bygone era. An era when you'd lose hours of work if you didn't go file -> save, (or ctrl-s, if you were obsessive). If you reboot, you lose all of your work, your configuration, that you haven't saved to disk. Computers were scarce, back in those days. There was one in the house, in the den, for the family. These days, I've got a dozen of them and everything autosaves. But so that's where that comes from.


Home computers seem more scarce to me today than they did ~25 years ago.

Sure: People have smart TVs and tablets and stuff, which variously count as computing devices. And we've broadly reached saturation on pocket supercomputers adoption.

But while it was once common to walk into a store and find a wide array of computer-oriented furniture for sale, or visit a home and see a PC-like device semi-permanently set up in the den, it seems to be something that almost never happens anymore.

So, sure: Still-usable computers are cheap today. You've got computers wherever you want them, and so do I. But most people? They just use their phone these days.

(The point? Man, I don't have a point sometimes. Sometimes, it's just lamentations.)


> But while it was once common to walk into a store and find a wide array of computer-oriented furniture for sale, or visit a home and see a PC-like device semi-permanently set up in the den, it seems to be something that almost never happens anymore.

My experience is the opposite: due to the increasing popularity of PC gaming, furniture stores now carry gaming-oriented desks and chairs that they didn't sell before.


Are you sure you're not thinking of "SmartDay" days that are part of the SmartRate program?

Flex Alerts are CAISO and ultimately about grid stability. SmartRate/SmartDay are ultimately about the marginal cost of production for PG&E. The two are certainly correlated -- at the very least, a Flex Alert day is almost guaranteed to be a SmartDay.

Notably, the SmartRate program is capped at 15 days per year, and in practice PG&E will keep a few in reserve for surprise late season events, but even if there are no Flex Alert days they're still going to be called on electricity-is-expensive-even-if-the-grid-is-stable days.


> NEVER. EVER. EVER. Leave a test early.

It has been nearly 20 years, but my rule of thumb was that I wouldn't leave until I had done *three* review passes of the test. That is, quadruple checking, completing the exam and then reviewing my answers three times. That is pretty far into the diminishing returns for me catching my own errors.

That *almost* never happens, but there are exceptions -- sometimes they really do give way way more time than you need, especially if you are already strong at the material in question.

With that said, the key point is that the time tradeoff here for leaving early is terrible in typical college classes that have heavy weight on exams. Especially the first 10-20 minutes of double checking is very likely worth 5+ hours of homework time or study time in terms of points towards the grade.


Huh? There are plenty of good reasons to complain about the 5V5A thing, but "fire hazard" is not one of them.

It is even entirely within spec for a PD power supply to offer a 5V5A PDO, as long as it is only doing so with a 5A capable cable (i.e. 100W or 240W). 5V5A is no more a fire hazard than 20V5A.

The spec violation isn't that it negotiates 5V5A when available, but that it isn't willing to buck from 9V or 15V to get those 25W which means that power supply compatibility is incredibly limited.


Shame a 5v "fake" PPS voltage couldn't somehow be obtained or patched in. Loads of chargers would work then.

My pocket PD can request 5v5a from quite a few chargers in PPS mode.


It's a shame that RPi didn't just adopt a proper PD interface for power. For that matter, if they had USB-C + TB/USB4 with display support, then I could just plug it into my display without any other cables like I do my laptop, with all the peripherals connected to the display.


Any currently existing (to say nothing of two years ago) "TB/USB4" chipset would dramatically increase the price of something with a retail price on the order of $50.

With that said, DisplayPort Alternate Mode would be considerably more straightforward.


Apparently the RPi 5's SoC already supports USB-C display alt-mode... unfortunately they don't do proper PD negotiation, which would not be considerably more expensive. There are cheap vape pens that support PD negotiation properly.


Are you sure these "cheap vape pens" don't just use 5V3A, which doesn't require any PD negotiation at all? (a lot of them screw even that much up, and a lot of people confuse "PD negotiation" with simply having the right resistors on the CC pins)

There are real cost savings here -- the RPi5 avoided the need for a buck circuit, and for that matter probably a dedicated PD controller chip.

In contrast, in the context of a "cheap vape pen" you have a battery which means you need to be able to convert to (and from!) battery voltage, so you need that conversion circuitry anyway.


That would absolutely be a better solution but I meant in hindsight.


> negotiates 5V5A

Even the voltage doesn't match spec (the Pi power supply puts out 5.1 volts, not 5.0 volts!). That is because historically the Pi had shitty cables, with high resistance and voltage drops. 5V5A is not even in spec; limit for 5 volts is 3 amps!

> fire hazard than 20V5A

That would be 100 watts! Many people just grab any usbc cable, and solder it directly to GPIO power pins. But good luck with that!

Initial batches of Pi4 did not even have a resistor to request 3.0 amps!


5.1 volts is 2% off 5.0 volts. I don't have a copy of the USB specs, but a voltage 2% higher than nominal is almost certainly within specifications.


Pretty sure it's 10%, and many PSUs do put out 5.25 or 5.4 V for this exact reason


> That would be 100 watts!

The point is that power dissipation in a cable is a function of the current going through it. The cable will get exactly as hot carrying 5 amps with a voltage of 5 volts as it will carrying 20 or 48.

(now, that is more *wasteful* -- you lose the same amount of power to heat carrying 25W at 5V5A as you do at 100W 20V5A, but that's 4x the relative waste in power)
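
Rough numbers, assuming a 0.2 ohm round-trip cable resistance (an arbitrary illustrative value, not a measured one):

    # Cable heating depends on current alone; the voltage only changes how much
    # useful power rides along with that heat.  R is an assumed cable resistance.
    R, I = 0.2, 5.0
    loss = I**2 * R                              # 5 W dissipated in the cable either way
    for volts in (5, 20):
        delivered = volts * I
        print(f"{volts}V 5A: {delivered:.0f} W delivered, {loss:.0f} W lost "
              f"({loss / delivered:.0%} relative)")
    # 5V 5A:  25 W delivered, 5 W lost (20% relative)
    # 20V 5A: 100 W delivered, 5 W lost (5% relative) -> same heat, 4x the relative waste at 5 V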

> Many people just grab any usbc cable, and solder it directly to GPIO power pins.

You're not going to get *any* 5 amp mode out of a standard PD power supply unless the cable indicates it is 5 amp capable, which isn't going to happen unless that "any usbc cable" has the right emarker on it.

> limit for 5 volts is 3 amps.

There is no such limit.

What there is is two things: 1. There is a standard set of PDOs that a standard "X watt" PD power supply is supposed to provide: 5V3A, 9V3A, 15V3A, 20V5A (then 28, 36, and 48 volts for EPR), with the highest one limited to the power limit of the supply. These only go up to 3 amps until you get to 20 volts. 2. Devices are supposed to support those standard PDOs.

Anything other than those standard PDOs is optional (at least before 3.2 which starts introducing AVS as a requirement at 27W+). 12V support is common, as for that matter is PPS support. 5A support below 20V in fixed PDOs is 100% allowed but is super rare.

(5A lower voltage PPS is a different story, but unfortunately the RPi5 doesn't know how to negotiate 5V PPS. That is a shame because it would 100x its power supply compatibility because most chargers targeting higher end Samsung phones support it.)

A power supply is 100% allowed to support 5V5A. It just isn't required to. It would have been 100% legitimate for the RPi5 to have a buck circuit to handle a standard 27W 9V3A power supply and then turn that buck off if the power supply and cable support 5V5A.
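
To make the compatibility problem concrete, here's a rough sketch of that standard fixed-PDO ladder (SPR fixed PDOs only, ignoring PPS/EPR; the capping logic is my approximation of the power rules, not quoted from the spec):

    # Standard fixed PDOs a PD supply is expected to advertise, up to its rating.
    RUNGS = [(5, 3), (9, 3), (15, 3), (20, 5)]   # volts, max amps

    def offered_pdos(supply_watts):
        out = []
        for volts, max_amps in RUNGS:
            out.append((volts, round(min(max_amps, supply_watts / volts), 2)))
            if volts * max_amps >= supply_watts:
                break                            # top rung is capped at the supply's rating
        return out

    # A sink that refuses to buck and wants 25 W at 5 V still only ever sees 5V3A = 15 W:
    for w in (27, 45, 65, 100):
        print(w, "W supply:", offered_pdos(w))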

> Initial batches of Pi4 did not even have a resistor to request 3.0 amps!

To be precise, it had *a* resistor (connected to the shorted together CC pins) when it was supposed to have one separate resistor for each pin, and that broke cables with emarkers.

