
Back in the day, when I was cutting my teeth on embedded systems, I read an Intel Application Note - probably for the 8051. The section about managing watchdog timers stated that it is often good practice to deliberately let the watchdog time out periodically, at a convenient time. Then the system would be reset to a predictable state regularly, and failure states leading to unplanned watchdog timeouts would occur less frequently.

To this day, all my systems which are intended for prolonged unattended operation reset themselves at least once a day.
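
A minimal sketch of that kind of planned-reset policy, written in Python for readability rather than as real firmware. The names `MAX_UPTIME`, `QUIET_HOUR`, `MIN_UPTIME` and `should_reset` are made up for illustration, and the 03:00 "convenient time" is an assumption. On an actual microcontroller the equivalent of returning True would be to stop servicing the watchdog and let it time out.

```python
from datetime import datetime, timedelta

# Assumed policy values, not taken from any application note.
MAX_UPTIME = timedelta(hours=24)   # never run longer than this without a reset
QUIET_HOUR = 3                     # the "convenient time": 03:00 local time
MIN_UPTIME = timedelta(hours=1)    # guard against resetting repeatedly at 03:xx


def should_reset(boot_time: datetime, now: datetime) -> bool:
    """Return True when a planned reset is due."""
    uptime = now - boot_time
    if uptime >= MAX_UPTIME:
        # Hard limit reached: reset even if the time is inconvenient.
        return True
    # Otherwise prefer the quiet hour, but only after a minimum uptime.
    return now.hour == QUIET_HOUR and uptime >= MIN_UPTIME
```

The main loop would call `should_reset()` once per iteration and, when it returns True, simply stop kicking the watchdog so the reset goes through the same path as an unplanned one.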



The Payment Card Industry (PCI) standards that all credit card readers conform to mandate that they actually reboot at least every 24 hours :)

[edit] More specifically there are a number of things that have to happen every 24 hours - some memory has to be zeroed, firmware integrity verified, etc. Most vendors like Verifone [1] implement this by rebooting the reader at least every 24 hours on a timer.

[1] https://developer.verifone.com/docs/verifone-documentation/e...


I know speculating on the purpose of biological sleep is a common cliche on internet forums and in pop science, but what if this is why we sleep? Is there any evidence that something in our body "resets" when we do?


State is a well-known source of error after all.


> at least every 24 hours on a timer.

Shouldn't that be 'at most' or 'at least once'?

(My English skills are lacking so that was quite the mental stumble for me)


I understand how the way the original comment phrased things sounded, to your ears, like "at least 24 hours should pass between reboots," which is not what was intended. And your suggestion, "at least once every 24 hours," would be perfectly acceptable and idiomatic English.

However, "at least every 24 hours" is also perfectly acceptable and idiomatic English. It is very common in English to use this construction, "at least every X time period," to mean "X often or more often." If you say, "I make sure to call my distant relatives at least every year," you mean with that frequency or more. If you say, "I make sure to stand up and stretch at least every hour," you mean with that frequency or more. If you say, "these machines are required to reboot at least every 24 hours," you mean with that frequency or more.

In other words, "at least" does not qualify the number of hours (such that more hours would also be acceptable), but the frequency (such that greater frequency would also be acceptable).
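
Put as a check on the gaps between reboots, the frequency reading might look like the sketch below. The function name `meets_at_least_every` and the idea of feeding it a list of gaps are assumptions for illustration, not anything taken from the PCI text.

```python
from datetime import timedelta

LIMIT = timedelta(hours=24)


def meets_at_least_every(gaps, limit=LIMIT):
    """'Reboot at least every 24 hours': no gap between reboots exceeds the limit.

    Shorter gaps (more frequent reboots) are always acceptable.
    """
    return all(gap <= limit for gap in gaps)
```

So gaps of 12 and 24 hours pass, while a single 25-hour gap fails.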

Hopefully this response helps you to understand why English, fickle language that it is, works this way in this case.


It doesn't explain why that became an idiomatic way to say it. But at least it explains the what.


English is a highly irregular and challenging language unless you grew up with it. You have my sympathies.


'at least once every 24 hours' would be valid.

'at most every 24 hours' would imply reboots_per_day < 1 is bad and reboots_per_day >= 1 is good; i.e. that the standard was focused on preventing unnecessary reboots, but didn't care if the reader didn't reboot regularly, which I think is contrary to the comment's meaning.

'at least every 24 hours' is correct.


> ... would imply reboots_per_day < 1 is bad and reboots_per_day >= 1 is good;

I understood as that being exactly the point of the GP, rather than contrary. Because of the 'most' referring to the 24 hrs, not amount of reboots.

But I guess this is the point where my lack of English skill comes into play.


...I've just realised that I got the < and >= backwards while trying to explain that they were backwards. facepalm

> ... would imply reboots_per_day > 1 is bad and reboots_per_day <= 1 is good;

is what I meant to say!

But now I've taken a look at your phrasing ... I think I read your post as 'at most once every 24 hours', not 'at most every 24 hours' which is actually pretty ambiguous, and not as clear-cut as I was trying to say.


Yeah, my writing was quite ambiguous as well. I should've written out the full sentences.


You are correct


I remember a story from a couple of years ago about how 747s couldn't have more than like a week of uptime without running into catastrophic problems, and proggit and HN were in disbelief about the poor quality of the code and wondered how often it became an issue in flight. Turns out never, because the computers were restarted regularly for maintenance, and besides that they were designed to safely restart in the air just in case anyway.


"FAA: Reboot 787 power once in a while" (2016)

https://news.ycombinator.com/item?id=13094600


This also avoids bugs that are hard to find, like the Windows uptime bug that only happened after an integer count of ticks overflowed or something.


It was amusing that Windows was so unstable that it took decades for that bug to be discovered.


https://www.cnet.com/culture/windows-may-crash-after-49-7-da... (woah, cnet hosts articles from 20 years ago)
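
The 49.7-day figure is what you get when a 32-bit counter of milliseconds wraps around. A quick sketch of the arithmetic; the helper `elapsed_ms` is a made-up name showing the usual wrap-safe way to subtract such tick values, not code from Windows itself.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 ms in a day
COUNTER_RANGE = 2 ** 32            # distinct values of a 32-bit counter

days_until_wrap = COUNTER_RANGE / MS_PER_DAY
print(f"{days_until_wrap:.2f} days")   # prints "49.71 days"


def elapsed_ms(now_ticks: int, then_ticks: int) -> int:
    """Elapsed time between two 32-bit tick readings, correct across one wrap."""
    return (now_ticks - then_ticks) & 0xFFFFFFFF
```

Code that compares raw tick values with `<` or `>` instead of taking a masked difference like this is the kind that misbehaves once the counter wraps.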


Alas, “only” 7 years to find.


The issue with Windows was not its instability but its design. Unless you use Run commands and the Command Prompt, every adjustment is layers down a context menu that opens some control panel, which takes an excruciating 10-20 seconds just to appear, and Microsoft intentionally hid all the fine controls. While literally everything is solvable without rebooting, reproducing, tracing, researching and solving a problem takes too long because of the system design. If the problem isn't appearing often enough to warrant investing the time it takes to do it right, rebooting is the only rational choice. The first thing every Tier I help desk operator instructs is to reboot (because it's in their script, and because 95 times out of 100, it does the trick). "But I already tried that," "Please reboot again anyway." They're not wrong. Rebooting is a terrible solution. It's just that it usually works, and it takes a lot less time than doing things correctly.


The primary modus operandi for that time was: turn on your PC, work/game, turn off your PC. As Win9x was never a server OS, nobody treated it as such.

And when people whine about 9x being unstable they don't remember (or never even experienced for themselves) how awful the hardware it ran on was.

I "fondly" remember that some combinations of hardware were literally ticking time bombs; you never knew when they would BSOD. Though by that time I understood enough to know that if I saw CMSXXX.VXD failing, the problem was the cheap-ass sound card drivers, not Windows itself.


Well also Windows systems should be rebooting every month for security updates.


As best as I can recall from the mid-90s, security updates were significantly less common. Many people didn't even have an internet connection.


I do this with cloud VMs these days: it's particularly useful when there's third-party code in the mix. The theory is that your uptime represents a known, tested amount of operation (roughly), and as such every time you go beyond that in production but not in testing, you're getting into unknown space. Systems are too complicated to validate everything (see the classic Patriot missile system bug [1]).

So, you should never deploy anything to production which won't be rebooted more frequently than you reboot it in testing. In practice, you should probably reboot much more frequently, as frequently as possible, to keep the delta between "known good" and "mutated" as small as possible.

[1] https://en.wikipedia.org/wiki/MIM-104_Patriot#Failure_at_Dha...
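
For a feel of how uptime turns into error in that case, here is a rough sketch of the drift as the incident is usually described: time counted in tenths of a second, with 0.1 stored in truncated binary fixed point. The 23 fractional bits are an assumption taken from the commonly cited analysis, and the function names are made up for illustration.

```python
FRACTION_BITS = 23        # assumed number of bits kept after the binary point
TICKS_PER_SECOND = 10     # the clock counts tenths of a second


def truncated_tenth(bits: int = FRACTION_BITS) -> float:
    """0.1 chopped (not rounded) to the given number of fractional bits."""
    scale = 1 << bits
    return int(0.1 * scale) / scale


def drift_seconds(uptime_hours: float) -> float:
    """Accumulated clock error after the given uptime."""
    ticks = uptime_hours * 3600 * TICKS_PER_SECOND
    return ticks * (0.1 - truncated_tenth())


for hours in (8, 100):
    print(f"{hours:>3} h uptime -> {drift_seconds(hours):.4f} s drift")
```

Under these assumptions the error is a few hundredths of a second after 8 hours and roughly a third of a second after 100 hours, which is why a system only ever tested over short runs would not have shown the problem.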


man way back when we bragged about more than 2k days of uptime as evidence of resilience


Half the products people build don’t even last 2,000 days nowadays, but any system you build should have a continuous uptime well over 2,000 days. My DNS servers, for example, have been responding to DNS queries for the last 12 years on the same IP without more than a second of downtime.

It might be acceptable for AWS to crash every few months, but it’s not acceptable for my systems to be out for the length of a reboot.


Someone told me about a car where the ECU reset to a known state every revolution of the engine. "Rebooting" that often doesn't even feel like rebooting anymore, but they said that was the only way they felt it would be reliable enough.



