
I wonder how they actually found the reason for the failure. They had a system that worked (almost) perfectly and could probably be tested in every standard way without showing the problem. They must've had a seriously good logging system that showed something suspicious, or someone had a really interesting "a-ha" moment...

I'd like to hear the story of debugging this one, and also how they managed to identify that this incident was caused by that specific bug.


