I see discussion about who's at fault: Microsoft or Crowdstrike.
But one thing I don't get about this: what was the role of the enterprise admins?
Most administrators at large companies are cautious about rolling out new software versions to their employees. They (normally?) test before broad deployment.
Seems like one of three things would have had to happen for this to be missed:
1. Admins skipped testing this update prior to enterprise rollout.
2. Crowdstrike forced the update on unwilling users.
3. Crowdstrike does not provide a framework for such pre-rollout testing, and enterprises chose to use it anyway.
Can anyone offer insight?
[Disclosure: I'm a Microsoft employee, but not an enterprise admin]
> Most administrators at large companies are cautious about rolling out new software versions to their employees. They (normally?) test before broad deployment.
In my experience at both a 70,000-person company and a 260,000-person company, both of which I can confirm have outages right now, this just isn't the case.
The security vendor says update and sysadmins say "right away", because the institution has learned that "right away" is the only acceptable answer from auditors, both internal and external.
This story is interesting because there's an entire chain of places you can pass the buck and absolve yourself of responsibility if you so choose. You could, if you so desired, choose to blame:
1. The CrowdStrike developer who pushed the change
2. The developer responsible for the kernel bug
3. CrowdStrike as a company for not having better change management
4. Microsoft for how they handle kernel access
5. system admins for not owning the update process of their entire body of devices
6. security teams / the CISO for operating on checklists that exist to please auditors rather than treating security as a living, breathing problem
7. Auditors for structuring security audits as a checklist rather than treating security as a living, breathing problem
8. Regulators for using one size fits all audits as the preferred method of determining security compliance
As a former IT Manager (SRE last decade), honestly, most IT admins I know literally do nothing beyond clicking the Auto Update checkboxes and letting things churn until they break. I hate to put that career down, but I worked at all levels of it, starting from tech support in a call center. It's a very easy job to get complacent in with your skill set, get comfortable, and really just half-ass things. I have a lot of friends on IT teams, and most of them don't have any interest in what I do, like learning to write Go, Rust, or Python, or learning Kubernetes, Docker, etc. I tell them all the time how much money they could make if they really buckled down and learned to program, or just learned a cloud platform and Terraform. They all bitch about how much they hate their jobs because they're basically doing line tech support, but they're fine where they're at. I had horrible IT jobs, so I'm super sympathetic to it and always try to hire them when I can. I hired one last year (not a nepotism hire, I didn't know him), an IT guy wanting to move into SRE.
They just use Windows and let it do its thing. It's their day job and they don't work on improving, so their skill set is super subpar. To them it's all they need to do their job. I don't have that personality; I'm obsessively min-maxing things.
I even worked at MSPs (Managed Service Providers), so I did IT/network admin work for tons of companies in different cities. Every single MSP either puts everything on auto update or has a strict rule of NOTHING on auto update, which just means nothing ever gets updated until a customer calls in for $$ hourly support. You throw the updating in because you get more hours. Or you have a scheduled update ticket every month, etc.
I also saw a comment or a meme about Crowdstrike being able to update whenever wherever, no idea if that's true or not.
I wrote this post from my point of view, which is that of a Windows SysEng at 4-5 big webhosts over a decade. I had to write a TON of my own security, backup, and other services because Windows was so barebones, and at the fleet level I was managing (tens of thousands of servers), with the revenue webhosting makes, we definitely couldn't buy expensive software. Most of my customers paid $10/mo and we were cramming thousands of them onto one server, so licensing was a huge pain for any software.