After two major outages in as many weeks — including the CrowdStrike crash — alarm bells are ringing about the world's overreliance on Microsoft. Andrew Chan...
Yes. But what if the world was 1/3rd Linux, 1/3rd windows, 1/3rd OSX? Then potentially the overall failure would have been less, which I think the point of this piece was.
It has a little bit to do with the OS. Windows does not have the same sandboxing capability for modules that Linux provides. The fact that the sensor needs to run in ring 0 is a problem, and eBPF at least mitigates much of the issue in Linux. But I think you meant that CrowdStrike is by no means blameless, and I agree - they have a long history of shitty implementations, and rightly deserve to be the focus of our anger.
I know it has nothing to do with macos. I agree it’s the QA piece. I heard upper managements theme was “two feet on the gas”. Also the CEO was the CTO of McAfee when they had a similar issue back in 2010 if I’m not mistaken. 🙃
Hopefully there are a bunch of programmers there right now standing in a circle around the desk of some manager and bombarding them with a continuous chant of “We told you so!” We knew in the 1990s not to trust stuff coming in off the Internet to be what it claims or reach its destination unmangled, and as I understand it, the software was blindly attempting to parse unverified threat definition files it had downloaded. Doing it all in ring 0 was just that extra crowning touch. This should have been caught before it even got to QA.
The problem with that logic is that this failure was not caused by Microsoft, it was caused by ClownStrike. Their software works on Windows and Linux (not sure about Mac) and they fucked up the linux software a few weeks before the Microsoft incident.
Even if Linux had more market share in the affected endpoints they would still have been affected, just on different timelines I guess.
I’m not claiming it was Microsoft’s fault. I blame crowd strike. But freebsd is not windows. A bad patch could have had a different result on a different system. They’re different.
The only difference might be some linux distros hold two kernels, so you have a backup boot. And some have immutable system like A/B android so if boot fails it auto rollsback to the old working state
Yes. But what if the world was 1/3rd Linux, 1/3rd windows, 1/3rd OSX?
The 1/3 running macOS (they haven’t called in OS X in many years now) wouldn’t have to worry, because Apple provides kernel event access for security tools running in user space. The CrowdStrike Falcon Sensor driver on macOS runs as a System Extension, and runs 100% in user space (“Ring 3” in Intel parlance) only — so if it misbehaves, the kernel can just shut it down and continue on its merry way.
The problem with Windows (and to a certain extend Linux) is that Falcon Sensor needs to run in kernel mode (Ring 0) on those OS’s, and if it fucks up you lose all guarantees that the kernel and all of the apps running on the system haven’t been fucked with, hence the need for a full system crash/shutdown. The driver can (and did) put these systems in an indeterministic state. But that can’t happen on modern macOS with modern System Extensions.
You don’t have to run in Ring 0 to detect events occurring in Ring 0.
Besides which, as kexts are being obsoleted by Apple getting code to run inside Ring 0 in macOS that isn’t from Apple itself is going to be extremely difficult.
Yes. But what if the world was 1/3rd Linux, 1/3rd windows, 1/3rd OSX? Then potentially the overall failure would have been less, which I think the point of this piece was.
And if Crowdstrike had competent management who valued a proper QA department, the overall failure wouldn’t have happened at all.
This has nothing to do with OS. This is a result of corporate fuckery.
It has a little bit to do with the OS. Windows does not have the same sandboxing capability for modules that Linux provides. The fact that the sensor needs to run in ring 0 is a problem, and eBPF at least mitigates much of the issue in Linux. But I think you meant that CrowdStrike is by no means blameless, and I agree - they have a long history of shitty implementations, and rightly deserve to be the focus of our anger.
https://www.theregister.com/2024/07/21/crowdstrike_linux_crashes_restoration_tools/
IIRC those were the non-eBPF versions of the sensor.
I know it has nothing to do with macos. I agree it’s the QA piece. I heard upper managements theme was “two feet on the gas”. Also the CEO was the CTO of McAfee when they had a similar issue back in 2010 if I’m not mistaken. 🙃
Hopefully there are a bunch of programmers there right now standing in a circle around the desk of some manager and bombarding them with a continuous chant of “We told you so!” We knew in the 1990s not to trust stuff coming in off the Internet to be what it claims or reach its destination unmangled, and as I understand it, the software was blindly attempting to parse unverified threat definition files it had downloaded. Doing it all in ring 0 was just that extra crowning touch. This should have been caught before it even got to QA.
The problem with that logic is that this failure was not caused by Microsoft, it was caused by ClownStrike. Their software works on Windows and Linux (not sure about Mac) and they fucked up the linux software a few weeks before the Microsoft incident.
Even if Linux had more market share in the affected endpoints they would still have been affected, just on different timelines I guess.
I’m not claiming it was Microsoft’s fault. I blame crowd strike. But freebsd is not windows. A bad patch could have had a different result on a different system. They’re different.
Yes, they are different but as you can see it wasn’t smooth either: https://www.techspot.com/news/103899-crowdstrike-also-broke-debian-rocky-linux-earlier-year.html
I’m not sure how ClownStrike works on BSD, though .
The only difference might be some linux distros hold two kernels, so you have a backup boot. And some have immutable system like A/B android so if boot fails it auto rollsback to the old working state
The 1/3 running macOS (they haven’t called in OS X in many years now) wouldn’t have to worry, because Apple provides kernel event access for security tools running in user space. The CrowdStrike Falcon Sensor driver on macOS runs as a System Extension, and runs 100% in user space (“Ring 3” in Intel parlance) only — so if it misbehaves, the kernel can just shut it down and continue on its merry way.
The problem with Windows (and to a certain extend Linux) is that Falcon Sensor needs to run in kernel mode (Ring 0) on those OS’s, and if it fucks up you lose all guarantees that the kernel and all of the apps running on the system haven’t been fucked with, hence the need for a full system crash/shutdown. The driver can (and did) put these systems in an indeterministic state. But that can’t happen on modern macOS with modern System Extensions.
I see. How effective is a security tool that can’t stop malicious software that makes itself in ring 0?
You don’t have to run in Ring 0 to detect events occurring in Ring 0.
Besides which, as kexts are being obsoleted by Apple getting code to run inside Ring 0 in macOS that isn’t from Apple itself is going to be extremely difficult.
Right, but part of the appeal of tools like crowd strike and sentinelone is that they can stop them when they’re in ring 0. And rollback changes. Etc.