Microsoft Reveals More Details About Windows CrowdStrike Crash: It Comes Down to a Kernel

About a week after millions of Windows machines displayed the blue screen of death around the world, Microsoft has confirmed the root cause of the incident that grounded thousands of flights and disrupted numerous businesses and public services.

“Our observations confirm CrowdStrike’s analysis that this was a read-out-of-bounds memory safety error in the CrowdStrike developed CSagent.sys driver,” Microsoft explains in its technical analysis of the crash published Saturday.
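
For readers unfamiliar with the term, an out-of-bounds read happens when code reads past the end of the memory it was given. The sketch below is purely illustrative and is not CrowdStrike's or Microsoft's code: the `read_u32` helper and the sample buffer are hypothetical, and it uses Python only to show the kind of length check that guards against such a read. Python refuses the bad read on its own, but code written in a language such as C gets no automatic check, and in kernel mode a bad read can crash the whole machine.

```python
import struct
from typing import Optional


def read_u32(buf: bytes, offset: int) -> Optional[int]:
    """Return the little-endian 32-bit value at offset, or None if the read would leave the buffer."""
    # Reject any read whose 4 bytes would not fit entirely inside the buffer.
    if offset < 0 or offset + 4 > len(buf):
        return None
    return struct.unpack_from("<I", buf, offset)[0]


# Hypothetical 8-byte buffer holding two 32-bit values.
data = struct.pack("<II", 291, 7)

first = read_u32(data, 0)       # within bounds
second = read_u32(data, 4)      # within bounds
past_end = read_u32(data, 8)    # would start past the end, so it is refused
straddles = read_u32(data, 6)   # would run 2 bytes past the end, so it is refused
```

The design choice here is to return `None` instead of raising, so the caller must handle a malformed input explicitly rather than crash.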

As PC Mag reports, the analysis notes that CrowdStrike’s driver is a file system filter driver, a type of optional driver that attaches to the file system software stack and is common among anti-malware agents. These drivers differ from device drivers, such as GPU drivers, which are designed for a specific piece of hardware. CrowdStrike’s service for Windows machines loads four driver modules, but one specific file is being blamed for the crash.

“We can see the control channel file version 291 specified in the CrowdStrike analysis is also present in the crash indicating the file was read,” Microsoft notes, confirming CrowdStrike’s assertion last week that an issue with that 291 channel file caused the IT meltdown.

Microsoft previously estimated that 8.5 million Windows computers were disabled by the CrowdStrike glitch. In its Saturday post, Microsoft shared that it received about 4 million crash reports on July 19 (not all users opt in to crash reporting).

While Microsoft may be looking to further restrict access to the Windows kernel going forward, the tech giant also explained why it let third parties access it in the first place. The kernel is a deep layer of the Windows operating system. Kernel-level cybersecurity software lets developers do more to protect machines, can perform better, and can be harder for threat actors to alter or disable. When a kernel-level cybersecurity solution loads at the earliest possible time, it gives users the most data and context possible when threats arise.

In the world of competitive video games, for example, kernel-level anticheat systems are sometimes used to stop cheaters who run programs to add an aimbot or alter the physics of their games. But kernel-level anticheat solutions don’t always work, and their wide-ranging permissions are a point of contention among some gamers.

Microsoft acknowledges that the tradeoff with kernel-level cybersecurity products is that if one glitches, it can’t be easily fixed. “All code operating at kernel level requires extensive validation because it cannot fail and restart like a normal user application,” the company says.

“There is a tradeoff that security vendors must rationalize when it comes to kernel drivers,” Microsoft shared. “Since kernel drivers run at the most trusted level of Windows, where containment and recovery capabilities are by nature constrained, security vendors must carefully balance needs like visibility and tamper resistance with the risk of operating within kernel mode.”
