Global IT outage — BSOD and CrowdStrike

Martina Lenić
Published in CyberDnevnik
3 min read · Jul 23, 2024

Last week we witnessed one of the largest IT outages to date, affecting the travel, banking, business, and health sectors worldwide. It showed up as the BSOD (Blue screen of death) on Windows machines.

The (in)famous Blue screen of death (also known as BSOD, fatal error or bug check) indicates that the system has reached a critical condition from which it cannot continue to operate normally and requires troubleshooting. Possible causes include hardware failures or the unexpected termination of a crucial process or thread.

Blue screen of death on Windows machine

The cause

System, program and application updates are a routine part of information security. They are continuously created, tested and pushed to endpoints. The massive IT outage was caused by one of them, CrowdStrike’s channel file update. The faulty update was pushed from the cloud overnight, causing Windows machines with the Falcon sensor installed to crash and show the BSOD. That marked the starting point of thousands of machines going down and critical systems crashing.

Policy updates are triggered from the centralized console and affect the sensor’s version and its prevention and detection capabilities. Channel file updates are different: they are part of Falcon’s behavioral protection mechanisms and influence the sensor’s logic. These channel configuration files are pushed to sensors frequently to keep pace with newly discovered TTPs (Tactics, Techniques and Procedures).

In this case, the channel file C-00000291*.sys contained faulty logic that crashed the systems. It affected Windows machines running the Falcon sensor that were online on Friday, July 19, 2024, between 04:09 UTC and 05:27 UTC. Systems that were offline during that window were not impacted, because CrowdStrike reverted the change by pulling the file from the cloud. Linux and macOS hosts were not impacted.
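The impact criteria above (Windows host, online during the window) can be sketched as a simple inventory filter. This is a minimal illustration, not a CrowdStrike tool: the function name `possibly_impacted`, the host names and the inventory layout are all hypothetical, and a real check would also need to confirm that the Falcon sensor is installed on the host.

```python
from datetime import datetime, timezone

# Impact window quoted in the article (UTC).
WINDOW_START = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def possibly_impacted(platform: str, online_from: datetime, online_until: datetime) -> bool:
    """Return True if a Windows host was online at any point inside the window."""
    if platform.lower() != "windows":
        return False  # Linux and macOS hosts were not impacted
    # Two time intervals overlap when each one starts before the other ends.
    return online_from <= WINDOW_END and online_until >= WINDOW_START


def utc(hour: int, minute: int) -> datetime:
    """Helper: a UTC timestamp on July 19, 2024."""
    return datetime(2024, 7, 19, hour, minute, tzinfo=timezone.utc)


# Hypothetical inventory rows: (hostname, platform, online_from, online_until)
hosts = [
    ("pos-01", "Windows", utc(3, 0), utc(4, 30)),  # online inside the window
    ("web-01", "Linux", utc(3, 0), utc(6, 0)),     # wrong platform
    ("lap-07", "Windows", utc(6, 0), utc(9, 0)),   # came online after the revert
]

flagged = [name for name, platform, start, end in hosts
           if possibly_impacted(platform, start, end)]
print(flagged)  # ['pos-01']
```

A filter like this only narrows down candidates; the actual symptom (a host stuck on the BSOD) remains the definitive sign.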

The fix

Identifying impacted hosts

  • Windows hosts showing BSOD
  • New granular status dashboard in the CrowdStrike console

Remediation

CrowdStrike provided a workaround: boot the machine into Safe Mode or the Windows Recovery Environment, delete the file matching C-00000291*.sys from C:\Windows\System32\drivers\CrowdStrike, and then boot normally.
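To make the file-matching step concrete, here is a minimal sketch that finds files matching the pattern in the directory named above. It is an illustration only: the function names and the `dry_run` flag are assumptions of this sketch, and in practice the workaround was carried out manually or with vendor-provided recovery tooling, since a scripting runtime is usually not available on a machine stuck in a crash loop.

```python
from pathlib import Path

# Directory and file pattern as quoted in the article.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def find_channel_files(directory: Path = DRIVER_DIR, pattern: str = PATTERN) -> list[Path]:
    """Return the files in `directory` whose names match `pattern`."""
    if not directory.is_dir():
        return []  # sensor not installed here, or the path is different
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def remove_channel_files(directory: Path = DRIVER_DIR, dry_run: bool = True) -> list[Path]:
    """List matching files; delete them only when dry_run is False."""
    matches = find_channel_files(directory)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Default is a dry run: report what would be removed, touch nothing.
    for path in remove_channel_files(dry_run=True):
        print(f"would remove: {path}")
```

Defaulting to a dry run is a deliberate choice here: deleting files from a driver directory is destructive, so the sketch reports matches first and only deletes when asked explicitly.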

  • For more details on Microsoft’s recommendations on impacted endpoints see here.
  • For more details on Microsoft’s recommendations for impacted servers see here.
  • For more details on AWS’s recommendations for impacted resources see here.
  • For more details on Azure status see here.

Although it was one of the most chaotic Fridays (and weekends) in the history of IT, most companies managed to recover their critical systems during the day. Full resolution is expected to take weeks, possibly months. Many questions have arisen, though, and many are demanding answers about the process itself, change management, proper testing, and the measures the company will take for future changes and updates. The catastrophic event has shown how fragile and vulnerable IT systems ultimately are, and many are wondering how to prevent anything similar from happening again.

CrowdStrike

CrowdStrike offers one of the leading EDR/MDR/XDR solutions on the market. It has thousands of clients and mainly targets organizations. Its lightweight agent provides detection, prevention, and remediation capabilities while constantly monitoring and gathering data from endpoints for further analysis in the cloud. The platform includes Endpoint Security, Cloud Security (CNAPP), Threat Intelligence and Hunting, Next-Gen SIEM, Workflow Automation, Exposure Management and Identity Protection.

More information on Windows crashes related to Falcon sensor:
