What Happened?
Summary of the Incident
On July 19, 2024, at 04:09 UTC, CrowdStrike released a routine sensor configuration update for Windows systems as part of its ongoing operations to enhance protection mechanisms. The update triggered a logic error that caused system crashes and blue screen of death (BSOD) errors on impacted Windows systems. CrowdStrike identified and remediated the problem by 05:27 UTC the same day, 78 minutes after the release.
Key Details
Date and Time: The issue occurred on July 19, 2024, starting at 04:09 UTC.
Resolution: The problem was resolved by July 19, 2024, 05:27 UTC.
Nature of the Issue: The issue was caused by a logic error in the sensor configuration update, not by a cyberattack.
Based on information provided by CrowdStrike, there was no disruption to security protections during this time.
The total number of impacted assets globally is currently unconfirmed.
Sekuro has observed that approximately 20% of an enterprise’s Windows devices with CrowdStrike Falcon Sensors deployed were affected by the BSOD error.
The number of assets actually impacted is lower than the total number of Windows endpoints with the Falcon Sensor deployed. We suspect, but have not yet confirmed, that this is due to timing: updates are streamed over time from the Falcon Console to Falcon Sensors and are not deployed globally at the same moment.
Technical Information
Channel Files
The update involved changes to “Channel Files,” which are integral to the Falcon sensor’s behavioural protection mechanisms. These files are updated several times a day to counteract new cyber threats identified by CrowdStrike. This process has been a standard part of the Falcon sensor’s operation since its inception.
Channel Files Location: %WINDIR%\System32\drivers\CrowdStrike
Problematic File: C-00000291*.sys with a timestamp of 2024-07-19 0409 UTC.
Fixed File: C-00000291*.sys with a timestamp of 2024-07-19 0527 UTC or later.
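The two timestamps above can be turned into a simple check. The following is a minimal sketch in Python, not a CrowdStrike tool: the function name and the "unknown" category are our own assumptions, and the modification time is expected to be read from the file by whatever means suits your environment.

```python
from datetime import datetime, timezone

# Timestamps quoted in this advisory (UTC, minute resolution).
PROBLEMATIC_MINUTE = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
FIXED_FROM = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def classify_channel_file(modified: datetime) -> str:
    """Classify a C-00000291*.sys file by its modification time.

    `modified` must be timezone-aware. Anything that is neither the
    04:09 UTC version nor 05:27 UTC or later is reported as "unknown"
    (an assumption of this sketch, not a vendor-defined category).
    """
    if modified.tzinfo is None:
        raise ValueError("modified must be timezone-aware")
    # Normalise to UTC and drop seconds so the comparison is per minute.
    minute = modified.astimezone(timezone.utc).replace(second=0, microsecond=0)
    if minute >= FIXED_FROM:
        return "fixed"
    if minute == PROBLEMATIC_MINUTE:
        return "problematic"
    return "unknown"
```

Treating unmatched timestamps as "unknown" rather than "safe" is deliberate: it keeps such files visible for manual review.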
Misconceptions
There has been misinformation about the affected versions of the Falcon agent. The logic error impacted all agent versions 7.11 and above that received the update during the specified time window.
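When checking an inventory against the "7.11 and above" threshold, versions need to be compared numerically, not as text. The sketch below assumes that sensor versions are available as dotted numeric strings; the function names and the `received_update_in_window` flag are hypothetical and stand in for whatever your asset data provides.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string into integers, e.g. "7.11" -> (7, 11)."""
    return tuple(int(part) for part in version.split("."))


def in_affected_range(version: str) -> bool:
    """True if the major.minor version is 7.11 or above."""
    return parse_version(version)[:2] >= (7, 11)


def possibly_impacted(version: str, received_update_in_window: bool) -> bool:
    """Both conditions from the advisory must hold for a host to be a candidate."""
    return in_affected_range(version) and received_update_in_window


# A plain string comparison gets this wrong: "7.9" sorts after "7.11".
print("7.9" >= "7.11")            # True (misleading)
print(in_affected_range("7.9"))   # False
print(in_affected_range("7.11"))  # True
```

A version string containing non-numeric parts raises `ValueError` in this sketch, which is preferable to silently misclassifying a host.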
How to identify affected hosts
On 20 July 2024, CrowdStrike released updated dashboards designed to assist organisations in identifying Windows hosts potentially affected by the recent Falcon sensor content update issue. These dashboards, accessible via the Falcon Console under Next-Gen SIEM > Log Management > Dashboards, provide crucial visibility into the status of affected hosts, thereby facilitating their remediation. Users can locate the relevant dashboard by searching for “Hosts_possibly_impacted_by_windows_crashes”.
The dashboards present detailed information on sensor statuses, channel files, and impacted Company IDs (CIDs). Users can filter the data by CID, Computer Name, and Status to focus on specific hosts. This capability enables organisations to prioritise their remediation efforts effectively. Additionally, the dashboards highlight hosts requiring immediate attention, such as those stuck in a reboot loop or unable to connect to the CrowdStrike cloud to receive the updated channel file. This targeted approach ensures that critical systems are addressed promptly, thereby minimising disruption to business operations.
Operational Guidance
Host prioritisation
In responding to the recent CrowdStrike Falcon sensor update issue, it is crucial to prioritise the remediation of critical systems to ensure continuity of essential business operations. Focus first on high-priority systems, such as those supporting vital business functions, customer-facing applications, and security infrastructure. Leveraging your organisation’s disaster recovery plan to categorise hosts and services based on their importance will enable a strategic and efficient response. By addressing the most critical systems first, you can maintain operational stability and minimise disruption.
Effective communication
Clear and positive communication with senior leadership is vital during incidents like this. Provide a concise update outlining the nature of the incident, its impact on business operations, and the proactive steps being taken to resolve it. Highlight the measures in place to mitigate risks and ensure a swift recovery. Keeping senior leadership informed with timely and relevant information ensures they are reassured and confident in the organisation’s response strategy.
Coordinated remediation efforts
Engage all relevant teams in a coordinated effort to address the issue efficiently. Assign specific roles and responsibilities to team members to ensure a focused and organised approach. Regular check-ins and updates within the team will help to maintain momentum and ensure that any emerging issues are promptly addressed. This collaborative effort will contribute to a faster resolution and reinforce the organisation’s resilience.
Managing IT staff fatigue
Given the intense workload and pressure on IT teams during such incidents, it is important to manage staff fatigue effectively. Rotate staff in and out of shifts to ensure everyone gets adequate rest and breaks. Encourage team members to take regular short breaks to maintain their focus and productivity. Providing support and understanding from management can help maintain morale and prevent burnout. Ensuring the well-being of your IT staff is crucial for sustaining a high level of performance throughout the remediation process.
Security awareness
Malicious actors may seek to exploit this situation through phishing campaigns or other malicious means. It is essential to remain vigilant and obtain information surrounding this incident only from CrowdStrike or your trusted CrowdStrike partner. Never provide administrative credentials, usernames, or any other sensitive information to unverified sources. Avoid installing patches supplied by third parties; product updates will always be provided centrally by CrowdStrike. Do not execute any scripts claiming to “fix” the problem unless they are provided by CrowdStrike or come via a genuine link to a trusted vendor’s guidance or script.
- Phishing and e-crime campaigns: https://www.crowdstrike.com/blog/likely-ecrime-actor-capitalizing-on-falcon-sensor-issues/
- Sekuro has published specific guidance for Proactive Measures you can take to strengthen your cyber resilience.
Remediation Steps
Following the vendor documentation linked below will ensure that remediation efforts align with vendor and platform best practices. It is also important to follow your organisational change management processes throughout the remediation process.
On-premises Windows hosts
For on-premises Windows laptops, workstations, and servers, the following links provide detailed remediation steps:
- Windows (manual): https://support.microsoft.com/en-us/topic/kb5042421-crowdstrike-issue-impacting-windows-endpoints-causing-an-0x50-or-0x7e-error-message-on-a-blue-screen-b1c700e0-7317-4e95-aeee-5d67dd35b92f
- Windows (automated via bootable USB): https://www.crowdstrike.com/wp-content/uploads/2024/07/Using-the-Microsoft-Recovery-Tool-for-Automated-Host-Remediation.pdf
Cloud-based Windows hosts
For cloud-based Windows hosts, the major cloud providers have released guidance based on their specific platform. Follow the detailed steps provided in the links below:
- Microsoft Azure: https://azure.status.microsoft/en-gb/status
- Amazon Web Services (AWS): https://repost.aws/en/knowledge-center/ec2-instance-crowdstrike-agent
- Google Cloud Platform (GCP): https://www.crowdstrike.com/wp-content/uploads/2024/07/Automated-Recovery-from-Blue-Screen-on-Windows-Instances-in-GCP.pdf
Public cloud/virtual environments
If you have access to public cloud and virtual environments, follow the detailed steps provided by CrowdStrike below:
Option 1:
- Detach the operating system disk volume from the impacted virtual server
- Create a snapshot or backup of the disk volume before proceeding further as a precaution against unintended changes
- Attach/mount the volume to a new virtual server
- Navigate to the %WINDIR%\System32\drivers\CrowdStrike directory
- Locate the files matching “C-00000291*.sys” (note: the affected file will have a timestamp of 2024-07-19 0409 UTC), and delete them
- Detach the volume from the new virtual server
- Reattach the fixed volume to the impacted virtual server
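The file-handling steps in Option 1 can be sketched as follows. This is an illustrative Python example, not a script supplied by CrowdStrike: the function names and the dry-run behaviour are our own assumptions, and the path of the Windows directory on the mounted volume must be supplied by the operator. Validate it in a test environment and follow your change management process before using anything like it.

```python
from pathlib import Path

# File name pattern quoted in this advisory.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def find_channel_files(windows_dir: Path) -> list[Path]:
    """List matching channel files under <windows_dir>/System32/drivers/CrowdStrike."""
    directory = windows_dir / "System32" / "drivers" / "CrowdStrike"
    if not directory.is_dir():
        return []
    return sorted(directory.glob(CHANNEL_FILE_PATTERN))


def remove_channel_files(windows_dir: Path, dry_run: bool = True) -> list[Path]:
    """Return the matching files, deleting them only when dry_run is False."""
    matches = find_channel_files(windows_dir)
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches
```

For example, if the impacted volume were mounted as drive E: (a hypothetical drive letter), `remove_channel_files(Path(r"E:\Windows"))` would list the matching files without deleting anything, and passing `dry_run=False` would delete them. Defaulting to a dry run complements the snapshot taken in the earlier step.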
Option 2:
- Roll back to a snapshot taken before 2024-07-19 0409 UTC (note: it is important to consider change management processes during this procedure and to discuss potential impacts, e.g. data loss, with the appropriate system owners and/or SMEs)
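Choosing a snapshot for Option 2 comes down to picking the most recent one taken before the cutoff. The sketch below assumes you have already gathered snapshot names and their timezone-aware creation times from your platform; the function name and the input shape are hypothetical, and no cloud provider API is used.

```python
from datetime import datetime, timezone
from typing import Optional

# Release time of the problematic update, as stated in this advisory.
CUTOFF = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)


def latest_safe_snapshot(snapshots: dict[str, datetime]) -> Optional[str]:
    """Return the name of the newest snapshot taken strictly before CUTOFF.

    `snapshots` maps a snapshot name to its timezone-aware creation time.
    Returns None when no snapshot predates the cutoff.
    """
    candidates = {name: taken for name, taken in snapshots.items() if taken < CUTOFF}
    if not candidates:
        return None
    # The newest qualifying snapshot minimises the data lost by rolling back.
    return max(candidates, key=candidates.get)
```

The gap between the chosen snapshot and the cutoff is the window of potential data loss to raise with system owners.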
More Information
- Official CrowdStrike Thread: https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/
- Technical details: https://www.crowdstrike.com/blog/falcon-update-for-windows-hosts-technical-details/
- Identifying hosts via Falcon Console: https://www.crowdstrike.com/wp-content/uploads/2024/07/How-To-Identify-Hosts-Possibly-Impacted-By-Windows-Crashes-2.0.1.pdf
- Recovering BitLocker keys: https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/
Further information will be published as the situation develops.