Last Friday morning, the world was hit by one of the most significant IT blackouts in history. Thousands of Windows machines failed to boot up or reboot, disrupting banks, airlines, TV broadcasters, healthcare companies, major retailers, and numerous other businesses globally. The cause of the outage is known and efforts to restore normalcy are underway, but critical questions remain unanswered, and the widespread chaos underscored the world's growing dependence on a handful of technology companies. Here’s an in-depth look at what happened and its implications for our technology-reliant society.
When something as massive as what began on Friday, July 19, occurs, wild theories are bound to spread like wildfire. Was it a cyberattack? A glitch in Microsoft’s Windows operating system? Or perhaps artificial intelligence staging a coup? The internet, with its boundless imagination and flair for the dramatic, went into overdrive. Social media buzzed with every possible explanation, ranging from the plausible to the downright absurd. As always, the online world did not disappoint in its creativity.
As far as we can tell at the time of writing, this was neither a cyberattack nor a malicious act. Microsoft, for once, can rest easy as they aren’t to blame, and humanity is still safe from an AI takeover (for now). The culprit was much simpler and far less sinister: a faulty software update from cybersecurity firm CrowdStrike. This update caused thousands of computers running its Falcon software to experience the infamous Blue Screen of Death (BSoD).
Blue Screen of Death (BSoD)
The BSoD, familiar to anyone who’s spent more than a few minutes on a Windows machine, is what happens when a serious problem forces Windows to shut down or restart unexpectedly. The alarm bells started ringing in Australia and quickly escalated as businesses in Europe began their workday. From there, it was worldwide chaos, with computers dropping like flies.
But what exactly is CrowdStrike, and why is it at the core of so many computers across various industries worldwide?
CrowdStrike, a Texas-based cybersecurity firm founded in 2011, develops and sells software designed to detect and block cyber-attacks and intrusions. According to the company's website, CrowdStrike serves around 30,000 customers worldwide, including over 500 Fortune 1000 companies. As of last Thursday evening, CrowdStrike was valued at over $83 billion, though its stock value dropped by around 11% on Friday. This extensive customer base is precisely why the faulty software update caused such significant and widespread chaos when things went sideways.
CrowdStrike is known as an "endpoint security" firm because it uses cloud technology to provide cyber protection for internet-connected computers. At the heart of last week's debacle was the company's flagship platform, Falcon, a cloud-based solution offering antivirus capabilities, endpoint protection, threat detection, and real-time monitoring of company systems and networks.
For software like CrowdStrike's Falcon to function effectively, it requires deep access to a computer's operating system. Think of it as needing full access to every corner of your home to monitor it against intruders, fires, or other issues. This level of access ensures comprehensive protection but also creates a potential vulnerability, or single point of failure, if something goes wrong.
In the case of your home, it could be like a security guard falling asleep with thieves at the door. For CrowdStrike, it was a faulty update that installed problematic software into the core of Windows operating systems running Falcon, causing systems to get stuck in a boot loop, where the system continuously restarts itself.
The need for software vendors to update their products frequently is undisputed. Software must be kept current with security patches to address newly discovered vulnerabilities, and new functionalities must be added to remain competitive and adapt to the ever-evolving IT landscape.
Anyone who owns a smartphone, tablet, or computer is familiar with the never-ending dance of updates. As a matter of fact, if you were to refresh your app store right now, you'd likely find a few apps needing updates. This is a good moment to remind everyone that you should always apply updates to your devices and applications and that, despite the rare fluke with CrowdStrike, updates are essential, especially for security. So don't be deterred by this incident and always embrace updates: they are your devices', and ultimately your, best defense.
Typically, before rolling out an update, a company's security team should follow three crucial steps: rigorously checking the software for bugs, testing the update on various machines running different versions of the operating system, and gradually releasing the update to small groups of users to identify any potential issues. CrowdStrike clearly failed in this process, leading to significant disruptions. There are still unanswered questions about how this lapse occurred and whether it was an accident or an intentional act of sabotage.
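To make the third step more concrete, here is a minimal Python sketch of a gradual release. It is an illustration only and not CrowdStrike's actual mechanism: the stage percentages, the crash-rate threshold, and the function names (rollout_bucket, should_receive, next_stage) are all assumptions made up for this example.

```python
import hashlib

# Hypothetical rollout stages: the share of machines eligible at each stage.
ROLLOUT_STAGES = [0.01, 0.10, 0.50, 1.00]


def rollout_bucket(machine_id: str, update_id: str) -> float:
    """Map a machine to a stable value in [0, 1) for a given update."""
    digest = hashlib.sha256(f"{update_id}:{machine_id}".encode("utf-8")).digest()
    # Four bytes give 2**32 evenly spaced buckets, all strictly below 1.0.
    return int.from_bytes(digest[:4], "big") / 2**32


def should_receive(machine_id: str, update_id: str, stage: int) -> bool:
    """True if this machine is inside the group targeted by the given stage."""
    return rollout_bucket(machine_id, update_id) < ROLLOUT_STAGES[stage]


def next_stage(stage: int, crash_reports: int, machines_updated: int,
               max_crash_rate: float = 0.001) -> int:
    """Return the next stage index, or -1 to halt the rollout.

    The threshold is an illustrative assumption, not an industry standard.
    """
    if machines_updated == 0:
        return stage  # nothing observed yet, so stay at the current stage
    if crash_reports / machines_updated > max_crash_rate:
        return -1  # too many crashes: stop before more machines are hit
    return min(stage + 1, len(ROLLOUT_STAGES) - 1)
```

Because the bucket is derived from a hash, a machine that is included in an early stage stays included in every later one, and a spike in crash reports from the first small group stops the update before it reaches everyone else.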
Although CrowdStrike has issued a fix for the faulty update, getting systems back online won't be a simple task. Experts predict it could take days to weeks to fully resolve, as IT administrators might need physical access to devices to repair them. For the technically inclined, the workaround involves booting affected Windows machines into safe mode, navigating to the CrowdStrike directory, deleting a specific system file, and restarting the computer. The recovery speed will also depend on the size of the affected company and its available resources.
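For readers who want to see what the file-deletion step amounts to, here is a small Python sketch. It is not an official tool, and a machine stuck in safe mode will usually not have Python available, so treat it purely as an illustration of the logic. The directory and file pattern below are assumptions taken from widely reported remediation guidance, not from this article, and should be checked against the vendor's current instructions before anything is deleted.

```python
from pathlib import Path

# Assumed values from publicly reported guidance; verify before use.
ASSUMED_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
ASSUMED_PATTERN = "C-00000291*.sys"


def find_matching_files(directory: Path, pattern: str) -> list[Path]:
    """Return the files in `directory` whose names match `pattern`."""
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def remove_files(directory: Path, pattern: str, dry_run: bool = True) -> list[Path]:
    """List matching files, and delete them only when dry_run is False."""
    matches = find_matching_files(directory, pattern)
    for path in matches:
        if dry_run:
            print(f"Would delete {path}")
        else:
            path.unlink()
            print(f"Deleted {path}")
    return matches
```

Calling remove_files(ASSUMED_DIR, ASSUMED_PATTERN) only reports what it would remove. The dry-run default is a deliberate choice: deleting the wrong file from a drivers directory can create a new problem on top of the old one.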
Questions still linger about how this bad update slipped through the cracks. CrowdStrike's statement on Wednesday, July 24, pointed to an issue in its validation system, but theories of sabotage persist, veering us back into the realm of cyber-attacks. While we may never know the full truth, companies must stay vigilant against opportunistic threat actors exploiting the chaos. Already, reports have surfaced of fake email campaigns where malicious attackers impersonate CrowdStrike, claiming to send fixes or tools to resolve the situation. Instead, they're distributing Trojan horses, malicious software masquerading as legitimate tools, aiming to infect systems with ransomware or remote access tools.
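One simple first check against this kind of impersonation is to compare the sender's domain with the vendor's real one. The Python sketch below assumes crowdstrike.com is the only domain to trust, and the function names are hypothetical. Keep in mind that a From header can be forged, so passing this check proves nothing on its own; failing it, however, is a strong warning sign.

```python
from email.utils import parseaddr

# Assumption for this example: the only domain treated as legitimate.
TRUSTED_DOMAINS = {"crowdstrike.com"}


def sender_domain(from_header: str) -> str:
    """Extract the lowercase domain from an email From header."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def is_trusted_sender(from_header: str) -> bool:
    """True only for a trusted domain or one of its subdomains."""
    domain = sender_domain(from_header)
    return any(domain == d or domain.endswith("." + d) for d in TRUSTED_DOMAINS)
```

A lookalike domain that merely contains the company name, such as the made-up crowdstrike-hotfix.example, fails the check because the comparison is against the whole domain rather than a substring.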
Phishing email sent by threat actors
Despite the relatively low percentage of affected systems and CrowdStrike's swift response, fully assessing the damage from this incident will take time. The impact was substantial. Computer crashes led to thousands of delayed or canceled flights, brought hospitals and media organizations to a standstill, and disrupted railways, financial companies, and even emergency services. The Texas-based company announced on Wednesday it has improved its testing processes and enhanced the error-handling mechanisms in its Content Interpreter. Additionally, CrowdStrike plans to implement a staggered deployment strategy for Rapid Response Content.
On a humorous yet ironic note, many less technologically advanced or third-world countries were spared from this incident. For once, their lack of concern about cybersecurity paid off. The simple equation of "no need for cybersecurity = no cybersecurity software" saved them from the chaos. However, it’s crucial to emphasize that information security experts consistently preach the importance of taking cybersecurity seriously. While this incident was a vendor failure and not a cyber-attack, cybersecurity is indeed essential and will only grow more critical with time.
Finally, this incident should prompt reflection on the world's increasing reliance on a shrinking number of software vendors, making a single-point-of-failure scenario more perilous. Perhaps a thought for my next piece.