The CrowdStrike Outage & Your Headlines for July 24, 2024

Jul 24, 2024 | 25 min | Season 1 | Ep. 109

Episode description

These are your cybersecurity headlines for July 24, 2024. Joining me today is Stuart Bayliss, Director of Product Management for Skyhigh Security, to discuss the massive CrowdStrike outage that sparked hours-long system failures across the globe, in what some reports are referring to as the largest IT outage in history.

Transcript

It's July twenty-fourth. I'm your host, Scott Schlee, and these are your cybersecurity headlines. AT&T paid a ransom of 5.72 bitcoin, around 370,000 US dollars, to a hacker who stole a significant amount of data, including metadata from nearly all call logs and texts made by about 109 million customers. The funds were later traced to various wallets and a gambling service.

The hacktivist group NullBulge leaked a terabyte of Disney's internal Slack channel data in protest of Disney's alleged anti-artist stance. The group breached Disney's network through a compromised video game add-on. A critical remote code execution flaw in GeoServer's GeoTools plugin was actively exploited. GeoServer released patches and workarounds, while CISA mandated federal agencies to update their servers by August 5, 2024.

The third EU-Ukraine Cyber Dialogue emphasized cooperation on cybersecurity, with Ukraine aligning its legislation with the EU's NIS2 Directive and planning to enhance intel sharing and crisis management with European institutions. A threat actor released the email addresses of over 15 million Trello users, collected via an unsecured REST API.

The data was sold on a hacking forum for a nominal fee, posing significant privacy concerns for the affected users. And in the biggest IT news story in recent years, businesses and organizations around the world, and by extension their customers, had quite an awakening on July 19, 2024, when a massive IT outage sparked hours-long system failures across the globe, in what some reports are referring to as the largest IT outage in history.

The incident was triggered by a faulty software update from CrowdStrike, which affected systems running Microsoft Windows. This outage brought airlines to a standstill, halted financial transactions, and even impacted critical services like emergency response and hospital operations. While CrowdStrike and Microsoft have since addressed the issue, this event has highlighted the vulnerabilities in our interconnected digital infrastructure and the critical

importance of robust cybersecurity measures. Joining me today is Stuart Bayliss, Director of Product Management for Skyhigh Security. Hello, Stuart. How are you doing today? Hey, Scott. I'm doing pretty good. Thanks. Well, thanks for joining us. As you've probably heard, there was a kerfuffle, if that's the correct word, last week, Thursday pouring into Friday: a global tech outage attributed to CrowdStrike's update.

And it had immediate effects on many sectors, including travel, banking, and health care, to name a few. As it was a global incident, there were many conflicting stories, including some rather wild conspiracy theories about what happened. As Director of Product Management for Skyhigh Security, your insights will go a long way toward helping us understand what actually happened. So to start, can you explain the timeline of the CrowdStrike outage and how it caused such a widespread impact?

Sure. Absolutely, a kerfuffle it was, indeed. The official time, because CrowdStrike have been very informative throughout this, so kudos to them for keeping folks updated, but it officially started on July the nineteenth, 2024, at 04:09 UTC. So that was about 05:00 in the morning for me here in the UK. And what happened is, basically, CrowdStrike released a sensor configuration update to their Windows systems.

So sensor configuration updates are part of the protection mechanisms of the Falcon platform. They distributed this update, and the configuration update triggered a logic error, resulting in a BSOD, the blue screen of death. That's lovely. Yeah. According to the updates that we have right now, they're claiming around about 8 million machines are affected, but that's kind of the news as it is right now.

So for me in the UK, Friday morning just seemed like any other Friday when I woke up, until I logged on to my machine, and straight away, I mean, we were getting updates from the BBC around things like flights being canceled. As you said, Scott, banking impact, financial services. We know that some folks in the UK weren't able to use

some of their financial services through banking. So it really was kind of reminiscent of, you know, the year 2000, the ILOVEYOU virus, and things like that. It really did start first thing in the morning and then kind of snowballed throughout the day into quite epic proportions, with the challenges that everybody was facing as part of this triggered update. Yeah. That is a type of nostalgia I don't think anybody really welcomes in their life.

No. But, I mean... Well, the thing is, and I've been speaking to some other folks about this, obviously, it's kind of a hot topic. And, you know, I've been in the cybersecurity business for 30-odd years now. And really, as you say, this is a sense of nostalgia that we don't like to think about. And these occurrences actually don't happen that often. You know, on this scale, I mean, this is definitely kind of, I would say,

you know, a top 10. So a top 10, ones that you could probably weave together over 30 years, isn't too bad. And it really isn't, if you think of it, putting it in that context. Yeah. It really says a lot that there is a handful that, yes, everybody remembers: Melissa, ILOVEYOU. Now it's going to... I don't know what they're gonna name

this one. The, you know, the incident, the outage. Yep. You know, whatever we decide on. But really, on this global scale, it is kind of a little bit of a phenomenal thing that we don't have this happen more often, and that says a lot about companies like, let's be honest, CrowdStrike and Skyhigh Security, that they really usually have everything on point where this doesn't happen. But it can. Indeed, it does, and, you know, again, from

my personal perspective, I've had the honor of being a product manager on Drive Encryption for what was McAfee, and for S, our current endpoint kind of connector piece. And both of those products actually touched the kernel as well. And that was the issue with the CrowdStrike update: it touched the kernel, and therefore you got the blue screen of death because it threw a system error. But companies like Skyhigh, CrowdStrike, and other vendors

within the space, they know it is a privilege to work at that level within the operating system of the customer's system. So a lot of diligence is done there. So, you know, it's not unprecedented, because we have seen issues before. But it is certainly a rare occurrence. I mean, it really is something that we don't expect to see every day. So, yeah, a bit of a rare phenomenon, but it just goes to show that one small mistake, one misconfiguration, can have widespread effects.

I mean, we're still kind of trying to reconcile from this as well, to see how much more damage has been done. I know personally I've spoken to a lot of our customers. Fortunately, Skyhigh were not affected, I've got to kind of point out, but a lot of the customers were. You know, we operate within an ecosystem. So a lot of these folks are partners, friends, you know, in the security industry. But... Yeah. Because a lot of people are trying to reconcile

and trying to get over this, we ourselves had maintenance booked this weekend, and we decided that we would reschedule it to allow people to recover. And obviously, you know, we're offering our support where we can, because it really is a situation that you don't wish upon anyone. So... Yeah.

So how exactly is it that a single update to a server can trickle down and affect, you know, somebody trying to book a flight or actually check in at the airport, or even getting their health records? You know, it goes to show that hierarchy, if you will, of one server having an incorrect update that will just trickle down and affect so many computers under it.

And is that exactly what happened, that the trickle effect is why a flight itself is gonna be fine, because it's not running on Windows or anything, but when you're trying to check in to your flight, if that airline uses Windows, they can't actually give you a boarding pass? That's pretty much it. I mean, obviously, the update came down in an automated way and was deployed. But, yes, as you've kind of

just said, I mean, it's not the case that it's grounding the aircraft because the aircraft can't fly. No planes are affected, as far as I'm aware. But it was the systems that you use to do booking. A lot of point-of-sale systems use Windows. ATMs use them, so people can't get money out. So anything that was basically using some of the affected Windows systems couldn't operate.

I know today, when I was checking in on the news here in the UK, a lot of the GP surgeries, so the local doctors, people are still struggling to get appointments with them: one, because they can't book an appointment online, but two, they can't necessarily access all the patient records. So it has a widespread effect. I mean, a lot of businesses use Windows. You know, they have that dependency on it. Windows is still pretty much the predominant operating system out there. So, yeah, if it affects your ability to book, manage,

process payments, things like that, that's why it had such a widespread effect. What immediate steps were taken by CrowdStrike to mitigate the impact of the outage? As I mentioned earlier on, they were very good at putting out updates on their website. So once they had realized and identified what the challenge was, they first went out and gave out some manual steps on how to remediate it. So that, you know, in itself, yes, they've

got stuff out there. But it was a case really of, you know, the local IT folks, in a lot of cases, getting their hands on machines. Since then, they have gone on further. I've been following it with quite a lot of interest. What they're doing is that they're now putting together an opt-in, so you can opt in to use a more automated kind of

remediation test. But, yeah, in the early hours and at the very start of this, it was a very manual process to start, you know, emergency triage on some of the most critical machines.

And when it actually is patched, if we'll just call it patching, this patch update that was having to be done manually, did that immediately resolve issues for those servers and anything computer-wise that was affected, Windows PCs that were affected, or did it take some time? Because, you know, it's not gonna trickle down like when it was affected initially, I guess. So did they actually have to go manually do all the computers on the bottom line and work their way up, or vice versa? Yeah. That's what I understand so far. I'm still sort of catching up on the...

on all of the details. But, yes, it was very much manual, hands-on. So there was, obviously, a bit of noise about that when it was first pushed down. You know, there was chatter on some of the social media platforms that a lot of the IT folks had to go into the office to get their hands on machines. A lot of people had to be calling remote workers. I mean, you know, let's not forget that COVID changed the world, right? We all sat at home for nearly two years.

And as part of that, hybrid working has become more popular within a lot of businesses. So, you know, you and I, Scott, we both work remotely. I mean, I don't go into an office every day. So, yeah, there were a lot of folks basically sat on the end of a phone, getting talked through these instructions. Some are more tech-savvy than others. So, yeah, it was quite labor-intensive to begin with. But the automation is coming there.

So what potential long-term implications of the outage are there for cybersecurity practices as a whole? I think it's a stark reminder. When we were talking earlier, you know, we were kind of reminiscing about the Melissa virus and ILOVEYOU. And I think what we all have to remember is, no matter how well we protect ourselves, put the layers in, things like that, there are still checks and balances that perhaps we should be taking more care of.

I do know, speaking to customers over the years, a lot of customers will not just deploy these patches. A lot of customers will pull these down into a sandbox, and they do canary deployments, so they do small groups of machines at a time, just to make sure that they don't collide with any of their policies or cause any of these issues. And this is the stark reminder, because again, you talk to a lot of the IT professionals out there, and a lot of it is about automation. You know, people

want to automate everything. They want to be able to run scripts. They want to be deploying software packages and things like that, because some of these patches are mission-critical sometimes. So, you know, there is pressure to get them out quickly. But it's a sobering reminder that, yes, the software vendors, such as ourselves and the others in the ecosystem, we also have that responsibility too.

So, you know, we have to ensure the integrity of the software that we're providing, and that it's not gonna have this adverse effect. So it has these ripple effects through the industry and the ecosystem. We should all be more aware about what we're deploying, installing, validating. So, yeah, it is a sobering reminder these days of, yeah, sometimes dependencies on some systems that we should perhaps look more deeply into from time to time.
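The canary deployments described above, where an update goes to small groups of machines before the rest, can be sketched roughly as follows. This is a minimal illustration and not any vendor's actual rollout mechanism: the host names, the ring sizes, and the `apply_update` and `is_healthy` callbacks are all hypothetical placeholders.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def split_into_rings(hosts: Sequence[str], ring_sizes: Iterable[int]) -> List[List[str]]:
    """Split hosts into successive rollout rings; leftover hosts form a final ring."""
    rings: List[List[str]] = []
    start = 0
    for size in ring_sizes:
        if start >= len(hosts):
            break
        rings.append(list(hosts[start:start + size]))
        start += size
    if start < len(hosts):
        rings.append(list(hosts[start:]))
    return rings


def staged_rollout(
    rings: List[List[str]],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> Tuple[List[str], bool]:
    """Update one ring at a time and stop after the first ring with an unhealthy host.

    Returns (updated_hosts, halted), where halted is True if the rollout stopped early.
    """
    updated: List[str] = []
    for ring in rings:
        for host in ring:
            apply_update(host)
            updated.append(host)
        # Check the whole ring before touching the next, larger ring.
        if not all(is_healthy(host) for host in ring):
            return updated, True
    return updated, False


if __name__ == "__main__":
    fleet = [f"pc-{i:02d}" for i in range(10)]  # hypothetical machine names
    rings = split_into_rings(fleet, ring_sizes=[1, 3])
    # Hypothetical health check: pretend the update breaks pc-02.
    updated, halted = staged_rollout(
        rings,
        apply_update=lambda host: None,
        is_healthy=lambda host: host != "pc-02",
    )
    print("rings:", rings)
    print("updated:", updated, "halted:", halted)
```

In this sketch a bad update reaches only the first two rings (four machines) and the remaining six are never touched, which is the trade-off mentioned in the conversation: a slower rollout in exchange for a smaller blast radius.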

So it's gonna be a lot of pure speculation on our part, but what kind of impact is the outage going to have on CrowdStrike's reputation and customer trust, or even peripherally on Skyhigh Security and how people might look at us and kind of pigeonhole us into that same category of the outage? Yeah. That's a great question. I mean, there are immediate effects, of course. I mean, you know, their stock price has dropped.

So, you know, stock and equity has been taken away from that. So, yeah, there's gonna be that financial impact right away. There is gonna be the trust piece as well. Whenever anything goes wrong, you have to be accountable to your customers. So there's gonna be a lot of questions, a lot of reassurance that CrowdStrike are gonna have to give to their customers as well.

And over time, they will recover. You know, we have seen these things happen in the past through various vendors' outages. So, as you said in the question, it's more about CrowdStrike being able to respond to the trust, being able to help their customers out of this situation, being there and reliable, and accountable as well. I mean, I know George Kurtz has put out a couple of statements already on social media around, you know, the challenges that he's aware his business has brought to his customers.

So there is a great level of accountability, transparency, visibility there. So kudos to them for that. But, yeah, I think, more generally, as you said, for vendors like us, it's a stark reminder to us as well that we have to be accountable. We have to do the right QA, and we have to have that trust with

our customers. And ultimately, if things do go wrong, then the ecosystem has to be able to respond, as do the vendors, to support their customers when they do have these kinds of challenges. Well, as you mentioned earlier, there's no way of us fully grasping the entire impact of the outage right now. But from your knowledge, during the time that this happened, were there any other notable security breaches or attacks that occurred?

Not that we're aware of, yeah. Obviously, as time goes on, and we can look more at the RCA, things like this... I mean, I know that there's always a discussion about, you know, when this occurred, were there any other breaches, were there vulnerabilities that were being planted, things like that. In the fullness of time, we will see. But today, it seems really that it is the incident that they...

they affected the Windows systems, and it's been nothing more than recovering these machines at the moment. But time will tell. So I guess the IT cybersecurity term of the day is going to be a tie between sandbox and QA. That being said, I think that's probably the primary lesson that everybody's gonna take away from this: to do very adequate and involved QA, and test it out on the sandbox.

That's probably one of the only ways to at least help somewhat prevent something like this happening again. Yeah. Indeed. I mean, it's doing the checks and the balances. You know, it's best practice, really, on Patch Tuesday, Update Tuesday, or when anything major comes along, really, to check and validate them. The thing is, again, working with customers over the years, I mean, customers have so many different kinds of configurations. I mean,

they... You know, these big customers aren't running vanilla Windows builds, right? When you're looking at point-of-sale machines, stuff like that, they're stripped-down Windows versions. So there's a lot of nuance. So generally, you should be testing to make sure that the way that you have your Windows system configured, or the way that you have your custom apps configured...

Whenever you have an update from any vendor, you should just, you know, take care to make sure that they've not introduced something that could adversely affect the way that you do your business. So it's not necessarily always an event like we've seen last Friday, where the kernel will break all of your machines, the kernel driver would do that. It's more a case that companies should be taking care that any major update doesn't affect their business, as a

routine. So again, it's a sobering reminder that we have a high dependence on software and on hardware, and on the vendors within the various ecosystems. So, you know, it's, yeah, time to look, reflect. I'm sure there'll be a lot of wringing of hands and soul-searching in various IT departments and things like that. But, you know, it can't be a bad thing. It can't be a bad thing to revisit what you're doing. Well, Stuart, again, thank you so much

for joining us today. I know it's late where you're at. I believe, again, I don't know how time zones work, so I think it's probably... Thursday or, you know, I know it's Monday morning here. I don't know, in England, it's sometime in August, I believe. So... It is. Yeah. Appreciate it. It's time. Yep. Yeah. I appreciate you talking with me today and kind of clearing the... the wheat from the chaff on some of this, because I know, like you and I spoke earlier before the

call. My mom saw that it was a hack and just immediately asked me, is this a hack? Or she was actually very certain it was a hack. I was like, no, Mom. So it's great to have someone like you that has a very intimate perspective on this kind of situation and what goes into doing these updates, like you said, at the kernel level. So I thank you so much for your time, and hopefully, next time you join us, it won't be to discuss another global

incident like this. Well, let's hope not. And, yeah, thank you very much for the opportunity to chat. It was great to have a catch-up too, so you take care. You too. Thanks again, sir. Just got life for now. You can listen to Skyhigh Security CloudCast on iHeartRadio, Apple, Spotify, Google, or wherever you listen to podcasts. Be sure to hit that subscribe button so you'll never miss an episode. And if you've been enjoying CloudCast, we'd love to hear

from you. Leave us a review on your favorite platform and let us know what you think. Your feedback helps us improve and reach more listeners just like you. For more information about the podcast or Skyhigh Security's cloud security solution, please visit skyhighsecurity.com.

Transcript source: Provided by creator in RSS feed