Inside a cloud outage
Nov 07, 2025 • 31 min • Ep. 329
Episode description
The end of October was punctuated by a series of major cloud outages, first at AWS and then at Microsoft, which took a wide range of websites and business applications offline.
In the previous episode, we looked at these outages reactively, covering the customers immediately affected and the likely causes.
But it's also important to break the problem down at a strategic and technical level. Just how do outages at this scale occur – and what’s it like as an insider, fighting to bring services back online?
In this episode, Rory speaks to James Kretchmar, SVP & CTO of the cloud technology division at Akamai Technologies, to get an insider’s perspective on cloud outages and how businesses can navigate these incidents.
Read more:
- Amazon Web Services outage live: Hundreds of apps including Slack, mobile carriers, banking services down
- The AWS outage brought much of the web to its knees: Here's how it happened, who it affected, and how much it might cost
- The Microsoft Azure outage explained: What happened, who was impacted, and what can we learn from it?
- Australia internet banking outage blamed on DDoS mitigation service
- Why the CrowdStrike outage was a wakeup call for developer teams
