Dive into the recent Slack outage and disruptions at Microsoft 365, Grafana Cloud, and Otter.ai—plus, explore key takeaways for ITOps teams. ——— CHAPTERS: 00:00 Intro 00:48 Slack Outage 06:55 Microsoft 365 Outage 11:44 A Pair of Otter.ai Outages 14:21 Grafana Cloud Disruption 15:55 By the Numbers 17:58 Get in Touch ——— To learn more about how to deliver seamless digital experiences in a distributed IT landscape, read this eBook: https://www.thousandeyes.com/resources/guide-to-next-generation-ass...
Mar 08, 2025•19 min•Ep 109•Transcript available on Metacast Outages connected to configuration mishaps were a common theme last year, and we’ve continued to see incidents like these in 2025. Configuration changes triggered two consecutive Asana outages in early February, and configuration or update-related issues may also have contributed to recent disruptions at Barclays, ChatGPT, Jira, and Discord. Tune in to hear The Internet Report’s Mike Hicks unpack these incidents and discuss ways ITOps teams can guard against similar issues. ——— CHAPTERS: 00:00 I...
Feb 28, 2025•31 min•Ep 108•Transcript available on Metacast What does it take to deliver successful digital experiences at major events like concerts and conferences? With special guest Dominic Hampton—Managing Director at attend2IT—we’ll explore the dynamic world of event IT and key takeaways ITOps teams at enterprise companies can apply to their own events as well as in their day-to-day operations. We’ll also discuss insights from recent incidents that impacted Azure, Microsoft 365, and more. CHAPTERS 00:00 Intro 01:34 Behind the Scenes of Event IT: Le...
Jan 31, 2025•32 min•Ep 107•Transcript available on Metacast Configuration changes played an outsized role in 2024 outages. Tune in to hear more about this and other outage trends—and learn how ITOps teams should plan accordingly in the year ahead. We’ll also share insights from recent incidents at OpenAI and Google Cloud’s Pub/Sub, and dive deeper into a degradation incident that Netflix experienced at the end of 2024. Read on to learn more, or use the chapters below to jump to the sections that most interest you. CHAPTERS 00:00 Intro 00:58 Cloud Service Pr...
Jan 17, 2025•22 min•Ep 106•Transcript available on Metacast With nearly a year of data available, the topline outage trends for 2024 are coming into focus. Tune in to see what the numbers are showing. The Internet Report team will discuss how Internet service provider (ISP) outage numbers are continuing to increase, while cloud service provider (CSP) outages are also becoming more frequent, indicating a changing landscape in service reliability. They’ll also unpack the recent OpenAI and Meta outages. ——— CHAPTERS: 00:00 Intro 00:49 Outage Trends Across 2...
Dec 20, 2024•20 min•Ep 105•Transcript available on Metacast The past few weeks are somewhat of a representative sample of 2024 from an outage perspective, with connectivity issues and updates at the root of the four recent incidents. Both DigitalOcean and real-time payments provider Worldline experienced connectivity issues to data centers that made services unreachable. Meanwhile, Microsoft and Reddit encountered problems following changes to their systems that appeared to have unexpected user impacts and had to be rolled back. Tune in to hear The Inter...
Dec 14, 2024•16 min•Ep 104•Transcript available on Metacast Powerful things happen when ITOps teams move beyond a break-fix approach and lean into proactive optimization. Instead of just responding to issues as they occur, when teams have independent visibility into their end-to-end service delivery chain, they can proactively identify possible areas for optimization and improvement. For example, streamlining one small part of a complex process could shave seconds off the total transaction time; do this for every part of the process, and the efficiency s...
Nov 27, 2024•19 min•Ep 103•Transcript available on Metacast The Digital Operational Resilience Act (DORA) goes into effect on January 17, 2025, and financial institutions serving the EU will need to meet an enhanced set of requirements related to risk management, network resilience, and incident reporting. While DORA is directly applicable to EU financial institutions, it prompts important discussions about resilience and ensuring digital experiences that are relevant for all IT operations teams, regardless of industry or region. Tune in to the podcast t...
Nov 08, 2024•30 min•Ep 102•Transcript available on Metacast A recent Salesforce outage highlighted the limitations of status pages and the importance of considering a variety of data points when identifying the source of an outage. Tune in to hear The Internet Report team discuss what happened and why. They’ll also share insights from a recent Microsoft Outlook outage and cover the latest Internet outage trends. Listen now or use the chapters below to jump to the sections that most interest you. CHAPTERS 00:00 Intro 00:48 Salesforce Outage 10:00 Microsof...
Oct 25, 2024•18 min•Ep 101•Transcript available on Metacast A recent certificate problem impacted ServiceNow, and other issues prevented users from accessing key cloud services including Microsoft 365, Azure Virtual Desktop, and Workday. Tune in to hear what happened during these incidents and a separate data center fire that caused a Reliance Jio outage for customers across multiple areas of India. Listen now or use the chapters below to jump to the sections that most interest you. CHAPTERS 00:00 Intro 00:59 ServiceNow Outage 03:20 Microsoft 365 Outage ...
Oct 04, 2024•16 min•Ep 100•Transcript available on Metacast During high-traffic seasons like Black Friday or a much-anticipated product launch, maintaining good digital experiences for customers is vital. We’ve all heard tales of floods of eager shoppers crashing a website during a major sale—leaving them unable to make their coveted purchases. To guard against a breakdown like this during high-traffic periods, companies sometimes use various traffic management strategies such as digital waiting rooms. In this episode, The Internet Report team discusses ...
Sep 21, 2024•19 min•Ep 99•Transcript available on Metacast Let’s dive into the fascinating world of subsea cables. With special guest Murray Burling—Executive Director of Oceans and Environment at RPS—we’ll explore the current subsea cable ecosystem and chat about what the future might hold. Tune in for insights on how important subsea cables are for today’s digital experiences, how decisions are made on where to place them, the consequences of cable cuts, and route diversity and Internet resilience. CHAPTERS 00:00 Intro 02:29 Current Subsea Cable Ecosy...
Sep 06, 2024•23 min•Ep 98•Transcript available on Metacast Explore the recent Google Cloud and GitHub outages, plus get insights from a network perspective into the August 12 X livestream event featuring Elon Musk and Donald Trump. In the case of Google Cloud, a power issue in one of its European regions impacted connectivity and affected several services and networking equipment. The problems disrupted connectivity into the region as well as some Partner Interconnect connections and associated routes between other Google regions. Traffic to and from Gi...
Aug 23, 2024•17 min•Ep 97•Transcript available on Metacast This week, The Internet Report team and special guest Dave Anderson—a tech industry veteran and co-host of "A Very Melbourne Podcast," which covers the Australian Football League and more—are chatting about how to assure great digital experiences at major sporting events. Large sporting events are always logistically complex, and today that’s even more the case with digital technology permeating every part of operational and experience delivery. And due to the real-time nature of live sports, an...
Aug 10, 2024•33 min•Ep 96•Transcript available on Metacast On July 19, many organizations around the globe—including airlines, banks, and hospitals—experienced outages as Windows machines reportedly got stuck in a boot loop that ultimately resulted in the Blue Screen of Death (BSOD). These disruptions had a common source: an update from CrowdStrike, a managed detection and response (MDR) service used to protect Windows endpoints from attack. Tune in to hear The Internet Report team’s insights on this CrowdStrike update and the ensuing IT outages. We’ll ...
Jul 26, 2024•18 min•Ep 95•Transcript available on Metacast On May 17, X reached a major milestone when the social media platform completed its full migration from twitter.com to x.com. While the number and frequency of outages did increase after the company’s acquisition by Elon Musk, following the domain migration, there don’t appear to have been any significant disruptions to the X.com platform. In this week’s podcast, The Internet Report team discusses what they observed during (and after) the domain migration, and analyzes X’s performance pre- and p...
Jul 16, 2024•18 min•Ep 94•Transcript available on Metacast Three recent outages at Starlink, Charles Schwab, and the Internet Archive highlight key reminders for NetOps teams around backup options, the role of intelligence, and understanding your end-to-end service delivery chain. A subset of Starlink users were unable to establish a connection; some users of Schwab.com and its apps may have found themselves unable to transact or trade due to an authentication issue; and the Internet Archive and the Wayback Machine were intermittently overwhelmed by une...
Jun 21, 2024•17 min•Ep 93•Transcript available on Metacast Believe it or not, we’re already about halfway through 2024. Looking at the outage data from this year so far, we see continued evolution, following patterns observed over the past few years. Notably, the percentage of cloud service provider (CSP) outages is still increasing—though at a more accelerated rate than seen in recent years. Tune in to learn more about this trend and other themes we’re noticing in the Internet ecosystem, as well as tips for how IT teams can respond to these evolving ch...
Jun 14, 2024•22 min•Ep 92•Transcript available on Metacast When it comes to assuring great digital experiences for your users, intermittent issues can be incredibly difficult to discover and diagnose because the service is both working and not working simultaneously—or, it may simply be running slow. Some users may experience issues, while for others, everything will work just fine. In this week’s episode, The Internet Report team will explore the complexities that intermittent issues can bring by examining two recent incidents at Meta and Salesforce. T...
May 25, 2024•18 min•Ep 91•Transcript available on Metacast Explore what happened during recent outages at google.com, X (formerly Twitter), and CDN service jsDelivr. The Internet Report team will also discuss why a detailed understanding of every component in your service delivery chain is vital to maintain the availability and resiliency of your service. If even one component encounters challenges, the entire service can be impacted. In jsDelivr’s case, for example, the detail at issue was an expired cert, which created problems serving content and imp...
May 10, 2024•18 min•Ep 90•Transcript available on Metacast Go under the hood of a ChatGPT outage, H&R Block’s Tax Day disruption, and more incidents from the past few weeks. The Internet Report team will also discuss Microsoft’s update on recent subsea cable cuts and the latest global outage trends. ——— CHAPTERS: 00:00 Intro 00:57 ChatGPT Outage 03:35 Revisiting West Coast of Africa Cable Cuts 09:07 H&R Block Outage 11:32 Sky Mobile Outage 12:25 Outage on unpkg CDN 14:06 PlayHQ Outage 16:40 Outage Trends: By the Numbers 19:33 Get in Touch ——— Fo...
Apr 27, 2024•20 min•Ep 89•Transcript available on Metacast With tax season coming to a close in the United States, IT teams at tax preparation companies and other organizations in the industry will be taking extra care to make sure that their systems can handle a spike in traffic due to a potential last-minute rush of filings. Tune in to hear The Internet Report hosts discuss how IT teams can navigate major spikes in demand and give customers the best possible digital experience, whether it’s Tax Day, Black Friday, or another high-traffic period. They’l...
Apr 13, 2024•28 min•Ep 88•Transcript available on Metacast The end-to-end delivery of modern digital services can introduce a complex web of dependencies and failure points, which can stem from direct relationships as well as third-party providers, introducing layers of abstraction for operations teams to keep track of. Managing this complex ecosystem can be challenging. Unexpected issues may arise from seemingly insignificant components, surprising even the largest, most technologically sophisticated organizations. For example, in recent weeks, problem...
Mar 30, 2024•17 min•Ep 87•Transcript available on Metacast Over a two-day period this past week, major social media platforms—Meta’s Facebook and Instagram, LinkedIn, and Discord—all experienced disruptions. In the same timeframe, Comcast was also impacted by an outage that affected access to specific services and applications. Meta experienced issues with its log-in process, Discord navigated unexpectedly high load volumes, Comcast dealt with 100% packet loss in part of its backbone, and—the following day—LinkedIn worked its way through a backend issue...
Mar 16, 2024•18 min•Ep 86•Transcript available on Metacast Load is a fundamental but, at times, challenging variable for networks and operations teams to handle. In the past few weeks, ThousandEyes saw various load-related problems impact organizations including Google Cloud, Front, several Australian banks, and Minnesota State University Moorhead. Tune in to learn more about what happened during these incidents, as well as hear our commentary on the recent outage impacting AT&T. Use the timestamps below to jump to the sections that most interest yo...
Mar 04, 2024•16 min•Ep 85•Transcript available on Metacast When outages happen, it’s what you do next that matters. It’s important to have a backup plan in place that you can quickly activate to minimize the impact of an incident. Over the past two weeks, companies initiated a range of resiliency actions, including asking customers to use alternate authentication methods (or to avoid logging out of a service), setting up a new contact center to re-establish lines of communication, and reverting to manual processes. Tune in to learn more about what happe...
Feb 17, 2024•17 min•Ep 84•Transcript available on Metacast The ThousandEyes Internet Intelligence team joins us from Cisco Live in Amsterdam, talking about a major theme from the event—security. Tune in to hear their thoughts on how visibility can help companies in their security efforts, the sovereignty of data in flight, and why you don’t have to choose between security and performance. ——— CHAPTERS 00:00 Intro 01:09 Evolving Security Landscape 04:53 Security Excellence & Optimal Digital Experience 10:13 Sovereignty of Data in Flight 14:57 Key Tak...
Feb 10, 2024•17 min•Ep 83•Transcript available on Metacast What happened during the recent Microsoft Teams and Azure disruptions? Go under the hood of these incidents and also explore other recent disruptions in this week’s Pulse Update. CHAPTERS - 01:03 Network issue leads to Microsoft Teams service disruption - 04:09 Azure Resource Manager exhausts capacity, causing service issues - 06:20 Oracle Cloud experiences network outage - 09:56 Jira users encounter 503s and other errors - 10:30 Sage outage impacts South Africa - 11:08 Red Hat experiences four ...
Feb 03, 2024•17 min•Ep 82•Transcript available on Metacast What caused recent dips in performance for OpenAI’s ChatGPT? Tune in to hear The Internet Report team unpack this and other recent disruptions, including a hack that led to an outage at the Spanish branch of the Orange mobile network, and a blip for customers of the cloud services provider DigitalOcean. They’ll also cover the outage trends they’re seeing in 2024 so far and how extreme cold weather can cause problems for data centers. For more insights on outage trends and analysis of some of the...
Jan 20, 2024•24 min•Ep 81•Transcript available on Metacast As they launch into 2024, organizations are facing a different outage landscape than they had at the start of 2023. The past year saw increases in cloud service provider (CSP) outages, application outages, and the percentage of U.S.-centric outages—all of which point to an evolution in the way outages happen and the need for different strategies to minimize the impact of disruptions. In this episode, Mike Hicks (Principal Solutions Analyst at ThousandEyes) unpacks these trends and shares practic...
Jan 13, 2024•11 min•Ep 80•Transcript available on Metacast