Starting at ~12:12 UTC on Dec 12, 2022, an ISP in the Democratic Republic of Congo leaked a route belonging to the Quad9 DNS service, causing some traffic, including Verizon US customer traffic, to get routed to Africa for ~90 minutes. High traffic loss was observed throughout the incident, which was resolved at ~13:40 UTC. 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. Join our co-hosts Mike Hicks, Principal Solutions Anal...
Dec 14, 2022•27 min•Ep 49•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. In this episode, we unpack four notable outages that impacted WhatsApp, Zscaler, Salesforce, and Facebook, which all appear to have a common theme. Join our co-hosts Mike Hicks, Principal Solutions Analyst at ThousandEyes, and Chris Villemez, Technical Marketing Engineer at ThousandEyes, as they walk through each incident to understand what happened and discuss how network professionals can ...
Nov 04, 2022•29 min•Ep 48•Transcript available on Metacast We're back! 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this episode, our newest host, Chris Villemez, is joined by Kemal Sanjta to discuss a BGP-related incident that took down Twitter for many users around the globe on March 28th. 00:36 Under the Hood: Chris Villemez and Kemal Sanjta leverage their extensive operations experience managing the networks of large-scale SaaS, IoT, and cloud providers to analyze this in...
Apr 13, 2022•32 min•Ep 47•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, our newest host and Technical Marketing Engineer, Chris Villemez, is joined by Kemal Sanjta, Principal Engineer, to dive into the details of the recent AWS outages from December 7th, 10th and 15th. They’ll walk through what ThousandEyes saw from its fleet of vantage points, as well as share some insight into what enterprises can learn from these incidents to build resilie...
Dec 17, 2021•44 min•Ep 46•Transcript available on Metacast 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:15 Headlines: Today we’re going to do a thorough analysis of the major Facebook outage that took place yesterday, Monday, October 4. I’m joined by Gustavo Ramos, ThousandEyes’ in-house expert on Network Engineering. ThousandEyes Blog: https://www.thousandeyes.com/blog/facebook-outage-analysis Analysis from Facebook: https://engineering.fb.com/2021/10/05/networking-traffic...
Oct 06, 2021•27 min•Ep 45•Transcript available on Metacast 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:08 Headlines: Today, Mike Hicks (Principal Solutions Analyst, ThousandEyes) and I discuss a recent BGP routing incident that had intermittent impacts on Amazon’s services, including Amazon.com and AWS compute resources, during a five-hour period on July 12. 01:04 Under the Hood: When we look into BGP routing at the time, we can see multiple BGP path changes due to a service...
Aug 03, 2021•18 min•Ep 44•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined today by Mike Hicks, principal solutions analyst here at ThousandEyes, to cover the outage of Akamai’s DNS service. The outage, which occurred on July 22nd around 3:38 PM UTC (8:38AM PT), struck during the course of business hours in Europe and North America, resulting in widespread impacts to applications and services hosted within Akamai servers. The outage itself was short-live...
Jul 24, 2021•18 min•Ep 43•Transcript available on Metacast 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. 00:13 Headlines: Today, Kemal and I unpack an interesting BGP incident, in which a large-scale route leak briefly altered traffic patterns across the Internet. 00:58 Under the Hood: The incident began on Thursday, June 3rd at around 10:24 UTC, and resulted in a significant spike in packet loss that was noticeable in ThousandEyes tests. While this packet loss resolved within the...
Jul 02, 2021•21 min•Ep 42•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined by ThousandEyes’ BGP expert, Kemal Sanjta, to review the June 16th outage of Prolexic Routed, a DDoS Mitigation Service operated by Akamai. According to a statement from Akamai, the outage was not due to a DDoS attack or system update, but instead a routing table limitation that was inadvertently exceeded. In this episode, Kemal and I analyzed what happened and how customers of Ak...
Jun 25, 2021•26 min•Ep 41•Transcript available on Metacast 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. 00:12 Headlines: Today, I’m joined by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to unpack today’s major outage at Fastly, a popular CDN provider. 3:46 Under the Hood: The widespread outage occ...
Jun 09, 2021•40 min•Ep 40•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. I’m joined today by Mike Hicks, Principal Solutions Analyst at ThousandEyes, to cover two recent application-related outages. The first occurred on May 19th around 12:50 UTC at Coinbase, a well-known cryptocurrency exchange. Around the time that news broke saying that the Chinese government would be imposing strict regulation on cryptocurrencies, users attempting to execute transactions were u...
Jun 04, 2021•21 min•Ep 39•Transcript available on Metacast 00:00 Welcome 00:14 Headlines: DNS and BGP and DDoS Attacks—Oh, My! This week we cover a couple of recent service degradation incidents involving DNS providers 2:19 Under the Hood: Kemal Sanjta, ThousandEyes’ resident BGP expert, joins us to discuss the May 6th disruption to Neustar’s UltraDNS service, which lasted nearly four hours. We discuss the BGP routing changes we observed during the incident and what they can tell us about the cause of the disruption. We also cover a separate incident in...
May 21, 2021•32 min•Ep 38•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Today, we focused on an interesting outage that impacted Cloudflare Magic Transit, a relatively new offering from the CDN provider which aims to efficiently route and protect the network traffic of its customers. On May 3rd at approximately 3:00 PM PDT (10:00 PM UTC), ThousandEyes vantage points connecting to sites using Magic Transit began to detect significant packet loss at Cloudflare’s n...
May 07, 2021•10 min•Ep 37•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re joined this week by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to discuss Tuesday’s Microsoft Teams outage. On Tuesday, April 27th, ThousandEyes tests began to detect an outage affecting the Teams service starting around 3 AM (PT) and lasting approximately 1.5 hours. While the outage occurred in the overnight hours for much of the Americas, the global nature...
Apr 29, 2021•18 min•Ep 36•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we’re thrilled to be joined by Kemal Sanjta, ThousandEyes’ resident expert on BGP. This week, we’re going under the hood on the April 16th BGP leak at Vodafone India, which leaked more than 30,000 prefixes, causing a major disruption of Internet traffic to some services. While some news outlets reported that the incident lasted approximately 10 minutes (starting around 1:...
Apr 22, 2021•31 min•Ep 35•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re back from a short sabbatical to cover an interesting outage at Facebook in what appears to be an application outage compounded by a series of routing issues. On April 8th, for roughly 40 minutes, the Facebook application became unavailable for users around the globe who were attempting to connect to the service. Despite the short-lived nature of the outage, we observed prolonged perfor...
Apr 15, 2021•25 min•Ep 34•Transcript available on Metacast On today’s episode, we discuss the recent outage on Verizon’s network that had widespread impacts on users in the US. ThousandEyes Broadband Agents detected an outage starting around 11:30 AM EST that manifested as packet loss across multiple locations concentrated along the Verizon backbone on the US East Coast and in the Midwest. While the outage was resolved approximately an hour later, users connecting from the Verizon network across the US experienced varying degrees of impact, depending on the service...
Feb 03, 2021•9 min•Ep 33•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Despite a quiet last couple of weeks on the Internet, we started off our new year with quite the bang. As droves of mildly-caffeinated workers returned to their home offices on Monday after the holiday break, many were surprised to find that Slack was not available. On today’s episode, we go under the hood of Slack’s Monday outage to see what went wrong and how it was resolved. We’re also ex...
Jan 06, 2021•34 min•Ep 32•Transcript available on Metacast In this week's episode of #TheInternetReport... 00:00 Welcome 00:16 Headlines: About Monday’s Google Outage; Plus, Talking Holiday Internet Traffic Trends with Fastly 00:43 Under the Hood: This week, we go under the hood on a recent outage that took down the availability of several Google applications, including YouTube, Gmail and Google Calendar. Yesterday morning at approximately 6:50 AM EST, users around the world were unable to access several Google services for a span of around 40 minutes. ...
Dec 16, 2020•21 min•Ep 31•Transcript available on Metacast If you’re an AWS customer or rely on services that use AWS, you might have noticed the major, hours-long outage last week. On November 25th, at approximately 5:15 am PST, users of Kinesis, a real-time processor of streaming data, began to experience service interruptions. The issue was not network-related, and AWS later issued a detailed incident post-mortem analysis identifying an existing operating system configuration issue that was triggered by a maintenance event that involved adding server...
Dec 01, 2020•15 min•Ep 30•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. This week, we’re pleasantly surprised to say that the network did not break, and there were no major election-night outages to report. However, that’s not to say we didn’t catch performance glitches in the days and weeks around the big night. Watch this week’s episode, as we cover performance issues at a Secretary of State website as well as why CNN’s election map website was so slow to load...
Nov 10, 2020•18 min•Ep 29•Transcript available on Metacast We’ve got an election coming up here in the US, and over the last several weeks, we have been analyzing a dozen or so state election websites to take a closer look at how they’re hosted (e.g., do they use a CDN or are they self-hosted?) and to monitor them for outages. In this episode, we discuss the pros and cons of each hosting method and dive into some examples we’ve seen where election websites have had unexpected performance degradation. Catch this week’s episode to go under the hood on the...
Oct 27, 2020•15 min•Ep 28•Transcript available on Metacast In this week’s episode, we discuss two notable outages that happened last week. The first, at Twitter, took place on October 15 around 5:30 pm PST and impacted users’ ability to tweet or re-tweet. According to Twitter’s official statement, an internal system error was the culprit, putting to bed any theories of another hack. The second outage took place at the transit provider, Zayo, in the early morning hours of October 13. Although the outage seemed to mostly involve interfaces on the US west...
Oct 20, 2020•7 min•Ep 27•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. In this week’s episode, we dive into a recent outage at Slack that caused intermittent issues for its enterprise users (including ourselves) for nearly a full day. The cause, as noted by Slack, was on the backend and related to an overloaded database. Next, we dig into another outage at Microsoft. According to their statement, a bug in an internal update seems to have revoked the routes to a...
Oct 13, 2020•19 min•Ep 26•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we dive into a recent Azure AD disruption that significantly impacted access to Microsoft cloud services and apps (as well as third-party apps) for nearly three hours. We then went under the hood on a recent BGP hijacking in which Telstra began announcing routes to services that didn’t belong to it, such as Quad9. Catch this episode to hear our take on these incidents, an...
Oct 06, 2020•16 min•Ep 25•Transcript available on Metacast On today’s episode, Angelique and I cover a couple of outages that occurred over the past week. First, we discuss an application outage at Instagram that occurred on September 17th and lasted around 30 minutes. We also discuss a network outage on September 14th on the AWS backbone near Columbus, Ohio. This outage was a little more widespread, affecting nearly 100 interfaces and lasting around 30 minutes. Next, we dive into the upcoming bans on WeChat and TikTok, which have now been temporari...
Sep 21, 2020•13 min•Ep 24•Transcript available on Metacast It was another quiet week on the Internet, so we wanted to spend some time answering your questions around some recent outages. Catch this episode as we discuss how you can understand the upstream relationships of the services you rely on to assess your risk profile. We also cover why SLAs fall short in protecting your business in the event of an outage, and why you need to proactively collaborate with your providers to solve issues faster.
Sep 14, 2020•11 min•Ep 23•Transcript available on Metacast The Internet held up reasonably well over the past week, all things considered. There were no major outages to report, which is a welcome respite for those impacted by the major outages the week prior. While it’s not an outage that occurred this past week, we did want to spend some time covering the recent Verizon Edgecast outage that occurred on August 21st. Watch this episode as we dive into this application-level outage to understand exactly what happened and who might have been impacted....
Sep 08, 2020•44 min•Ep 22•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. It was a rough week on the Internet last week, with outages and incidents across multiple services and providers including Slack, Zoom, AWS, and Verizon. However, in today’s episode we’re going to focus exclusively on Sunday’s CenturyLink / Level 3 outage that, according to Cloudflare, caused a significant 3.5% drop in global Internet traffic, making it one of the most significant internet ou...
Sep 01, 2020•44 min•Ep 21•Transcript available on Metacast This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet, and why. On this week’s episode, Archana and I cover some recent outages that made headlines. This includes the Spotify outage, caused by an expired TLS certificate, that prevented users from accessing its platform. We also cover a widespread outage at Cogent during (what seems to be) a maintenance window. Then, we go “under the hood” on the prolonged outage at an IXP on August 18th to understa...
Aug 25, 2020•22 min•Ep 20•Transcript available on Metacast