Published: 2026-03-28 | Opinion | Protocol

CDNs Are a Single Point of Failure

CDN consolidation means a single outage takes down thousands of sites. We traded distributed resilience for centralized convenience.

CDNs were supposed to make the web more resilient.

In one narrow sense, they did. They absorb traffic, cache content close to users, smooth over origin outages, and protect sites that would otherwise fold under load. If your origin dies but the cache can still serve, the CDN looks like a hero.

Then the CDN itself has a bad day.

And suddenly the hero is the blast radius.

The Concentration Is Not Subtle

A handful of CDN providers sit in front of a massive percentage of web traffic. Cloudflare alone proxies traffic for millions of domains. Akamai, Fastly, and Amazon CloudFront handle most of the rest. The market consolidated aggressively because CDNs have enormous economies of scale — more edge nodes means better performance means more customers means more edge nodes.

The result: a significant fraction of the internet depends on three or four companies’ infrastructure staying up.

When They Go Down, Everyone Notices

June 2021. Fastly had a configuration bug triggered by a customer change. For about an hour, major sites were unreachable: Reddit, The New York Times, the UK government site, Amazon, Twitch, Stack Overflow. Not slow. Gone. The same edge that was supposed to protect them became the thing that took them down.

Cloudflare has had similar incidents. A BGP configuration change in June 2022 took 19 of its data centers, which carry a large share of its traffic, offline for a little over an hour. Cloudflare’s incident reports are admirably transparent, which also means we know roughly how wide the damage was: thousands of domains, hundreds of millions of users.

These aren’t hypothetical. They happen regularly. The blast radius of a single CDN outage now exceeds what most individual site outages could ever cause.

The Irony

CDNs add resilience against origin failures. Your server goes down, the CDN serves cached content. Your server gets DDoS’d, the CDN absorbs it. For the specific failure mode of “origin can’t handle the load,” CDNs are excellent.

But they introduce a different failure mode: “the CDN itself fails.” And because so many sites share the same CDN, this failure mode is correlated. When Fastly goes down, it’s not one site. It’s thousands of unrelated sites failing simultaneously for the same reason.

The old internet — before CDN consolidation — had uncorrelated failures. Your site going down didn’t affect mine. We had different servers, different networks, different DNS. The failure surface was distributed.

The new internet has correlated failures. My site and your site and ten thousand other sites all go down at the same time because we all chose the same CDN. We gained performance. We lost independence.
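The gap between the two models can be put in numbers. The sketch below assumes hypothetical figures (1,000 sites, each unavailable 0.01% of the time) and compares the chance that every site is down at the same moment when failures are independent versus when all sites sit behind one shared CDN. The function names are illustrative, and logarithms are used because the independent case is too small to represent as an ordinary float.

```python
import math


def expected_sites_down(n_sites: int, p_down: float) -> float:
    """Average number of sites down at any moment (same in both models)."""
    return n_sites * p_down


def log10_p_all_down_independent(n_sites: int, p_down: float) -> float:
    """log10 of P(all sites down at once) when each site fails on its own."""
    return n_sites * math.log10(p_down)


def log10_p_all_down_shared(p_down: float) -> float:
    """log10 of P(all sites down at once) when one shared CDN is the only failure."""
    return math.log10(p_down)


if __name__ == "__main__":
    n, p = 1000, 1e-4  # hypothetical numbers, not measurements
    print("expected sites down:", expected_sites_down(n, p))
    print("independent: 10^%.0f" % log10_p_all_down_independent(n, p))
    print("shared CDN:  10^%.0f" % log10_p_all_down_shared(p))
```

The average number of sites down is identical in both cases. What changes is the shape of the failure: independent sites essentially never fail together, while sites behind a shared CDN fail together every time the CDN does.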

The DNS Problem Makes It Worse

Many sites point their DNS directly at CDN infrastructure. Your domain’s A record (or CNAME) points to Cloudflare’s edge. If Cloudflare’s DNS goes down, your domain is unresolvable. If the edge IP stops responding, the name still resolves, but to an address that no longer answers. Either way: not slow. Gone.

With a traditional setup, if your origin dies, you can change the DNS to point somewhere else. With a CDN-as-DNS setup, the DNS itself is the CDN. There’s no fallback. The CDN is the name resolution and the content delivery and the DDoS protection and the SSL termination. All of it. One vendor. One bill. One point of failure.
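One way to see this dependency is to write down who provides each role for a site. The sketch below is a toy inventory, not a tool: the role names and the vendor labels ("cdn-x", "dns-y") are hypothetical, and the functions simply report whether a single vendor holds every role.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def roles_by_vendor(stack: Dict[str, str]) -> Dict[str, List[str]]:
    """Group roles (dns, delivery, ...) under the vendor that provides them."""
    grouped = defaultdict(list)
    for role, vendor in stack.items():
        grouped[vendor].append(role)
    return {vendor: sorted(roles) for vendor, roles in grouped.items()}


def sole_vendor(stack: Dict[str, str]) -> Optional[str]:
    """Return the vendor if one vendor holds every role, otherwise None."""
    vendors = set(stack.values())
    return next(iter(vendors)) if len(vendors) == 1 else None


# Hypothetical stacks: everything on one CDN versus DNS held elsewhere.
all_in_one = {"dns": "cdn-x", "delivery": "cdn-x", "ddos": "cdn-x", "tls": "cdn-x"}
split_dns = {"dns": "dns-y", "delivery": "cdn-x", "ddos": "cdn-x", "tls": "cdn-x"}

if __name__ == "__main__":
    print(sole_vendor(all_in_one))   # the single vendor everything depends on
    print(roles_by_vendor(split_dns))
```

In the second stack the CDN is still a large dependency, but DNS remains a lever you can pull when it fails.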

Multi-CDN Exists but Nobody Uses It

The obvious solution: use multiple CDNs with DNS-based failover. If Cloudflare goes down, traffic routes to Fastly. Sophisticated. Resilient.

Also complex and expensive. Multi-CDN requires managing cache invalidation across providers, keeping SSL certificates in sync, handling different configuration APIs, and paying two CDN bills. For large enterprises, the cost is justifiable. For the millions of small-to-medium sites that make up most of the internet, it’s overkill for an event that happens a few hours per year.
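The failover idea itself is simple, which is part of why the surrounding cost is frustrating. Here is a minimal sketch, assuming an ordered list of CDN hostnames and a health probe: pick the first provider that answers. The hostnames are hypothetical placeholders, the probe is a plain HTTPS HEAD request from the standard library, and none of the hard parts (cache invalidation, certificates, actually updating DNS) are shown.

```python
import urllib.request
from typing import Callable, Optional, Sequence


def http_health_check(host: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTPS HEAD request to `host` succeeds within `timeout`."""
    request = urllib.request.Request(f"https://{host}/", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError, connection errors, and timeouts are all OSError.
        return False


def pick_cdn(
    candidates: Sequence[str],
    is_healthy: Callable[[str], bool] = http_health_check,
) -> Optional[str]:
    """Return the first healthy candidate in priority order, or None."""
    for host in candidates:
        if is_healthy(host):
            return host
    return None


if __name__ == "__main__":
    # Hypothetical hostnames; a real setup would then point DNS at the winner.
    print(pick_cdn(["primary-cdn.example.com", "backup-cdn.example.com"]))
```

The health check is passed in as a parameter so the selection logic can be exercised without touching the network.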

The rational calculation: accept the risk. A few hours of downtime per year from a CDN outage is cheaper than the engineering cost of multi-CDN. This is correct math. It’s also why the concentration keeps getting worse — the rational individual choice produces a fragile collective outcome.
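That calculation can be written out. The sketch below uses made-up inputs (hours of CDN-caused downtime per year, revenue lost per hour, yearly cost of running multi-CDN) and optimistically assumes multi-CDN would eliminate all of that downtime. It shows the break-even point, not a recommendation.

```python
def annual_outage_cost(outage_hours_per_year: float, loss_per_hour: float) -> float:
    """Expected yearly loss from CDN-caused downtime."""
    return outage_hours_per_year * loss_per_hour


def break_even_loss_per_hour(
    multi_cdn_cost_per_year: float, outage_hours_per_year: float
) -> float:
    """Hourly loss above which multi-CDN becomes the cheaper option."""
    if outage_hours_per_year <= 0:
        raise ValueError("outage_hours_per_year must be positive")
    return multi_cdn_cost_per_year / outage_hours_per_year


def multi_cdn_pays_off(
    outage_hours_per_year: float,
    loss_per_hour: float,
    multi_cdn_cost_per_year: float,
) -> bool:
    """True if expected outage losses exceed the yearly cost of multi-CDN."""
    return annual_outage_cost(outage_hours_per_year, loss_per_hour) > multi_cdn_cost_per_year


if __name__ == "__main__":
    # Hypothetical small site: 3 hours down per year, 500 lost per hour,
    # 60,000 per year in extra bills and engineering time for multi-CDN.
    print(multi_cdn_pays_off(3, 500, 60_000))
    print(break_even_loss_per_hour(60_000, 3))
```

With these inputs a site would need to lose 20,000 per hour of downtime before the second CDN pays for itself, which is why most sites never buy one.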

The Cloud Parallel

This is the same pattern as cloud computing. AWS, Azure, and GCP concentrated compute capacity that used to be distributed across thousands of independent data centers. When us-east-1 has a bad day, a visible fraction of the internet stutters.

CDNs did the same thing for content delivery. The internet used to be a mesh. It’s increasingly a hub-and-spoke where the hubs are a handful of companies.

What Should Change

Honestly? Probably nothing will change. The economics are too compelling. CDNs are genuinely good at what they do. The performance improvement is real. The DDoS protection is real. The operational convenience is real.

But the industry should be honest about the tradeoff. We chose performance and convenience over resilience. Every site behind a single CDN has an implicit dependency on that CDN’s uptime. The SLA says 99.99%. The reality is that when the 0.01% happens, it happens to everyone at once.
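It helps to translate the SLA figure into time. A rough sketch, assuming a 365-day year and treating the stated availability as exact: 99.99% leaves a little under an hour of allowed downtime per year. The second function multiplies that budget across every site behind the same CDN, using the ten thousand sites from earlier as an example figure.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year a service may be down and still meet `availability`."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


def site_minutes_lost(availability: float, n_sites: int) -> float:
    """Total site-minutes lost when all `n_sites` share the same downtime."""
    return downtime_budget_minutes(availability) * n_sites


if __name__ == "__main__":
    print(round(downtime_budget_minutes(0.9999), 2), "minutes per year")
    print(round(site_minutes_lost(0.9999, 10_000)), "site-minutes across 10,000 sites")
```

Per site the budget looks small. Summed over every site that spends it in the same minutes, it does not.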

The next time a CDN goes down and half the internet breaks, remember: this was a choice. A reasonable one, maybe. But a choice, not an accident. We centralized the web for speed and paid for it in fragility.
