Cache poisoning gets much more interesting when the cache sits in front of everyone.
That’s what makes CDN cache poisoning such a nasty class of bug. You’re not just tricking one browser or one proxy. You’re getting a shared edge layer to store a response shaped by attacker-controlled input and then replay that poisoned response to every visitor who maps to the same cache key.
One weird request. A lot of collateral.
What the CDN Thinks It’s Doing
A CDN caches responses so it doesn’t have to ask the origin for the same thing every time. Faster pages, less origin load, happier operators. When a request comes in, the CDN computes a cache key — usually derived from the URL path, query string, and maybe a few headers — and checks if a cached response exists for that key. If yes, serve the cache. If no, forward to origin, cache the response, serve it.
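The lookup flow above can be sketched in a few lines. This is a toy model, not any vendor’s implementation: `EdgeCache`, `cache_key`, and the `origin` callable are hypothetical names, and the key here is assumed to be method, path, and query string only.

```python
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, str, str]
Origin = Callable[[str, str, Dict[str, str]], str]


def cache_key(method: str, path: str, query: str) -> CacheKey:
    # Assumption: only method, path, and query string are keyed; headers are not.
    return (method.upper(), path, query)


class EdgeCache:
    """Toy edge cache: serve from the store on a hit, ask the origin on a miss."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.store: Dict[CacheKey, str] = {}

    def handle(
        self,
        method: str,
        path: str,
        query: str = "",
        headers: Optional[Dict[str, str]] = None,
    ) -> Tuple[str, str]:
        key = cache_key(method, path, query)
        if key in self.store:
            return self.store[key], "HIT"
        # Miss: forward to the origin, store the response, then serve it.
        body = self.origin(path, query, headers or {})
        self.store[key] = body
        return body, "MISS"
```

Note that `headers` reaches the origin but never reaches `cache_key`. That asymmetry is the whole story of this article.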
The security assumption: two requests with the same cache key should produce the same response.
The attack: find an input that the origin uses to generate the response but the CDN doesn’t include in the cache key. Manipulate that input. The origin returns a poisoned response. The CDN caches it under the normal cache key. Everyone else gets the poisoned version.
The Cache Key Is Everything
The cache key is the CDN’s definition of “same request.” If two requests produce the same cache key, the CDN treats them as interchangeable. Whatever the first one returned, the second one gets from cache.
A typical cache key includes the URL path and query string. It usually does not include most HTTP headers. This is where the gap lives.
The origin server might use a header to decide how to generate the response. The CDN might not include that header in the cache key. The header is “unkeyed” — it influences the response but doesn’t influence which cache entry the response is stored under.
The Unkeyed Header Attack
James Kettle’s foundational research on web cache poisoning (“Practical Web Cache Poisoning,” presented at Black Hat USA 2018 and expanded since) demonstrated this pattern clearly.
Consider X-Forwarded-Host. Some origin servers use this header to generate absolute URLs in the response — for redirects, canonical tags, resource links, or OpenGraph metadata. The origin trusts the header because it expects the CDN or proxy to set it correctly.
An attacker sends a request with X-Forwarded-Host: evil.com. The origin generates a response that includes <link rel="canonical" href="https://evil.com/page"> or, worse, <script src="https://evil.com/payload.js">. The CDN doesn’t include X-Forwarded-Host in its cache key because it’s not part of the URL. So the CDN caches this poisoned response under the normal cache key for /page.
Every subsequent visitor requesting /page gets the cached version — complete with the attacker’s injected content. The attacker’s request is long gone. The poison lives in the cache.
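A minimal simulation of that sequence, assuming a deliberately vulnerable origin and a cache keyed on path alone. The `origin` and `cdn` functions and the `example.com` default host are illustrative stand-ins, not real infrastructure.

```python
from typing import Dict


def origin(path: str, headers: Dict[str, str]) -> str:
    # Vulnerable on purpose: trusts X-Forwarded-Host when building absolute URLs.
    host = headers.get("X-Forwarded-Host", "example.com")
    return f'<link rel="canonical" href="https://{host}{path}">'


cache: Dict[str, str] = {}


def cdn(path: str, headers: Dict[str, str]) -> str:
    # The key is the path only, so X-Forwarded-Host is an unkeyed input.
    if path not in cache:
        cache[path] = origin(path, headers)
    return cache[path]


# The attacker's request arrives first and fills the cache entry for /page.
attacker_view = cdn("/page", {"X-Forwarded-Host": "evil.com"})

# A later visitor sends no special headers and still receives the stored response.
victim_view = cdn("/page", {})
print(victim_view)  # <link rel="canonical" href="https://evil.com/page">
```

The victim’s request is entirely ordinary. Nothing about it reveals why the response is wrong.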
Variations on the Theme
The X-Forwarded-Host attack is the classic example, but the pattern applies to any unkeyed input:
X-Original-URL / X-Rewrite-URL. Some web frameworks (notably older IIS and Symfony configurations) let these headers override the request path. Send a request for /harmless with X-Original-URL: /admin and the origin might process the admin page while the CDN caches the result under /harmless.
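Unkeyed query parameters. Some CDNs normalize or drop certain query parameters when building the cache key (tracking parameters are a common choice). If the origin reflects or acts on a parameter the CDN ignores, you can inject content through it and the poisoned response is stored under the clean URL.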
Host header manipulation. If the origin builds URLs from the Host header and the CDN forwards a Host value it has not checked against the hostnames it is configured to serve, you can inject arbitrary hostnames into cached responses.
Varying Accept-Language or Cookie. If the origin returns different content based on a header the CDN doesn’t key on, you can cache a response intended for one context and serve it to everyone.
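The ignored-query-parameter variation is easy to see once the key function is written down. The sketch below assumes a hypothetical edge configuration that drops a few tracking parameters before keying; `IGNORED_PARAMS` and `edge_cache_key` are made-up names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical edge setting: parameters removed before the cache key is built.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def edge_cache_key(url: str) -> str:
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    ]
    kept.sort()  # normalize parameter order as well
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")


clean = edge_cache_key("https://example.com/page?id=1")
# Same key, even though the origin still receives the utm_source value.
probe = edge_cache_key("https://example.com/page?id=1&utm_source=%22%3E%3Cscript%3E")
```

Both URLs collapse to the same key. If the origin echoes `utm_source` into the page, the second request poisons the entry the first one reads.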
Why This Is Hard to Defend
The root cause is a mismatch between what the CDN considers “the same request” and what the origin considers “the same request.” Any input that influences the response but isn’t in the cache key is a potential attack surface.
Fixing this requires knowing every input your origin uses to generate responses. In practice, that’s difficult:
- Frameworks add headers you didn’t configure
- Middleware injects context from headers you didn’t know about
- Third-party code reads headers for analytics or routing
- The CDN documentation doesn’t fully specify its cache key behavior
The attack surface is the gap between your CDN’s cache key and your origin’s response generation logic. That gap is often undocumented on both sides.
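One way to start mapping that gap on systems you own is to send a unique canary value in each candidate header and check whether it comes back in the body. The sketch below is a rough outline of that idea: `find_reflected_headers` and the `fetch` callable are hypothetical, and the `cb` cache-buster parameter only protects real visitors if your CDN keys on the query string.

```python
import uuid
from typing import Callable, Dict, Iterable, List

# fetch(path_with_query, headers) -> response body; supply your own HTTP client.
Fetch = Callable[[str, Dict[str, str]], str]


def find_reflected_headers(fetch: Fetch, path: str, candidates: Iterable[str]) -> List[str]:
    """Return the candidate headers whose value is echoed in the response body."""
    reflected = []
    for name in candidates:
        canary = f"canary-{uuid.uuid4().hex[:12]}.example"
        # Unique cache buster per probe, so the test response is stored under
        # a key that no real visitor requests (assumes the query string is keyed).
        separator = "&" if "?" in path else "?"
        probe_path = f"{path}{separator}cb={uuid.uuid4().hex}"
        body = fetch(probe_path, {name: canary})
        if canary in body:
            reflected.append(name)
    return reflected
```

A reflected header is not automatically exploitable. It is a candidate: the next question is whether the CDN keys on it.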
Defenses
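Include all response-influencing inputs in the cache key. If the origin uses X-Forwarded-Host to generate URLs, the CDN must include it in the cache key. If the response varies by Accept-Language, the CDN must key on it. The Vary response header exists for this: an origin response carrying Vary: X-Forwarded-Host tells a compliant cache to key on that request header. Some CDNs ignore Vary or honor it only for specific headers, so confirm your provider’s behavior or configure the cache key explicitly.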
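Keying on a header can be sketched as folding the named request headers into the key. This is a simplification under stated assumptions: `vary_cache_key` is a hypothetical helper, and real caches typically store the Vary list per URL and select among variants rather than building one flat tuple.

```python
from typing import Dict, Tuple


def vary_cache_key(path: str, request_headers: Dict[str, str], vary: str) -> Tuple[str, ...]:
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    # A missing header contributes an empty value, which is still a distinct variant.
    return (path,) + tuple(f"{n}={lowered.get(n, '')}" for n in names)


attacker_key = vary_cache_key("/page", {"X-Forwarded-Host": "evil.com"}, "X-Forwarded-Host")
victim_key = vary_cache_key("/page", {}, "X-Forwarded-Host")
```

With the header keyed, the attacker’s response lands in its own entry and ordinary visitors never read it. The trade-off is cache fragmentation: every distinct header value creates another entry.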
Don’t trust forwarded headers at the origin. If your origin uses X-Forwarded-Host, validate it against an allowlist. Don’t blindly reflect it into HTML.
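A small sketch of allowlist validation at the origin. `ALLOWED_HOSTS`, `DEFAULT_HOST`, and `trusted_host` are hypothetical names, and the hostnames are placeholders for whatever your site actually serves.

```python
from typing import Mapping

# Hypothetical configuration: the only hostnames this origin will ever emit.
ALLOWED_HOSTS = frozenset({"example.com", "www.example.com"})
DEFAULT_HOST = "example.com"


def trusted_host(headers: Mapping[str, str]) -> str:
    """Return a host that is safe to write into generated URLs."""
    raw = headers.get("X-Forwarded-Host", "")
    # A proxy chain may produce a comma-separated list; consider only the first entry.
    candidate = raw.split(",")[0].strip().lower()
    # Anything not on the allowlist (including host:port forms) falls back to the default.
    return candidate if candidate in ALLOWED_HOSTS else DEFAULT_HOST
```

Falling back to a fixed default, rather than rejecting the request, keeps the page working while guaranteeing the attacker’s value never appears in the HTML.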
Strip unknown headers at the CDN edge. Remove headers that the origin doesn’t need. If X-Original-URL never reaches the origin, it can’t influence the response.
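In practice this is edge configuration rather than application code, but the logic amounts to an allowlist filter. The sketch below assumes a hypothetical `FORWARD_ALLOWLIST`; the right list depends on what your origin genuinely needs.

```python
from typing import Dict, Mapping

# Hypothetical allowlist of request headers the origin is known to need.
FORWARD_ALLOWLIST = frozenset(
    {"accept", "accept-encoding", "accept-language", "host", "user-agent"}
)


def strip_for_origin(headers: Mapping[str, str]) -> Dict[str, str]:
    """Drop every request header that is not explicitly allowed through."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in FORWARD_ALLOWLIST
    }
```

An allowlist is safer than a denylist here: a header you have never heard of is dropped by default instead of forwarded by default.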
Cache defensively. Don’t cache responses that include user-controlled content in the body. If a response includes data derived from request headers, either key on those headers or don’t cache the response at all.
Monitor for cache poisoning. Compare cached responses against expected output. If a cached page suddenly contains a domain you don’t own, your cache has been poisoned.
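A rough monitoring check, assuming you can fetch the cached body and know which domains you own. `foreign_hosts` is a hypothetical helper, and the regular expression is a deliberately simple heuristic that will miss obfuscated URLs and flag legitimate third-party hosts unless you list them as owned.

```python
import re
from typing import Iterable, Set

# Matches http://host, https://host, and protocol-relative //host references.
URL_HOST = re.compile(r"(?:https?:)?//([a-z0-9.-]+)", re.IGNORECASE)


def foreign_hosts(body: str, owned: Iterable[str]) -> Set[str]:
    """Return hostnames referenced in the body that are not owned domains or their subdomains."""
    owned_set = {host.lower() for host in owned}
    found = {match.lower() for match in URL_HOST.findall(body)}
    return {
        host
        for host in found
        if host not in owned_set
        and not any(host.endswith("." + parent) for parent in owned_set)
    }


page = (
    '<script src="https://evil.com/payload.js"></script>'
    '<img src="//cdn.example.com/a.png">'
)
suspicious = foreign_hosts(page, ["example.com"])
```

Run against the cached copy of each important page on a schedule, a non-empty result is a reason to purge and investigate.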
The Uncomfortable Truth
Most CDN users don’t audit their cache key configuration. They enable caching, accept the defaults, and assume the CDN handles security. The CDN assumes the origin handles security. Nobody audits the gap.
Cache poisoning is subtle because the poisoned response looks normal. It’s served from the CDN. It has the right cache headers. The URL is correct. The only difference is the content — and if the injected content is a script tag or a redirect, visitors won’t notice until it’s too late.
Your CDN is not just a performance layer. It’s an attack surface. One that serves whatever it cached, to everyone, without question.