Negative Caching: How DNS Remembers a Domain Doesn't Exist

You add a DNS record and the world keeps saying it doesn't exist. Nothing is wrong with your record. DNS cached its absence — under a TTL you never set, governed by a field in the SOA that nobody touches.

You add a DNS record. The zone is right, your authoritative server hands it out cleanly, dig @your-ns returns the answer. Then you check from your laptop and the rest of the internet still insists the name does not exist. You didn’t typo anything. The record is there. And for the next hour, a chunk of the world gets NXDOMAIN for a name you can plainly see resolving.

Nothing is wrong with your record. What you’re fighting is the cached memory of its absence.

Everyone knows DNS caches answers — that’s the whole point of TTLs. Far fewer people internalize that DNS caches the lack of an answer with the same conviction, on its own separate clock, and that this is not a bug. It has an RFC.

Why “no” gets cached at all

RFC 2308 — Negative Caching of DNS Queries (DNS NCACHE), M. Andrews, March 1998 — exists because a staggering fraction of DNS traffic is for names that will never resolve. Typos. A mail server retrying lookups for a misspelled domain. Malware beaconing to a sinkholed name. Browsers probing for captive portals. If every one of those queries had to walk the tree to an authoritative server every single time, the root and TLD operators would be absorbing a permanent denial-of-service made entirely of mistakes.

So resolvers cache failure. Ask for a name that doesn’t exist, and the resolver remembers “doesn’t exist” so the next person who makes the same mistake gets answered locally. The optimization is correct and necessary. The footgun is in the details of how long it remembers.

There are two kinds of “no”

This trips people up because they collapse two distinct failures into one word.

NXDOMAIN is RCODE 3, a name error: the name itself does not exist in the zone. There is no widget.example.com, full stop, for any record type.

NODATA is sneakier. The name exists, but not the record type you asked for. You query an A record for a host that only has MX records — the name is real, there’s just nothing of that type. There is no wire code for this. NODATA is, in the RFC’s words, a “pseudo RCODE”: the response comes back NOERROR with an empty answer section, and you’re expected to infer the NODATA condition from the absence. Both get cached as negative answers, and both bite the same way.
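To make the distinction concrete, here is a minimal Python sketch of how a resolver might sort responses into the two kinds of "no". The `Response` type and the `classify` function are hypothetical, invented for illustration rather than taken from any DNS library. The referral check follows the rule in RFC 2308 section 2.2 (an empty NOERROR response is NODATA if the authority section carries an SOA or carries no NS records), and the sketch assumes no CNAME chains are involved.

```python
from dataclasses import dataclass

RCODE_NOERROR = 0
RCODE_NXDOMAIN = 3


@dataclass
class Response:
    """Simplified view of a DNS response (hypothetical, not a real library type)."""
    rcode: int
    answer_count: int
    authority_has_soa: bool = False
    authority_has_ns: bool = False


def classify(resp: Response) -> str:
    """Label a response as ANSWER, NXDOMAIN, NODATA, REFERRAL, or OTHER."""
    if resp.rcode == RCODE_NXDOMAIN:
        # The name does not exist for any record type.
        return "NXDOMAIN"
    if resp.rcode != RCODE_NOERROR:
        # SERVFAIL, REFUSED, and so on are not negative answers.
        return "OTHER"
    if resp.answer_count > 0:
        return "ANSWER"
    # NOERROR with an empty answer section: NODATA has to be inferred.
    # NS records without an SOA in the authority section indicate a referral.
    if resp.authority_has_soa or not resp.authority_has_ns:
        return "NODATA"
    return "REFERRAL"
```

Note that only the first branch reads a code off the wire. The NODATA branch is pure inference, which is why the RFC calls it a pseudo RCODE.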

The TTL you never set

Here’s the part that ruins afternoons. When a resolver caches a negative answer, what governs its lifetime? Not the TTL of the record you were trying to create — that record may not even exist yet, so it has no TTL to offer.

The lifetime comes from the zone’s SOA record. Specifically, RFC 2308 sets it to the minimum of the SOA’s MINIMUM field and the TTL of the SOA record itself. To make this enforceable, authoritative servers MUST include the SOA in the authority section of every NXDOMAIN and NODATA response. The resolver caches that SOA, and each time it serves the cached “no,” it decrements the SOA’s TTL by how long it’s been sitting there, so the negative answer ages out correctly.
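The rule above is short enough to sketch in a few lines of Python. The `SOA` class and both function names are hypothetical, chosen for illustration, and the numbers in the usage note are made up. Real resolvers track this inside their cache structures, so treat this as a model of the arithmetic and not as anyone's implementation.

```python
from dataclasses import dataclass


@dataclass
class SOA:
    ttl: int      # TTL of the SOA record as delivered in the authority section
    minimum: int  # last field of the SOA RDATA (the MINIMUM field)


def negative_ttl(soa: SOA) -> int:
    """Initial lifetime of a cached negative answer, per RFC 2308."""
    return min(soa.ttl, soa.minimum)


def remaining_negative_ttl(soa: SOA, cached_at: float, now: float) -> int:
    """Seconds left on a cached negative answer, never below zero."""
    age = int(now - cached_at)
    return max(0, negative_ttl(soa) - age)
```

With an SOA whose own TTL is 86400 and whose MINIMUM is 3600, `negative_ttl` gives 3600: the smaller value wins. Ten minutes after caching, `remaining_negative_ttl` reports 3000, and once the full hour has passed it reports 0, at which point the resolver asks the authoritative server again.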

Which means the answer to “how long does my mistake live?” is sitting in the last field of an SOA record almost nobody ever looks at.

And that field has a confusing history. The SOA MINIMUM value was overloaded for years with three different jobs: the minimum TTL of every record in the zone, the default TTL for records that didn’t specify one, and the TTL for negative responses. RFC 2308 cleaned house: it deprecated the first job and handed the second to the zone file’s $TTL directive. The only surviving meaning is negative caching. So that number you copied from a template in 2014 and never thought about again is, today, the single setting that decides how long the world remembers your outages.

Resolvers don’t fully trust you either

If your SOA minimum is set to something absurd, you might expect week-long outages. In practice, resolvers cap it. BIND clamps negative answers with max-ncache-ttl, which defaults to 3 hours and is hard-limited to 7 days — set it higher and BIND silently truncates. Unbound has cache-max-negative-ttl, defaulting to 1 hour.
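The clamping can be modeled as one more term in the minimum. This Python sketch uses the defaults and the BIND hard limit quoted above. The constant and function names are hypothetical, and the model assumes the resolver applies its cap as a plain upper bound on the SOA-derived value.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Defaults quoted in the text above.
BIND_MAX_NCACHE_TTL_DEFAULT = 3 * SECONDS_PER_HOUR          # max-ncache-ttl
BIND_MAX_NCACHE_TTL_LIMIT = 7 * SECONDS_PER_DAY             # hard ceiling
UNBOUND_CACHE_MAX_NEGATIVE_TTL_DEFAULT = SECONDS_PER_HOUR   # cache-max-negative-ttl


def bind_effective_cap(configured: int) -> int:
    """BIND silently truncates max-ncache-ttl values above 7 days."""
    return min(configured, BIND_MAX_NCACHE_TTL_LIMIT)


def cached_negative_ttl(soa_ttl: int, soa_minimum: int, resolver_cap: int) -> int:
    """Lifetime the resolver actually uses: SOA-derived value, then capped."""
    return min(soa_ttl, soa_minimum, resolver_cap)
```

Under this model, a zone with both the SOA TTL and MINIMUM set to one week (604800 seconds) yields 10800 seconds on a default BIND resolver and 3600 on a default Unbound one. A zone with a MINIMUM of 300 yields 300 on both, because the cap only ever shortens the lifetime.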

That’s the good news and the bad news. A giant SOA minimum won’t strand you for a week. But the defaults — one to three hours — are more than enough to make a freshly created record look broken right through a deploy window, while you stare at a config that is completely correct.

The asymmetry nobody plans for

This is the real lesson, and it’s a mindset bug more than a config bug.

People agonize over the TTL of records they create. They drop it to 60 seconds before a migration, they reason carefully about propagation. Almost nobody thinks about the TTL of records that don’t exist yet — because it feels like there’s nothing there to have a TTL. But the absence has one, and you didn’t choose it. Leaving the SOA minimum at whatever your DNS provider defaulted to isn’t “not deciding.” It’s deciding, by default, how long every premature lookup and every just-deleted record haunts you.

So two practical habits. Before you launch a name that something might query early — a health check, a monitoring probe, a CDN warming its own cache against a hostname that isn’t live yet — either create the record first and let any prior negative answer expire, or lower the SOA minimum a few hours ahead of time so the window is short. And when you’re debugging “I added it but it won’t resolve,” stop and ask whether what’s cached is your record or its absence. They expire on completely different clocks, and you can wait out the wrong one indefinitely.
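One way to tell which clock you are waiting on is to read the SOA line in the authority section of a negative response. Asked through a caching resolver, that line typically shows the counted-down TTL of the cached "no". The Python sketch below parses such a line, assuming dig's default single-line presentation format with numeric TTLs. The function name is hypothetical and the sample line is made up for illustration.

```python
from typing import Optional


def soa_from_authority_line(line: str) -> Optional[dict]:
    """Parse one SOA line in presentation format; return None for anything else."""
    fields = line.split()
    # Expected layout:
    # name ttl class type mname rname serial refresh retry expire minimum
    if len(fields) != 11 or fields[3].upper() != "SOA":
        return None
    return {
        "zone": fields[0],
        "ttl": int(fields[1]),       # seconds left, if this came from a cache
        "minimum": int(fields[10]),  # the MINIMUM field the zone publishes
    }


# Made-up sample, shaped like an authority-section line from a resolver.
sample = (
    "example.com. 1742 IN SOA ns1.example.com. hostmaster.example.com. "
    "2024010101 7200 3600 1209600 3600"
)
soa = soa_from_authority_line(sample)
```

In this sample the zone publishes a MINIMUM of 3600 and the resolver reports 1742 seconds remaining, which would mean the absence was cached roughly half an hour ago and has about 29 minutes left to live.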

DNS doesn’t only remember where things are. It remembers, just as stubbornly and on a clock you didn’t set, where things aren’t. The bug you already fixed is still true somewhere — and it has a TTL you never chose.
