Here’s the move every ops person has done before a migration: drop the record’s TTL to 60 seconds the day before, so when you flip the IP, the world follows within a minute. Clean cutover.
Then you flip it, and an hour later a stubborn slice of traffic is still hammering the old box. The TTL said 60. You waited 60. The internet did not.
The mistake isn’t in your config. It’s in believing the number means what it says. A TTL is not a timer the internet agrees to honor. It’s a suggestion you hand to a long chain of caches, each of which reserves the right to round it, cap it, floor it, or quietly ignore it.
What the number technically is
TTL is a field in every DNS record, defined back in RFC 1035. RFC 2181 later pinned down the details: it’s an unsigned value with a maximum of 2,147,483,647 — that’s 2³¹−1, somewhere north of 68 years. If a resolver receives a TTL with the top bit set, it’s told to treat the whole thing as zero. So the legal range is “don’t cache at all” to “cache until well after you’ve retired.”
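As a rough illustration of that rule, the sketch below interprets a raw 32-bit TTL field the way RFC 2181 describes it. The name normalize_ttl is made up for this example and is not taken from any resolver's source.

```python
MAX_TTL = 2**31 - 1  # 2,147,483,647 seconds, the RFC 2181 maximum


def normalize_ttl(raw: int) -> int:
    """Interpret a raw 32-bit TTL field as a cache lifetime in seconds."""
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("TTL must fit in an unsigned 32-bit field")
    # Top bit set: RFC 2181 says to treat the whole value as zero.
    if raw & 0x80000000:
        return 0
    return raw


if __name__ == "__main__":
    print(normalize_ttl(60))           # 60
    print(normalize_ttl(MAX_TTL))      # 2147483647
    print(normalize_ttl(0x80000000))   # 0
    # The maximum expressed in years, a little over 68.
    print(MAX_TTL / (365.25 * 86_400))
```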
That’s the contract on paper. The authoritative server is saying: you may cache this answer for up to this many seconds. The key word is may. Nothing in the protocol forces a resolver to keep the answer that long, and — this is the part that bites — nothing forces it to throw the answer away the instant the clock hits zero either.
The ceiling, capped from above
Set a record to a week and watch what comes back downstream: often a day.
Recursive resolvers routinely clamp large TTLs. Unbound, one of the most widely deployed resolvers, defaults cache-max-ttl to 86,400 seconds — exactly one day. Configure your record for a week and Unbound files it under “nice try” and caches it for 24 hours. RFC 8767 even blesses this: it suggests resolvers cap TTLs on the order of days to weeks, with 7 days as a sane ceiling, precisely because that’s what real resolvers already do.
So a long TTL states your intent, not the outcome. The resolver gets the final say on the maximum, and for anything longer than a day it’s often shorter than what you asked for.
The floor, ignored from below
Fine — so go the other way and set a tiny TTL for fast changes. That gets overridden too.
Unbound has a matching cache-min-ttl knob. The operator’s manual is explicit about the tradeoff: raise it and the resolver will ignore the TTL in the response, keeping records longer than the source intended, to avoid re-querying constantly when a domain sets an aggressively low value. A resolver protecting itself from your 30-second TTL will happily hold the answer for minutes.
Browsers do this even more bluntly. Chromium keeps its own in-process DNS cache, and when it relies on the operating system’s resolver it can’t see your TTL at all: getaddrinfo() simply doesn’t return it. So Chromium keeps the entry for a flat 60 seconds, chosen to comfortably outlast a single page load so it doesn’t resolve the same name twice while rendering one page. Set a TTL of 5 seconds; Chrome still holds it for 60. Even when Chromium does its own resolution and can see the TTL, 60 seconds is the floor.
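Both kinds of override boil down to a clamp. The sketch below is a simplified model, not code from Unbound or Chromium: the function names are invented for illustration, the default arguments mirror the values quoted above (a 0-second minimum and an 86,400-second maximum for the resolver, a 60-second floor for the browser), and it assumes the minimum is never set above the maximum.

```python
from typing import Optional


def resolver_lifetime(ttl: int, cache_min_ttl: int = 0,
                      cache_max_ttl: int = 86_400) -> int:
    """Seconds a resolver would keep a record after applying its clamps."""
    # Cap from above first, then raise to the floor if one is configured.
    return max(cache_min_ttl, min(ttl, cache_max_ttl))


def browser_lifetime(ttl: Optional[int], floor: int = 60) -> int:
    """Seconds a browser-style cache would keep an entry.

    ttl is None when the OS resolver did not expose one.
    """
    if ttl is None:
        return floor           # no TTL visible: fall back to the flat value
    return max(ttl, floor)     # TTL visible: never go below the floor


if __name__ == "__main__":
    print(resolver_lifetime(604_800))               # 86400 (a week becomes a day)
    print(resolver_lifetime(30, cache_min_ttl=300)) # 300
    print(browser_lifetime(5))                      # 60
    print(browser_lifetime(None))                   # 60
```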
The stack nobody told you about
The picture gets worse when you count the layers. Between your authoritative server and the user’s browser there’s the recursive resolver, the operating system’s stub cache, often a corporate resolver, sometimes the home router doing its own caching, and finally the browser. Each one caches. Each one runs its own clock.
And TTL counts down as it passes through. When a resolver hands an answer to the next layer, it subtracts the time it already held the record. So the TTL you see in dig isn’t the original — it’s whatever’s left after every cache upstream took its cut. When a resolver clamps your week-long TTL to a day, downstream clients see a decrementing countdown from that clamped day, never the week you set. The number you configured and the number a user’s machine acts on are rarely the same number.
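To make the countdown concrete, here is a toy model of a record passing through a chain of caches. CacheLayer and ttl_seen_downstream are hypothetical names, and the layers and holding times are invented numbers; real caches differ in how and when they clamp.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheLayer:
    name: str
    max_ttl: int    # ceiling this layer applies, in seconds
    held_for: int   # seconds the record has already sat in this layer


def ttl_seen_downstream(origin_ttl: int, layers: List[CacheLayer]) -> int:
    """Remaining TTL after each layer clamps it and subtracts its holding time."""
    ttl = origin_ttl
    for layer in layers:
        ttl = min(ttl, layer.max_ttl)        # the layer clamps on the way in
        ttl = max(ttl - layer.held_for, 0)   # and hands on only what is left
    return ttl


if __name__ == "__main__":
    chain = [
        CacheLayer("recursive resolver", max_ttl=86_400, held_for=3_600),
        CacheLayer("home router", max_ttl=86_400, held_for=600),
    ]
    # A week-long TTL, clamped to a day, minus 70 minutes already spent in caches.
    print(ttl_seen_downstream(604_800, chain))  # 82200
```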
The negative-caching twist
There’s a final trap that catches people who forget it exists. When a name doesn’t resolve (NXDOMAIN), that failure gets cached too, under rules from RFC 2308. How long? Not by any TTL on the record you were trying to create, but by the zone’s SOA record: the negative answer lives for the lesser of the SOA’s own TTL and its minimum field.
So you typo a hostname, a resolver caches the “doesn’t exist” answer for that SOA-derived lifetime, and then you fix the record. It’s still broken for everyone who already asked, for as long as the SOA says. People burn an afternoon on a record they already fixed because the absence of the record is what got cached, with a lifetime they never thought to check.
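A minimal sketch of that lifetime, assuming RFC 2308's rule that a negative answer is cached for the lesser of the SOA record's own TTL and its minimum field. The function name is invented, and resolver_cap is a hypothetical parameter standing in for whatever limit a particular resolver imposes on negative answers.

```python
from typing import Optional


def negative_cache_ttl(soa_ttl: int, soa_minimum: int,
                       resolver_cap: Optional[int] = None) -> int:
    """Seconds an NXDOMAIN answer may be cached for a zone."""
    ttl = min(soa_ttl, soa_minimum)   # RFC 2308: the lesser of the two
    if resolver_cap is not None:
        ttl = min(ttl, resolver_cap)  # assumed resolver-side limit, if any
    return ttl


if __name__ == "__main__":
    print(negative_cache_ttl(soa_ttl=3_600, soa_minimum=86_400))   # 3600
    print(negative_cache_ttl(soa_ttl=86_400, soa_minimum=300))     # 300
    print(negative_cache_ttl(86_400, 86_400, resolver_cap=3_600))  # 3600
```

Running dig against the zone's SOA record shows both numbers: the TTL column of the answer and the last field of the record data.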
Plan around the worst actor, not your config
Once you stop reading TTL as a promise, the operational rules write themselves.
Lower TTLs days ahead of a migration, not hours: the old, longer TTL has to age out of every cache before the short one even starts to matter. During the cutover, assume an hour of overlap no matter what number you set, and keep the old endpoint serving until the long tail dies. Don’t bother setting a TTL below about 30–60 seconds; browsers and resolvers floor it, so you pay the extra query load and get none of the responsiveness. And before you debug a “stale record,” check whether what’s actually cached is a negative answer governed by the SOA, not the record itself.
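The timing rules can be turned into rough arithmetic. This is a planning sketch under stated assumptions, not a guarantee: the function names are invented, the one-hour safety margin and one-hour overlap are illustrative defaults borrowed from the rule of thumb above, and the dates are made up.

```python
from datetime import datetime, timedelta


def latest_time_to_lower_ttl(cutover: datetime, old_ttl: int,
                             safety_margin: int = 3_600) -> datetime:
    """Latest moment to publish the short TTL so the old one has aged out."""
    # A cache that fetched the record just before the change keeps it for
    # up to old_ttl seconds; the margin covers caches that hold on longer.
    return cutover - timedelta(seconds=old_ttl + safety_margin)


def earliest_decommission(cutover: datetime, new_ttl: int,
                          overlap: int = 3_600) -> datetime:
    """Earliest moment to consider turning off the old endpoint."""
    # Wait for the longer of the new TTL and the assumed overlap window.
    return cutover + timedelta(seconds=max(new_ttl, overlap))


if __name__ == "__main__":
    cutover = datetime(2024, 6, 10, 12, 0)
    print(latest_time_to_lower_ttl(cutover, old_ttl=86_400))  # 2024-06-09 11:00:00
    print(earliest_decommission(cutover, new_ttl=60))         # 2024-06-10 13:00:00
```

Treat the decommission time as a lower bound and confirm against traffic on the old endpoint before switching it off.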
TTL isn’t a lie, exactly. It’s a hint, broadcast to a chain of independent caches that each weigh it against their own load and their own defaults. Engineer for the slowest, most stubborn one in that chain — because that’s the one your users are stuck behind.