Why DNS Uses UDP (and When It Doesn't)

DNS runs the busiest request-response system on the internet on top of a transport that doesn't promise your packet will arrive. That wasn't a shortcut. It was the right call — until the answers got too big.

DNS handles something like two trillion queries a day, and it was built on the one transport that explicitly refuses to guarantee your packet arrives. UDP. Fire it off, hope for the best, no acknowledgement, no retransmission, no connection. For the protocol that the entire internet depends on to find anything, that sounds insane.

It was the right decision. And the story of where it stopped being enough is the story of how DNS actually works under load.

The cost of a conversation

A DNS lookup is the smallest possible transaction. One question — “what’s the A record for this name?” — and one answer that, in the early design, fit in a single packet. The question fits in a packet too.

Now put TCP under that. Before you can ask anything, TCP wants a three-way handshake: SYN, SYN-ACK, ACK. That’s a full round-trip of pure setup before the first byte of your actual question goes out. Then a teardown afterward. To send one tiny question and get one tiny answer, you’d more than triple the packet count and add a round-trip of latency to every single lookup.

Now multiply by every name every device resolves, aggregated onto one busy resolver. TCP also forces the server to hold connection state — a control block per client, parked in memory through the handshake and timeouts. UDP holds nothing. A packet comes in, an answer goes out, the server forgets you instantly. For a workload that is overwhelmingly tiny, independent, idempotent requests, statelessness isn’t a compromise. It’s the feature.

So DNS picked UDP, accepted that a query might vanish, and pushed reliability up to the client: didn’t hear back in a couple of seconds? Ask again. Cheaper to occasionally re-ask than to set up a connection every time.
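As a rough sketch of that client-side reliability, here is what hand-rolling a query and re-asking on silence might look like in Python. The names `build_query` and `query_udp`, the two-second timeout, and the three attempts are illustrative assumptions, not how any particular resolver library does it; real stub resolvers validate far more of the reply.

```python
import secrets
import socket
import struct


def build_query(name, qtype=1, txid=None):
    """Encode a minimal DNS query: one question, recursion desired."""
    if txid is None:
        txid = secrets.randbelow(1 << 16)  # unpredictable 16-bit transaction ID
    # Header: ID, flags (RD=1 is 0x0100), QDCOUNT=1, then three zero counts
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def query_udp(name, server, timeout=2.0, attempts=3):
    """Send the query over UDP and re-ask on timeout; None if every try is lost."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(attempts):
            sock.sendto(packet, (server, 53))
            try:
                reply, _addr = sock.recvfrom(512)  # classic UDP message limit
            except socket.timeout:
                continue  # nothing came back, so ask again
            if reply[:2] == packet[:2]:  # transaction ID must match our query
                return reply
    return None
```

Calling `query_udp("example.com", "192.0.2.53")` (a documentation-range address standing in for a real resolver) would return the raw reply bytes, or `None` after three unanswered attempts. Note that the server side of this exchange needs no per-client state at all: the retry logic lives entirely in the client.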

The 512-byte ceiling

UDP came with a price written into the original spec. RFC 1035, 1987: a DNS message over UDP is limited to 512 bytes, not counting IP and UDP headers. Everything — the question, the answer, any extra records — had to fit in 512 bytes or it didn’t go.

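That number wasn’t arbitrary. IPv4 requires every host to accept datagrams of at least 576 bytes, and a 512-byte message plus IP and UDP headers fits inside that with room to spare. So 512 bytes was the size everyone was confident would get through without relying on fragmentation. Stay under it and, on any realistic path, your answer arrives in one packet, no assembly required.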

For a 1987 internet of A and PTR records, 512 bytes was roomy. That assumption aged badly.

The escape hatch was there from day one

The designers weren’t naive about the ceiling, so they built a fallback into the protocol itself. Every DNS response header has a one-bit flag called TC — truncated. When an answer won’t fit in the UDP budget, the server sends back what it can and sets TC=1, which means: “there’s more, ask me again over TCP.”

The client sees the truncation bit, opens a TCP connection, and re-asks. Now it pays the handshake cost — but only for the rare answer too big for a datagram, not for the millions that fit. UDP for the common case, TCP for the overflow.
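A minimal sketch of that fallback, assuming you already hold the raw query and the raw UDP reply as bytes. The helper names are made up for illustration. One detail the prose above skips: over TCP, RFC 1035 prefixes each DNS message with a two-byte length, because a stream has no packet boundaries of its own.

```python
import socket
import struct

TC_MASK = 0x0200  # position of the TC bit in the 16-bit header flags field


def is_truncated(message):
    """True when the TC bit is set in a DNS message header."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_MASK)


def frame_for_tcp(message):
    """Prefix a DNS message with its two-byte length for a TCP stream."""
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf


def query_tcp(packet, server, timeout=5.0):
    """Re-ask the same question over TCP and return the full reply."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(frame_for_tcp(packet))
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)
```

A client would use it as `if is_truncated(reply): reply = query_tcp(packet, server)`, paying the handshake only when the bit says it has to.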

One job always used TCP regardless: zone transfers. Copying an entire zone between nameservers (AXFR) is bulk data where you genuinely need every byte in order, so it goes over TCP and always has. The split was deliberate: datagrams for lookups, a stream for bulk.

Then the answers got big

Three things blew past 512 bytes.

IPv6 added AAAA records, so names started carrying both v4 and v6 addresses. Email authentication piled TXT records into the zone — SPF, then DKIM keys, then DMARC. And DNSSEC, the security extension, attaches cryptographic signatures to records. Signatures are not small. A DNSSEC-signed response routinely runs into the kilobytes — several times the old ceiling.

The TC-bit fallback still worked, but now it was firing constantly, which meant a TCP retry on a huge fraction of queries — exactly the overhead UDP was chosen to avoid. The ceiling needed to move.

EDNS0: raise the ceiling, hit a wall

The fix was EDNS0 (first specified in RFC 2671, now RFC 6891). It lets a client advertise, in the query, how big a UDP response it’s willing to accept: a number that can go well past 512, commonly 4096. The server, seeing that, can pack a much larger answer into UDP and skip the TCP dance.
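On the wire, that advertisement is an OPT pseudo-record appended to the query's additional section, with the record's CLASS field repurposed to carry the buffer size. Here is a hedged sketch of building one by hand; `opt_record` and `add_edns` are hypothetical helper names, and `add_edns` assumes the query it receives has no additional records yet.

```python
import struct

OPT_TYPE = 41  # RR type code for the EDNS0 OPT pseudo-record


def opt_record(udp_size, dnssec_ok=False):
    """Build an 11-byte OPT record advertising udp_size as the UDP buffer."""
    flags = 0x8000 if dnssec_ok else 0  # DO bit: "send me DNSSEC records"
    return b"\x00" + struct.pack(  # NAME is the root, a single zero byte
        "!HHBBHH",
        OPT_TYPE,   # TYPE
        udp_size,   # CLASS field, reused for the requestor's UDP payload size
        0,          # extended RCODE
        0,          # EDNS version 0
        flags,      # DO bit plus reserved bits
        0,          # RDLENGTH: no options attached
    )


def add_edns(query, udp_size):
    """Append an OPT record and set ARCOUNT to 1.

    Assumes the query carries no additional records already.
    """
    header = query[:10] + struct.pack("!H", 1)  # ARCOUNT is header bytes 10-11
    return header + query[12:] + opt_record(udp_size)
```

So `add_edns(query, 4096)` turns a plain query into one that tells the server it may answer with up to 4096 bytes over UDP.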

It bought years. It also created a subtler problem: those big UDP datagrams have to cross the network as IP fragments, and IP fragmentation is fragile. Middleboxes and firewalls drop fragments. Only the first fragment carries the UDP port numbers, so filters get confused. Worse, fragments are a forgery vector — an attacker can race a spoofed second fragment into a response. In 2020 the IETF published RFC 8900, titled, with unusual bluntness, “IP Fragmentation Considered Fragile.” Relying on fragmented UDP to deliver DNS was building on sand.

DNS Flag Day 2020: make UDP small again

The operator community’s answer was counterintuitive: stop trying to push huge answers through UDP. DNS Flag Day 2020 recommended capping the EDNS buffer at 1232 bytes.

The number has a clean derivation. IPv6 guarantees a minimum MTU of 1280 bytes. Subtract 40 for the IPv6 header and 8 for the UDP header and you get 1232 — the largest DNS payload that fits in one IPv6 packet on essentially any path, no fragmentation. Keep UDP responses under 1232; for anything bigger, set the truncation bit and fall back to TCP. Back to the original design, with a smarter ceiling.
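The arithmetic is small enough to write down. This sketch just restates the derivation above as code; the function names are illustrative, and it assumes a bare IPv6 header with no extension headers.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU every IPv6 path must support
IPV6_HEADER = 40     # fixed IPv6 header, assuming no extension headers
UDP_HEADER = 8


def max_dns_payload(mtu=IPV6_MIN_MTU):
    """Largest DNS message that fits in one unfragmented IPv6/UDP packet."""
    return mtu - IPV6_HEADER - UDP_HEADER


def needs_tcp(response_size, limit=max_dns_payload()):
    """True when a response should be truncated and retried over TCP."""
    return response_size > limit
```

`max_dns_payload()` gives 1232, so a 1232-byte response still goes out over UDP and a 1233-byte one gets the truncation bit.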

Which only works if TCP is actually there.

TCP stopped being optional

For decades, plenty of implementations treated DNS-over-TCP as a nice-to-have, and some networks blocked port 53 TCP entirely on the theory that “DNS is UDP.” RFC 7766 ended that debate in 2016: general-purpose DNS implementations MUST support both UDP and TCP. Its rationale is almost wry — it notes that the MTU in the core of the internet is routinely exceeded by DNSSEC-signed responses, and that “the future that was anticipated in RFC 1123 has arrived.” Translation: the day we always knew was coming, when DNS would genuinely need TCP, is here.

If your firewall still blocks DNS over TCP, you don’t have a working DNS setup. You have one that works right up until the first answer that doesn’t fit — and then fails in a way nobody thinks to check.

The bill for being connectionless

UDP’s statelessness has a security tax, and DNS has paid it for years. Because there’s no handshake, a UDP packet’s source address is trivial to forge, and that enables two classic attacks.

Cache poisoning: an attacker floods a resolver with forged responses, guessing the query’s transaction ID, hoping a fake answer lands before the real one. Dan Kaminsky’s 2008 work showed how practical this was, and the stopgap, randomizing the source port so attackers must guess it along with the 16-bit ID, is now standard. DNS cookies (RFC 7873) added another lightweight check. DNSSEC fixes it properly by signing answers, if you deploy it.
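To see why port randomization helps, count what a blind attacker has to guess. The sketch below is back-of-the-envelope arithmetic, not a model of any real resolver: the port range 1024 to 65535 is an assumption, and actual operating systems use narrower ephemeral ranges.

```python
import secrets

TXID_VALUES = 1 << 16  # the transaction ID is a 16-bit field


def guess_space(source_ports):
    """Number of (transaction ID, source port) pairs a forger must cover."""
    return TXID_VALUES * source_ports


def random_query_identity(low=1024, high=65535):
    """Pick an unpredictable transaction ID and source port for one query."""
    txid = secrets.randbelow(TXID_VALUES)
    port = low + secrets.randbelow(high - low + 1)
    return txid, port
```

With a fixed source port, `guess_space(1)` is 65,536 possibilities. With the whole assumed range in play, `guess_space(64512)` is a little over 4.2 billion. Harder to hit, but still a guessing game, which is why signing the answers is the real fix.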

Reflection and amplification: spoof a victim’s address as the source, send a small query that triggers a large answer, and the resolver blasts that answer at the victim. A short request becomes a flood. DNS has been one of the favorite engines of volumetric DDoS for exactly this reason — the same small-query-big-answer asymmetry that makes UDP efficient makes it a weapon.

The recent encrypted-DNS transports, DNS-over-TLS and DNS-over-HTTPS, typically run over TCP, but they’re solving privacy, not size. They just happen to inherit a real connection along the way.

The right call, and the long-overdue correction

UDP was correct in 1987 and it’s still correct for the overwhelming majority of lookups. A connectionless datagram is the natural shape of a one-shot question with a one-shot answer, and nothing about modern traffic changes that for the common case. The mistake was never UDP. The mistake was treating the TCP fallback — written into the protocol from the very first draft — as decorative, and letting the 512-byte assumption ossify into “DNS doesn’t need TCP.”

It always did. We just didn’t hit the wall until the answers got heavy.
