The 120 second interval is the minimum from RFC4787 Sec. 4.3 REQ-5. (There is an exception that permits shorter intervals but only for specific ports, in particular port 53 for DNS.) If Linux is using 30 seconds then it's in violation and someone should submit a bug report.
I just took a look at Linux's nf_conntrack_proto_udp.c to see what's happening. It looks like Linux starts the UDP timeout at 30 seconds, but extends it to 180 seconds when it sees traffic both ways, assuming it's "some kind of stream."
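In condensed form the logic looks roughly like this (a paraphrase from memory of the old nf_conntrack_proto_udp.c, not verbatim kernel source):

    /* Paraphrase of the old nf_conntrack_proto_udp.c, from memory. */
    static unsigned int nf_ct_udp_timeout        = 30 * HZ;  /* fresh mapping */
    static unsigned int nf_ct_udp_timeout_stream = 180 * HZ; /* "stream" case */

    static int udp_packet(struct nf_conn *ct, const struct sk_buff *skb,
                          enum ip_conntrack_info ctinfo)
    {
        /* If we've seen traffic both ways, this is some kind of UDP
         * stream: extend the timeout to 180 seconds. */
        if (test_bit(IPS_SEEN_REPLY_BIT, &ct->status))
            nf_ct_refresh_acct(ct, ctinfo, skb, nf_ct_udp_timeout_stream);
        else
            /* One-way traffic so far: keep the short 30 second timeout. */
            nf_ct_refresh_acct(ct, ctinfo, skb, nf_ct_udp_timeout);
        return NF_ACCEPT;
    }

So the 30 second figure only applies to mappings that have never seen a reply.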
Privileged ports were a broken security model to begin with.
Consider a distributed system that relies on that minimum 120 second interval when determining whether to send a keepalive. Failure to send one within this interval would create an otherwise easily avoided network partition.
Consider someone adapting this system to operate across a restricted network that only permits traffic on privileged ports to cross its border. Suddenly idle connections can be purged in well under that previously safe two-minute window. Instant partition.
Whether Linux implements the conditional to distinguish between privileged and unprivileged ports is irrelevant. Any serious systems designer must consider the timeout interval to be 30 seconds.
It isn't about privileged ports. It's about the protocols those ports are assigned to, and it's really only referring to port 53 for DNS because it's the only protocol that requires it. Using a shorter mapping timeout for port 22 or port 80 is not permitted.
The reason DNS is special is that DNS has to use a random source port for every query to reduce vulnerability to the Kaminsky attack. Most other UDP applications use a single socket (and thus a single port, and thus a single NAT mapping) for all peers. So if you have many DNS clients behind a NAT, you're going to run out of source ports if you try to keep mappings for 120 seconds. Meanwhile the DNS protocol only requires the mapping to persist until the query response arrives, which happens almost immediately.
So DNS gets a special rule. Which isn't a 30 second timeout. There are real NATs in production that use 5 second mapping timeouts on port 53, which is completely reasonable and necessary if your site is making 10,000 DNS queries a second.
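To put numbers on that (back-of-the-envelope, using the hypothetical 10,000 queries/second figure above):

    /* Steady-state NAT mappings = query rate * mapping timeout, since
     * each randomized-source-port query pins one mapping until expiry.
     * The 10,000 qps figure is the hypothetical one from this thread. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned qps = 10000;    /* DNS queries per second       */
        const unsigned ports = 65536;  /* source ports per external IP */

        printf("120 s timeout: %u mappings\n", qps * 120); /* 1,200,000 */
        printf("  5 s timeout: %u mappings\n", qps * 5);   /*    50,000 */
        printf("port budget:   %u per external IP\n", ports);
        return 0;
    }

At 120 seconds you'd need roughly eighteen external IPs' worth of ports just for DNS; at 5 seconds it fits comfortably behind one.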
The sensible way to deal with all of this is to assume the 120 second timeout initially and if you encounter a hard disconnect that allows an immediate reconnection then reduce the keepalive interval. And don't run non-DNS protocols on port 53 unless you're prepared for the consequences.
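In outline, something like this (names are hypothetical; "hard disconnect followed by an instant successful reconnect" stands in for whatever application-level signal tells you the NAT dropped the mapping early):

    enum { RFC4787_MIN = 120,  /* REQ-5 minimum mapping timeout     */
           MARGIN      = 10,   /* refresh a bit before the deadline */
           FLOOR       = 5 };  /* never probe faster than this      */

    static unsigned keepalive_secs = RFC4787_MIN - MARGIN;

    /* Call when a peer went dark but reconnected immediately: the NAT
     * is timing mappings out faster than REQ-5 allows, so halve the
     * keepalive interval until the early disconnects stop. */
    void on_early_mapping_loss(void)
    {
        if (keepalive_secs / 2 >= FLOOR)
            keepalive_secs /= 2;
    }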
Any NAT deployment that large will include local DNS caching servers. Out-of-the-box consumer NAT configurations (WRT-54G, etc.) typically run a DNS cache as well. Source port randomization in response to Kaminsky's paper in 2008 is definitely the sort of thing that would fill NAT tables faster. But RFC4787 was published in 2007.
The behavior chosen by the implementers of the Linux netfilter code (see simmons' post) elegantly covers most cases without getting as specific as REQ-5 in the RFC, let alone as specific as what you're describing. Network operators just manually configure a faster timeout for port 53. That's what the RFC permits: tuning in response to real conditions.
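On a Linux NAT box that tuning can look something like this (an illustrative nftables ct-timeout sketch; the table and object names and the 5 second values are examples, and I may be fuzzy on the exact syntax):

    # Illustrative only: give UDP/53 conntrack entries a 5 second timeout.
    table ip dns_tuning {
        ct timeout dns_udp {
            protocol udp
            l3proto ip
            policy = { unreplied: 5, replied: 5 }
        }
        chain pre {
            type filter hook prerouting priority raw;
            udp dport 53 ct timeout set "dns_udp"
        }
    }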
A hard disconnect is a major event that usually goes all the way back up the stack. A systems designer wants to avoid such a thing, and the sensible thing is to look at the source (of the most commonly deployed implementations).
> Any NAT deployment that large will include local DNS caching servers
A DNS cache only reduces the number of queries by answering some from the cache; it still sends upstream queries and requires source ports in proportion to the number of client queries being made.
> Source port randomization in response to Kaminsky's paper in 2008 is definitely the sort of thing that would fill NAT tables faster. But RFC4787 was published in 2007.
The weakness of the 16-bit DNS query ID has been known since long before Kaminsky's actual paper. Sensible DNS caches like Daniel Bernstein's have been using source port randomization since around the turn of the century. Kaminsky just lit a fire under the ones that weren't by publishing a devastating proof of concept, so now they all [had better] do it.
> The behavior chosen by the implementers of the Linux netfilter code (see simmons' post) elegantly covers most cases without getting as specific as REQ-5 in the RFC, let alone as specific as what you're describing.
What Linux is actually doing seems reasonable. There is no requirement that the DNS timeout has to be short, only that it can be. For smaller networks it shouldn't make any real difference.
> Network operators just manually configure a faster timeout for port 53. That's what the RFC permits: tuning in response to real conditions.
Your point was that you have to deal with systems that exist in the wild. If some significant percentage of large sites are using very short mapping timeouts for port 53 and you want to handle that, it doesn't much matter whether it's the result of manufacturer defaults or manual configuration.
> A hard disconnect is a major event that usually goes all the way back up the stack. A systems designer wants to avoid such a thing, and the sensible thing is to look at the source (of the most commonly deployed implementations).
Sometimes all you have is a dog's breakfast. If it's a common scenario that DNS is the only open UDP port and the mapping timeout for the DNS port can be as little as 5 seconds then you can either always assume that you need a 5 second refresh interval or you can default to a longer one and then have to detect when you're wrong. Do you see any better alternative?
This has zero to do with the ports being "privileged" and everything to do with their status as part of a "well-known port range" used by "IANA-registered application(s)".
The idea being that you can adjust the timeout because you can infer how the port is being used.