diff mbox

[RFC,1/2] udp: add non-linear uniform port allocation scheme option /proc/sys/net/ipv4/udp_port_randomization

Message ID 4B2933EF.9060606@ixiacom.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Lucian Adrian Grijincu Dec. 16, 2009, 7:24 p.m. UTC
When we allocate ports with a (really) high frequency, randomization
does more harm as some values tend to repeat with a higher frequency
than they would if allocated uniformly, while others are selected more
rarely.

This patch does not allocate ports linearly as older kernels used to do,
but it allocates the port with an uniform frequency.

For example: assuming UDP_HTABLE_SIZE=8, hint=3, low=0, high=32
This leads to:
> first=3, last=3+8=11, rand=(1 | 1) * UDP_HTABLE_SIZE=8

The port selection code is similar to:
> for first in [3..11):
>     snum = first
>     do if (!good(snum)) snum+=8 while(snum!=first)

Will give the following sequence for snum (skipping `modulo 32` for brevity)
   3,  3+8,  3+8+8,  3+8+8+8,
   4,  4+8,  4+8+8,  4+8+8+8,
  ...
   9,  9+8,  9+8+8,  9+8+8+8,
  10, 10+8, 10+8+8, 10+8+8+8,

This will generate all numbers in the [low..high) interval with the
same frequency. This leads to better performance when most ports are
already allocated.

Randomization is still enabled by default for normal setups that will
most likely not encounter such situations.

Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
---
  include/net/udp.h          |    1 +
  net/ipv4/sysctl_net_ipv4.c |    7 +++++++
  net/ipv4/udp.c             |   18 +++++++++++++++---
  3 files changed, 23 insertions(+), 3 deletions(-)

Comments

Eric Dumazet Dec. 17, 2009, 2:07 p.m. UTC | #1
Le 16/12/2009 20:24, Lucian Adrian Grijincu a écrit :
> 
> When we allocate ports with a (really) high frequency, randomization
> does more harm as some values tend to repeat with a higher frequency
> than they would if allocated uniformly, while others are selected more
> rarely.
> 
> This patch does not allocate ports linearly as older kernels used to do,
> but it allocates the port with an uniform frequency.
> 
> For example: assuming UDP_HTABLE_SIZE=8, hint=3, low=0, high=32
> This leads to:
>> first=3, last=3+8=11, rand=(1 | 1) * UDP_HTABLE_SIZE=8
> 
> The port selection code is similar to:
>> for first in [3..11):
>>     snum = first
>>     do if (!good(snum)) snum+=8 while(snum!=first)
> 
> Will give the following sequence for snum (skipping `modulo 32` for
> brevity)
>   3,  3+8,  3+8+8,  3+8+8+8,
>   4,  4+8,  4+8+8,  4+8+8+8,
>  ...
>   9,  9+8,  9+8+8,  9+8+8+8,
>  10, 10+8, 10+8+8, 10+8+8+8,
> 
> This will generate all numbers in the [low..high) interval with the
> same frequency. This leads to better performance when most ports are
> already allocated.
> 
> Randomization is still enabled by default for normal setups that will
> most likely not encounter such situations.
> 

I dont like this patch, I am not convinced at all by your example.

Your example is flawed, since UDP_HTABLE_SIZE >= 128 (128 is the minimum hash size)

I cannot understand why chosing a "one" increment instead of a "random odd" increment
can give a more uniform allocation frequency.

If port randomization is not good enough, please make it better, not reverting to
plain sequential old behavior... (your application can do its own port allocation by the way)

BTW, net-next-2.6 is not yet open, this is not the right time to submit non bug fixes patches.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Dec. 17, 2009, 3:16 p.m. UTC | #2
On Thursday 17 December 2009 16:07:55 you wrote:
> Le 16/12/2009 20:24, Lucian Adrian Grijincu a écrit :
> > When we allocate ports with a (really) high frequency, randomization
> > does more harm as some values tend to repeat with a higher frequency
> > than they would if allocated uniformly, while others are selected more
> > rarely.
> >
> > This patch does not allocate ports linearly as older kernels used to do,
> > but it allocates the port with an uniform frequency.
> >
> > For example: assuming UDP_HTABLE_SIZE=8, hint=3, low=0, high=32
> >
> > This leads to:
> >> first=3, last=3+8=11, rand=(1 | 1) * UDP_HTABLE_SIZE=8
> >
> > The port selection code is similar to:
> >> for first in [3..11):
> >>     snum = first
> >>     do if (!good(snum)) snum+=8 while(snum!=first)
> >
> > Will give the following sequence for snum (skipping `modulo 32` for
> > brevity)
> >   3,  3+8,  3+8+8,  3+8+8+8,
> >   4,  4+8,  4+8+8,  4+8+8+8,
> >  ...
> >   9,  9+8,  9+8+8,  9+8+8+8,
> >  10, 10+8, 10+8+8, 10+8+8+8,
> >
> > This will generate all numbers in the [low..high) interval with the
> > same frequency. This leads to better performance when most ports are
> > already allocated.
> >
> > Randomization is still enabled by default for normal setups that will
> > most likely not encounter such situations.
> 
> I dont like this patch, I am not convinced at all by your example.
> 
> Your example is flawed, since UDP_HTABLE_SIZE >= 128 (128 is the minimum
>  hash size)
> 
> I cannot understand why chosing a "one" increment instead of a "random odd"
>  increment can give a more uniform allocation frequency.
> 
> If port randomization is not good enough, please make it better, not
>  reverting to plain sequential old behavior... (your application can do its
>  own port allocation by the way)
> 

Thanks for reviewing Eric. In this thread 

http://kerneltrap.org/mailarchive/linux-netdev/2009/5/8/5667204 (ports being 
reused too fast)

Stephen observed that port randomization effects on same port allocation 
frequency are explained by the birthday paradox. 

The RFC suggesting port randomization recognizes this issue and suggest a way 
to overcome it, but on a first glance it looks expensive. 

Adding a sysctl to sequencial port allocation might not be the best option, 
but we thought of kicking the discussion about this issue with this patch.

> BTW, net-next-2.6 is not yet open, this is not the right time to submit non
>  bug fixes patches.

Yes, we know that, but we are still learning the details. For instance, should 
we refrain from sending RFC patches (as in patches we are not sure that are 
right and want to get early feedback on) as well during the merge window?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 17, 2009, 4 p.m. UTC | #3
Le 17/12/2009 16:16, Octavian Purdila a écrit :

> Thanks for reviewing Eric. In this thread 
> 
> http://kerneltrap.org/mailarchive/linux-netdev/2009/5/8/5667204 (ports being 
> reused too fast)
> 
> Stephen observed that port randomization effects on same port allocation 
> frequency are explained by the birthday paradox. 

But this was with TCP, not UDP. Without NAT, UDP has no timewait concept.

> 
> The RFC suggesting port randomization recognizes this issue and suggest a way 
> to overcome it, but on a first glance it looks expensive. 
> 
> Adding a sysctl to sequencial port allocation might not be the best option, 
> but we thought of kicking the discussion about this issue with this patch.

Before sending patches, you might first describe the issue ?

> 
>> BTW, net-next-2.6 is not yet open, this is not the right time to submit non
>>  bug fixes patches.
> 
> Yes, we know that, but we are still learning the details. For instance, should 
> we refrain from sending RFC patches (as in patches we are not sure that are 
> right and want to get early feedback on) as well during the merge window?

You always *can* send RFC/patches, but we are pretty busy to correct bugs,
and take some free time to rest :=)

Thanks
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Dec. 18, 2009, 1:12 p.m. UTC | #4
On Thursday 17 December 2009 18:00:12 you wrote:
> Le 17/12/2009 16:16, Octavian Purdila a écrit :
> > Thanks for reviewing Eric. In this thread
> >
> > http://kerneltrap.org/mailarchive/linux-netdev/2009/5/8/5667204 (ports
> > being reused too fast)
> >
> > Stephen observed that port randomization effects on same port allocation
> > frequency are explained by the birthday paradox.
> 
> But this was with TCP, not UDP. Without NAT, UDP has no timewait concept.
> 
> > The RFC suggesting port randomization recognizes this issue and suggest a
> > way to overcome it, but on a first glance it looks expensive.
> >
> > Adding a sysctl to sequencial port allocation might not be the best
> > option, but we thought of kicking the discussion about this issue with
> > this patch.
> 
> Before sending patches, you might first describe the issue ?
> 

We really didn't think this through, sorry for the noise. We don't actually 
have encountered an issue with UDP, we just extrapolated from the TCP issue we 
have seen. So, will come back on this one if we really ran into issues.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/udp.h b/include/net/udp.h
index 5348d80..925535e 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -111,6 +111,7 @@  extern atomic_t udp_memory_allocated;
 extern int sysctl_udp_mem[3];
 extern int sysctl_udp_rmem_min;
 extern int sysctl_udp_wmem_min;
+extern int sysctl_udp_port_randomization;
 
 struct sk_buff;
 
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 7e3712c..ef811c3 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -599,6 +599,13 @@  static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &zero
 	},
+	{
+		.procname	= "udp_port_randomization",
+		.data		= &sysctl_udp_port_randomization,
+		.maxlen		= sizeof(sysctl_udp_port_randomization),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 1f95348..f437d9d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -182,6 +182,9 @@  static int udp_lib_lport_inuse2(struct net *net, __u16 num,
 	return res;
 }
 
+int sysctl_udp_port_randomization = 1;
+EXPORT_SYMBOL(sysctl_udp_port_randomization);
+
 /**
  *  udp_lib_get_port  -  UDP/-Lite port lookup for IPv4 and IPv6
  *
@@ -202,6 +205,7 @@  int udp_lib_get_port(struct sock *sk, unsigned short snum,
 	struct net *net = sock_net(sk);
 
 	if (!snum) {
+		static int hint;
 		int low, high, remaining;
 		unsigned rand;
 		unsigned short first, last;
@@ -210,8 +214,13 @@  int udp_lib_get_port(struct sock *sk, unsigned short snum,
 		inet_get_local_port_range(&low, &high);
 		remaining = (high - low) + 1;
 
-		rand = net_random();
-		first = (((u64)rand * remaining) >> 32) + low;
+		if (likely(sysctl_udp_port_randomization)) {
+			rand = net_random();
+			first = (((u64)rand * remaining) >> 32) + low;
+		} else {
+			rand = 1;
+			first = hint;
+		}
 		/*
 		 * force rand to be an odd multiple of UDP_HTABLE_SIZE
 		 */
@@ -233,8 +242,11 @@  int udp_lib_get_port(struct sock *sk, unsigned short snum,
 			 */
 			do {
 				if (low <= snum && snum <= high &&
-				    !test_bit(snum >> udptable->log, bitmap))
+				    !test_bit(snum >> udptable->log, bitmap)) {
+					if (unlikely(!sysctl_udp_port_randomization))
+						hint = snum;
 					goto found;
+				}
 				snum += rand;
 			} while (snum != first);
 			spin_unlock_bh(&hslot->lock);