Patchwork tcp: avoid a possible divide by zero

Submitter Eric Dumazet
Date Dec. 7, 2010, 10:03 p.m.
Message ID <1291759435.5324.25.camel@edumazet-laptop>
Permalink /patch/74605/
State Accepted
Delegated to: David Miller

Comments

Eric Dumazet - Dec. 7, 2010, 10:03 p.m.
On Tuesday, 7 December 2010 at 21:32 +0000, Ben Hutchings wrote:
> On Tue, 2010-12-07 at 22:28 +0100, Eric Dumazet wrote:
> [...]
> > Thanks
> > 
> > Great, I feel we are going to fix all sysctls, one by one then :(
> > 
> > lkml removed from Cc
> > 
> > 
> > [PATCH] tcp: avoid a possible divide by zero
> > 
> > sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
> > tcp_tso_should_defer(). Make sure we dont allow a divide by zero by
> > reading sysctl_tcp_tso_win_divisor once.
> > 
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > ---
> >  net/ipv4/tcp_output.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index 05b1ecf..0281223 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -1513,6 +1513,7 @@ static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
> >  	struct tcp_sock *tp = tcp_sk(sk);
> >  	const struct inet_connection_sock *icsk = inet_csk(sk);
> >  	u32 send_win, cong_win, limit, in_flight;
> > +	int win_divisor;
> >  
> >  	if (TCP_SKB_CB(skb)->flags & TCPHDR_FIN)
> >  		goto send_now;
> > @@ -1544,13 +1545,14 @@ static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
> >  	if ((skb != tcp_write_queue_tail(sk)) && (limit >= skb->len))
> >  		goto send_now;
> >  
> > -	if (sysctl_tcp_tso_win_divisor) {
> > +	win_divisor = sysctl_tcp_tso_win_divisor;
> 
> You need to use ACCESS_ONCE(sysctl_tcp_tso_win_divisor).  Otherwise the
> compiler may eliminate the local variable and read the global twice.

Yes, I knew that, of course :)

I wonder how many bugs like that we have in sysctls

Thanks

[PATCH v2] tcp: avoid a possible divide by zero

sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
tcp_tso_should_defer(). Make sure we dont allow a divide by zero by
reading sysctl_tcp_tso_win_divisor exactly once.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
v2: Use ACCESS_ONCE() as Ben suggested

 net/ipv4/tcp_output.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin Steigerwald - Dec. 8, 2010, 8:23 a.m.
On Tuesday, 7 December 2010, Eric Dumazet wrote:
> On Tuesday, 7 December 2010 at 21:32 +0000, Ben Hutchings wrote:
> > On Tue, 2010-12-07 at 22:28 +0100, Eric Dumazet wrote:
> > [...]
> > 
> > > Thanks
> > > 
> > > Great, I feel we are going to fix all sysctls, one by one then :(

Are there so many sysctls that are likely to freeze the kernel when fed
a wrong value? One could argue that for sysctls where invalid values don't
cause any serious harm, it's not so important to fix them. I could probably
have next week's training members have a go at poking creative values into
other controls as well to see what happens.

> > > lkml removed from Cc
> > > 
> > > 
> > > [PATCH] tcp: avoid a possible divide by zero
> > > 
> > > sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs
> > > in tcp_tso_should_defer(). Make sure we dont allow a divide by
> > > zero by reading sysctl_tcp_tso_win_divisor once.
> > > 
> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > > ---
[...]
> > > +	win_divisor = sysctl_tcp_tso_win_divisor;
> > 
> > You need to use ACCESS_ONCE(sysctl_tcp_tso_win_divisor).  Otherwise
> > the compiler may eliminate the local variable and read the global
> > twice.
> 
> Yes, I knew that, of course :)
> 
> I wonder how many bugs like that we have in sysctls
> 
> Thanks
> 
> [PATCH v2] tcp: avoid a possible divide by zero
> 
> sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
> tcp_tso_should_defer(). Make sure we dont allow a divide by zero by
> reading sysctl_tcp_tso_win_divisor exactly once.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> v2: Use ACCESS_ONCE() as Ben suggested
> 
>  net/ipv4/tcp_output.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 05b1ecf..0464d70 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1513,6 +1513,7 @@ static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	const struct inet_connection_sock *icsk = inet_csk(sk);
>  	u32 send_win, cong_win, limit, in_flight;
> +	int win_divisor;
[...]
> -	if (sysctl_tcp_tso_win_divisor) {
> +	win_divisor = ACCESS_ONCE(sysctl_tcp_tso_win_divisor);
> +	if (win_divisor) {
>  		u32 chunk = min(tp->snd_wnd, tp->snd_cwnd * tp->mss_cache);
> 
>  		/* If at least some fraction of a window is available,
>  		 * just use it.
>  		 */
> -		chunk /= sysctl_tcp_tso_win_divisor;
> +		chunk /= win_divisor;
>  		if (limit >= chunk)
>  			goto send_now;
>  	} else {

So does this patch help other cases as well? Or is it, as I think, just a
different approach to the issue my training member brought up, fixing it at
its cause instead of, or in addition to, limiting its range?

I want to check whether I have basically understood the patch. Do you want
me to test it?

Thanks,
Eric Dumazet - Dec. 8, 2010, 8:33 a.m.
On Wednesday, 8 December 2010 at 09:23 +0100, Martin Steigerwald wrote:

> Are there so many sysctls that are likely to freeze the kernel when fed
> a wrong value? One could argue that for sysctls where invalid values don't
> cause any serious harm, it's not so important to fix them. I could probably
> have next week's training members have a go at poking creative values into
> other controls as well to see what happens.
> 

We have many sysctls that can lead to a non-working machine.

Any kind of limit, actually. Just set them to 0 (or maybe a negative
number :( )

0 sockets, 0 file descriptors, 0 memory, 0 speed limit, 0 lengths ...
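
To make the "limits set to 0" concern concrete, here is a minimal userspace sketch of range-checking a tunable before storing it. It is not kernel code: `struct bounded_tunable` and `bounded_tunable_set()` are hypothetical names invented for illustration. In the kernel, integer sysctls are usually bounded through `proc_dointvec_minmax` with `extra1`/`extra2` limits in the `ctl_table` entry; the sketch does not use that API.

```c
#include <errno.h>

/* Hypothetical tunable with an allowed range (illustrative only). */
struct bounded_tunable {
	int value;	/* current setting */
	int min;	/* smallest value that keeps the system usable */
	int max;	/* largest accepted value */
};

/*
 * Store new_value only if it lies in [min, max].
 * On rejection the current value is left untouched and -EINVAL is returned.
 */
int bounded_tunable_set(struct bounded_tunable *t, int new_value)
{
	if (new_value < t->min || new_value > t->max)
		return -EINVAL;

	t->value = new_value;
	return 0;
}
```

With `min` set to 1, writes of 0 or of a negative number are refused and the previous setting stays in effect. A check like this only helps for tunables where 0 is not a legitimate value.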



> So does this patch help other cases as well? Or is it, as I think, just a
> different approach to the issue my training member brought up, fixing it at
> its cause instead of, or in addition to, limiting its range?
> 
> I want to check whether I have basically understood the patch. Do you want
> me to test it?

It has nothing to do with the issue you raised; it is a completely
different subject. I found it while spending 5 minutes last night
grepping some sysctls in the network tree. 0 is one of the expected
values for this sysctl, but the test was not safe.



David Miller - Dec. 8, 2010, 8:35 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 07 Dec 2010 23:03:55 +0100

> [PATCH v2] tcp: avoid a possible divide by zero
> 
> sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
> tcp_tso_should_defer(). Make sure we dont allow a divide by zero by
> reading sysctl_tcp_tso_win_divisor exactly once.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 05b1ecf..0464d70 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1513,6 +1513,7 @@  static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
 	struct tcp_sock *tp = tcp_sk(sk);
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	u32 send_win, cong_win, limit, in_flight;
+	int win_divisor;
 
 	if (TCP_SKB_CB(skb)->flags & TCPHDR_FIN)
 		goto send_now;
@@ -1544,13 +1545,14 @@  static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
 	if ((skb != tcp_write_queue_tail(sk)) && (limit >= skb->len))
 		goto send_now;
 
-	if (sysctl_tcp_tso_win_divisor) {
+	win_divisor = ACCESS_ONCE(sysctl_tcp_tso_win_divisor);
+	if (win_divisor) {
 		u32 chunk = min(tp->snd_wnd, tp->snd_cwnd * tp->mss_cache);
 
 		/* If at least some fraction of a window is available,
 		 * just use it.
 		 */
-		chunk /= sysctl_tcp_tso_win_divisor;
+		chunk /= win_divisor;
 		if (limit >= chunk)
 			goto send_now;
 	} else {