diff mbox

v3 for tcp friends?

Message ID 1358923606.12374.746.camel@edumazet-glaptop
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Jan. 23, 2013, 6:46 a.m. UTC
On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
> Oops, this hang is not since TCP friends patch!
> 
> sk_sndbuf_get() is broken by 32 bits integer overflow
> because of so large value in net.ipv4.tcp_{rmem,wmem}.
> 
> but this hang also can be found in net-next.git
> (3.8.0-rc3+), if we run below commands, then all new
> TCP connections stop working!
> 
> # when TCP friends is disabled
> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"

Right we need to make sure we dont overflow.

Try the following fix :



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Li Yu Jan. 23, 2013, 7:21 a.m. UTC | #1
于 2013年01月23日 14:46, Eric Dumazet 写道:
> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>> Oops, this hang is not since TCP friends patch!
>>
>> sk_sndbuf_get() is broken by 32 bits integer overflow
>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>
>> but this hang also can be found in net-next.git
>> (3.8.0-rc3+), if we run below commands, then all new
>> TCP connections stop working!
>>
>> # when TCP friends is disabled
>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>
> Right we need to make sure we dont overflow.
>
> Try the following fix :
>
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index a25e1d2..1459145 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>   		.data		= &sysctl_tcp_wmem,
>   		.maxlen		= sizeof(sysctl_tcp_wmem),
>   		.mode		= 0644,
> -		.proc_handler	= proc_dointvec
> +		.extra1		= &zero,
> +		.proc_handler	= proc_dointvec_minmax
>   	},
>   	{
>   		.procname	= "tcp_rmem",
>   		.data		= &sysctl_tcp_rmem,
>   		.maxlen		= sizeof(sysctl_tcp_rmem),
>   		.mode		= 0644,
> -		.proc_handler	= proc_dointvec
> +		.extra1		= &zero,
> +		.proc_handler	= proc_dointvec_minmax
>   	},
>   	{
>   		.procname	= "tcp_app_win",
>
>
>
Thanks for so quick reply, I will test it soon.

however I suspect whether this patch could fix overflow if we merged TCP 
friends patch in future.

With TCP friends, we have source like below, or TCP friends should care 
this ?

static inline int sk_sndbuf_get(const struct sock *sk)
{
         if (sk->sk_friend)
                 return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf;
         else
                 return sk->sk_sndbuf;
}

static inline bool sk_stream_memory_free(const struct sock *sk)
{
         return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk);
}

So sk_sndbuf_get() still may be overflow when we have right value in 
net.ipv4.tcp_{rmem,wmem}.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Li Yu Jan. 23, 2013, 7:58 a.m. UTC | #2
于 2013年01月23日 15:21, Li Yu 写道:
> 于 2013年01月23日 14:46, Eric Dumazet 写道:
>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>>> Oops, this hang is not since TCP friends patch!
>>>
>>> sk_sndbuf_get() is broken by 32 bits integer overflow
>>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>>
>>> but this hang also can be found in net-next.git
>>> (3.8.0-rc3+), if we run below commands, then all new
>>> TCP connections stop working!
>>>
>>> # when TCP friends is disabled
>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>>
>> Right we need to make sure we dont overflow.
>>
>> Try the following fix :
>>
>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>> index a25e1d2..1459145 100644
>> --- a/net/ipv4/sysctl_net_ipv4.c
>> +++ b/net/ipv4/sysctl_net_ipv4.c
>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>>           .data        = &sysctl_tcp_wmem,
>>           .maxlen        = sizeof(sysctl_tcp_wmem),
>>           .mode        = 0644,
>> -        .proc_handler    = proc_dointvec
>> +        .extra1        = &zero,
>> +        .proc_handler    = proc_dointvec_minmax
>>       },
>>       {
>>           .procname    = "tcp_rmem",
>>           .data        = &sysctl_tcp_rmem,
>>           .maxlen        = sizeof(sysctl_tcp_rmem),
>>           .mode        = 0644,
>> -        .proc_handler    = proc_dointvec
>> +        .extra1        = &zero,
>> +        .proc_handler    = proc_dointvec_minmax
>>       },
>>       {
>>           .procname    = "tcp_app_win",
>>
>>
>>
> Thanks for so quick reply, I will test it soon.
>
> however I suspect whether this patch could fix overflow if we merged TCP
> friends patch in future.
>

Tested on 3.7.0-rc1+, but bug is still alive
with disabled TCP friends ...

Thanks

Yu

> With TCP friends, we have source like below, or TCP friends should care
> this ?
>
> static inline int sk_sndbuf_get(const struct sock *sk)
> {
>          if (sk->sk_friend)
>                  return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf;
>          else
>                  return sk->sk_sndbuf;
> }
>
> static inline bool sk_stream_memory_free(const struct sock *sk)
> {
>          return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk);
> }
>
> So sk_sndbuf_get() still may be overflow when we have right value in
> net.ipv4.tcp_{rmem,wmem}.
>
> Thanks

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Li Yu Jan. 23, 2013, 9:39 a.m. UTC | #3
于 2013年01月23日 15:58, Li Yu 写道:
> 于 2013年01月23日 15:21, Li Yu 写道:
>> 于 2013年01月23日 14:46, Eric Dumazet 写道:
>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>>>> Oops, this hang is not since TCP friends patch!
>>>>
>>>> sk_sndbuf_get() is broken by 32 bits integer overflow
>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>>>
>>>> but this hang also can be found in net-next.git
>>>> (3.8.0-rc3+), if we run below commands, then all new
>>>> TCP connections stop working!
>>>>
>>>> # when TCP friends is disabled
>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>>>
>>> Right we need to make sure we dont overflow.
>>>
>>> Try the following fix :
>>>
>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>>> index a25e1d2..1459145 100644
>>> --- a/net/ipv4/sysctl_net_ipv4.c
>>> +++ b/net/ipv4/sysctl_net_ipv4.c
>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>>>           .data        = &sysctl_tcp_wmem,
>>>           .maxlen        = sizeof(sysctl_tcp_wmem),
>>>           .mode        = 0644,
>>> -        .proc_handler    = proc_dointvec
>>> +        .extra1        = &zero,

If we added below:

+static int one = 1;
+static int int_max = INT_MAX;
....
+               .extra1     = &one,
+		.extra2	    = &int_max,

The bug is fixed without TCP friends.

Maybe, TCP friends need to use long integer to record result
of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ?

BTW: This overflow problem also breaks UDP sockets.

Thanks

Yu

>>> +        .proc_handler    = proc_dointvec_minmax
>>>       },
>>>       {
>>>           .procname    = "tcp_rmem",
>>>           .data        = &sysctl_tcp_rmem,
>>>           .maxlen        = sizeof(sysctl_tcp_rmem),
>>>           .mode        = 0644,
>>> -        .proc_handler    = proc_dointvec
>>> +        .extra1        = &zero,
>>> +        .proc_handler    = proc_dointvec_minmax
>>>       },
>>>       {
>>>           .procname    = "tcp_app_win",
>>>
>>>
>>>
>> Thanks for so quick reply, I will test it soon.
>>
>> however I suspect whether this patch could fix overflow if we merged TCP
>> friends patch in future.
>>
>
> Tested on 3.7.0-rc1+, but bug is still alive
> with disabled TCP friends ...
>
> Thanks
>
> Yu
>
>> With TCP friends, we have source like below, or TCP friends should care
>> this ?
>>
>> static inline int sk_sndbuf_get(const struct sock *sk)
>> {
>>          if (sk->sk_friend)
>>                  return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf;
>>          else
>>                  return sk->sk_sndbuf;
>> }
>>
>> static inline bool sk_stream_memory_free(const struct sock *sk)
>> {
>>          return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk);
>> }
>>
>> So sk_sndbuf_get() still may be overflow when we have right value in
>> net.ipv4.tcp_{rmem,wmem}.
>>
>> Thanks
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Li Yu Jan. 23, 2013, 9:52 a.m. UTC | #4
于 2013年01月23日 17:39, Li Yu 写道:
> 于 2013年01月23日 15:58, Li Yu 写道:
>> 于 2013年01月23日 15:21, Li Yu 写道:
>>> 于 2013年01月23日 14:46, Eric Dumazet 写道:
>>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>>>>> Oops, this hang is not since TCP friends patch!
>>>>>
>>>>> sk_sndbuf_get() is broken by 32 bits integer overflow
>>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>>>>
>>>>> but this hang also can be found in net-next.git
>>>>> (3.8.0-rc3+), if we run below commands, then all new
>>>>> TCP connections stop working!
>>>>>
>>>>> # when TCP friends is disabled
>>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>>>>
>>>> Right we need to make sure we dont overflow.
>>>>
>>>> Try the following fix :
>>>>
>>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>>>> index a25e1d2..1459145 100644
>>>> --- a/net/ipv4/sysctl_net_ipv4.c
>>>> +++ b/net/ipv4/sysctl_net_ipv4.c
>>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>>>>           .data        = &sysctl_tcp_wmem,
>>>>           .maxlen        = sizeof(sysctl_tcp_wmem),
>>>>           .mode        = 0644,
>>>> -        .proc_handler    = proc_dointvec
>>>> +        .extra1        = &zero,
>
> If we added below:
>
> +static int one = 1;
> +static int int_max = INT_MAX;
> ....
> +               .extra1     = &one,
> +        .extra2        = &int_max,
>

The "int_max" may be unnecessary here :)

> The bug is fixed without TCP friends.
>
> Maybe, TCP friends need to use long integer to record result
> of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ?
>
> BTW: This overflow problem also breaks UDP sockets.

Unix domain sockets is broken too, and any other ?

thanks

Yu

>
> Thanks
>
> Yu
>
>>>> +        .proc_handler    = proc_dointvec_minmax
>>>>       },
>>>>       {
>>>>           .procname    = "tcp_rmem",
>>>>           .data        = &sysctl_tcp_rmem,
>>>>           .maxlen        = sizeof(sysctl_tcp_rmem),
>>>>           .mode        = 0644,
>>>> -        .proc_handler    = proc_dointvec
>>>> +        .extra1        = &zero,
>>>> +        .proc_handler    = proc_dointvec_minmax
>>>>       },
>>>>       {
>>>>           .procname    = "tcp_app_win",
>>>>
>>>>
>>>>
>>> Thanks for so quick reply, I will test it soon.
>>>
>>> however I suspect whether this patch could fix overflow if we merged TCP
>>> friends patch in future.
>>>
>>
>> Tested on 3.7.0-rc1+, but bug is still alive
>> with disabled TCP friends ...
>>
>> Thanks
>>
>> Yu
>>
>>> With TCP friends, we have source like below, or TCP friends should care
>>> this ?
>>>
>>> static inline int sk_sndbuf_get(const struct sock *sk)
>>> {
>>>          if (sk->sk_friend)
>>>                  return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf;
>>>          else
>>>                  return sk->sk_sndbuf;
>>> }
>>>
>>> static inline bool sk_stream_memory_free(const struct sock *sk)
>>> {
>>>          return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk);
>>> }
>>>
>>> So sk_sndbuf_get() still may be overflow when we have right value in
>>> net.ipv4.tcp_{rmem,wmem}.
>>>
>>> Thanks
>>
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Pan(潘卫平) Jan. 23, 2013, 1:16 p.m. UTC | #5
On 01/23/2013 05:52 PM, Li Yu wrote:
> 于 2013年01月23日 17:39, Li Yu 写道:
>> 于 2013年01月23日 15:58, Li Yu 写道:
>>> 于 2013年01月23日 15:21, Li Yu 写道:
>>>> 于 2013年01月23日 14:46, Eric Dumazet 写道:
>>>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>>>>>> Oops, this hang is not since TCP friends patch!
>>>>>>
>>>>>> sk_sndbuf_get() is broken by 32 bits integer overflow
>>>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>>>>>
>>>>>> but this hang also can be found in net-next.git
>>>>>> (3.8.0-rc3+), if we run below commands, then all new
>>>>>> TCP connections stop working!
>>>>>>
>>>>>> # when TCP friends is disabled
>>>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>>>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>>>>>
>>>>> Right we need to make sure we dont overflow.
>>>>>
>>>>> Try the following fix :
>>>>>
>>>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>>>>> index a25e1d2..1459145 100644
>>>>> --- a/net/ipv4/sysctl_net_ipv4.c
>>>>> +++ b/net/ipv4/sysctl_net_ipv4.c
>>>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>>>>>           .data        = &sysctl_tcp_wmem,
>>>>>           .maxlen        = sizeof(sysctl_tcp_wmem),
>>>>>           .mode        = 0644,
>>>>> -        .proc_handler    = proc_dointvec
>>>>> +        .extra1        = &zero,
>>
>> If we added below:
>>
>> +static int one = 1;
>> +static int int_max = INT_MAX;
>> ....
>> +               .extra1     = &one,
>> +        .extra2        = &int_max,
>>
>
> The "int_max" may be unnecessary here :)
Hi, Li Yu,

I tested that your patch works fine.
Can you post a complete patch ?

thanks
Weiping Pan

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaotian Feng Jan. 24, 2013, 2:04 a.m. UTC | #6
On Wed, Jan 23, 2013 at 5:39 PM, Li Yu <raise.sail@gmail.com> wrote:
> 于 2013年01月23日 15:58, Li Yu 写道:
>
>> 于 2013年01月23日 15:21, Li Yu 写道:
>>>
>>> 于 2013年01月23日 14:46, Eric Dumazet 写道:
>>>>
>>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote:
>>>>>
>>>>> Oops, this hang is not since TCP friends patch!
>>>>>
>>>>> sk_sndbuf_get() is broken by 32 bits integer overflow
>>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}.
>>>>>
>>>>> but this hang also can be found in net-next.git
>>>>> (3.8.0-rc3+), if we run below commands, then all new
>>>>> TCP connections stop working!
>>>>>
>>>>> # when TCP friends is disabled
>>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
>>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
>>>>
>>>>
>>>> Right we need to make sure we dont overflow.
>>>>
>>>> Try the following fix :
>>>>
>>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
>>>> index a25e1d2..1459145 100644
>>>> --- a/net/ipv4/sysctl_net_ipv4.c
>>>> +++ b/net/ipv4/sysctl_net_ipv4.c
>>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = {
>>>>           .data        = &sysctl_tcp_wmem,
>>>>           .maxlen        = sizeof(sysctl_tcp_wmem),
>>>>           .mode        = 0644,
>>>> -        .proc_handler    = proc_dointvec
>>>> +        .extra1        = &zero,
>
>
> If we added below:
>
> +static int one = 1;

one is defined in kernel/sysctl.c

> +static int int_max = INT_MAX;
> ....
> +               .extra1     = &one,
> +               .extra2     = &int_max,

I believe INT_MAX does not have actual effect, because any value
bigger than INT_MAX will be cast to negative value, which is prevented
by extra1...

>
> The bug is fixed without TCP friends.
>
> Maybe, TCP friends need to use long integer to record result
> of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ?
>
> BTW: This overflow problem also breaks UDP sockets.
>
> Thanks
>
> Yu
>
>
>>>> +        .proc_handler    = proc_dointvec_minmax
>>>>       },
>>>>       {
>>>>           .procname    = "tcp_rmem",
>>>>           .data        = &sysctl_tcp_rmem,
>>>>           .maxlen        = sizeof(sysctl_tcp_rmem),
>>>>           .mode        = 0644,
>>>> -        .proc_handler    = proc_dointvec
>>>> +        .extra1        = &zero,
>>>> +        .proc_handler    = proc_dointvec_minmax
>>>>       },
>>>>       {
>>>>           .procname    = "tcp_app_win",
>>>>
>>>>
>>>>
>>> Thanks for so quick reply, I will test it soon.
>>>
>>> however I suspect whether this patch could fix overflow if we merged TCP
>>> friends patch in future.
>>>
>>
>> Tested on 3.7.0-rc1+, but bug is still alive
>> with disabled TCP friends ...
>>
>> Thanks
>>
>> Yu
>>
>>> With TCP friends, we have source like below, or TCP friends should care
>>> this ?
>>>
>>> static inline int sk_sndbuf_get(const struct sock *sk)
>>> {
>>>          if (sk->sk_friend)
>>>                  return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf;
>>>          else
>>>                  return sk->sk_sndbuf;
>>> }
>>>
>>> static inline bool sk_stream_memory_free(const struct sock *sk)
>>> {
>>>          return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk);
>>> }
>>>
>>> So sk_sndbuf_get() still may be overflow when we have right value in
>>> net.ipv4.tcp_{rmem,wmem}.
>>>
>>> Thanks
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a25e1d2..1459145 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -549,14 +549,16 @@  static struct ctl_table ipv4_table[] = {
 		.data		= &sysctl_tcp_wmem,
 		.maxlen		= sizeof(sysctl_tcp_wmem),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.extra1		= &zero,
+		.proc_handler	= proc_dointvec_minmax
 	},
 	{
 		.procname	= "tcp_rmem",
 		.data		= &sysctl_tcp_rmem,
 		.maxlen		= sizeof(sysctl_tcp_rmem),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.extra1		= &zero,
+		.proc_handler	= proc_dointvec_minmax
 	},
 	{
 		.procname	= "tcp_app_win",