net: Fix sk reference counting in ip_push_pending_frames and ip6_push_pending_frames

Message ID 1247334370.7128.6.camel@Maple
State Not Applicable, archived
Delegated to: David Miller

Commit Message

John Dykstra July 11, 2009, 5:46 p.m. UTC
Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 "net: No more expensive
sock_hold()/sock_put() on each tx" used sk_wmem_alloc rather than the struct sock reference
count to track in-flight transmit-path packets.  However, it missed the __sock_put() calls
in ip_push_pending_frames() and ip6_push_pending_frames().  This results in too-small
reference counts when UDP or RAW sockets are used to send more than one MTU of data.  This 
in turn could lead to struct sock being freed and reused while it is still part of an
active socket.

A wide variety of socket symptoms may be fixed by this patch.  It also fixes one cause 
of WARN_ON's in sk_del_node_init() and sk_nulls_del_node_init_rcu().

Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
---
 net/ipv4/ip_output.c  |    1 -
 net/ipv6/ip6_output.c |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)
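
The accounting problem described in the commit message can be illustrated with a small userspace model. This is only a sketch: the `toy_*` names, the fixed truesize of 256, and the single initial holder are assumptions made for illustration, and none of it is kernel code. It models a socket whose in-flight transmit data is tracked in a byte counter (as sk_wmem_alloc is after commit 2b85a34e) while a leftover reference drop is still executed for every merged fragment.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy model; field names echo the kernel's, nothing here is kernel code. */
struct toy_sock {
    int refcnt;      /* models sk_refcnt: one count per real holder */
    int wmem_alloc;  /* models sk_wmem_alloc: truesize of in-flight tx skbs */
};

struct toy_skb {
    struct toy_sock *sk;
    int truesize;
};

/* Charge an skb to its socket. Per the commit message, after 2b85a34e this
 * is accounted in sk_wmem_alloc and takes no socket reference. */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

/* Fold n queued fragments into the head skb, as the merge loop in
 * ip_push_pending_frames() does. stale_put != 0 keeps the leftover
 * __sock_put(); stale_put == 0 models the patched code. */
static void toy_push_pending_frames(struct toy_skb *head, struct toy_skb *frags,
                                    size_t n, int stale_put)
{
    for (size_t i = 0; i < n; i++) {
        head->truesize += frags[i].truesize;
        if (stale_put)
            frags[i].sk->refcnt--;  /* drops a reference that was never taken */
        frags[i].sk = NULL;
    }
}

#define TOY_MAX_FRAGS 8

/* Build one head skb plus nfrags extra fragments on a socket with a single
 * holder, merge them, and return the resulting reference count. */
static int toy_refcnt_after_send(size_t nfrags, int stale_put)
{
    struct toy_sock sk = { .refcnt = 1, .wmem_alloc = 0 };
    struct toy_skb head = { .sk = NULL, .truesize = 256 };
    struct toy_skb frags[TOY_MAX_FRAGS];

    assert(nfrags <= TOY_MAX_FRAGS);
    toy_set_owner_w(&head, &sk);
    for (size_t i = 0; i < nfrags; i++) {
        frags[i].truesize = 256;
        toy_set_owner_w(&frags[i], &sk);
    }
    toy_push_pending_frames(&head, frags, nfrags, stale_put);

    /* The byte accounting is unaffected either way: the head skb now
     * carries the whole charge. */
    assert(sk.wmem_alloc == head.truesize);
    return sk.refcnt;
}
```

In this model, `toy_refcnt_after_send(2, 0)` leaves the count at its single real holder, while `toy_refcnt_after_send(2, 1)` drives it below that, which corresponds to the "too-small reference counts" the commit message describes.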

Comments

Eric Dumazet July 11, 2009, 7:39 p.m. UTC | #1
John Dykstra wrote:
> Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 "net: No more expensive
> sock_hold()/sock_put() on each tx" used sk_wmem_alloc rather than the struct sock reference
> count to track in-flight transmit-path packets.  However, it missed the __sock_put() calls
> in ip_push_pending_frames() and ip6_push_pending_frames().  This results in too-small
> reference counts when UDP or RAW sockets are used to send more than one MTU of data.  This 
> in turn could lead to struct sock being freed and reused while it is still part of an
> active socket.
> 
> A wide variety of socket symptoms may be fixed by this patch.  It also fixes one cause 
> of WARN_ON's in sk_del_node_init() and sk_nulls_del_node_init_rcu().
> 
> Signed-off-by: John Dykstra <john.dykstra1@gmail.com>

Nice, but are you aware that the same patch was already posted and is waiting
for David's approval?

http://patchwork.ozlabs.org/patch/29618/


> ---
>  net/ipv4/ip_output.c  |    1 -
>  net/ipv6/ip6_output.c |    1 -
>  2 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 2470262..7d08210 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1243,7 +1243,6 @@ int ip_push_pending_frames(struct sock *sk)
>  		skb->len += tmp_skb->len;
>  		skb->data_len += tmp_skb->len;
>  		skb->truesize += tmp_skb->truesize;
> -		__sock_put(tmp_skb->sk);
>  		tmp_skb->destructor = NULL;
>  		tmp_skb->sk = NULL;
>  	}
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 7c76e3d..87f8419 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1484,7 +1484,6 @@ int ip6_push_pending_frames(struct sock *sk)
>  		skb->len += tmp_skb->len;
>  		skb->data_len += tmp_skb->len;
>  		skb->truesize += tmp_skb->truesize;
> -		__sock_put(tmp_skb->sk);
>  		tmp_skb->destructor = NULL;
>  		tmp_skb->sk = NULL;
>  	}

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Dykstra July 11, 2009, 8 p.m. UTC | #2
On Sat, 2009-07-11 at 21:39 +0200, Eric Dumazet wrote:
> John Dykstra wrote:
> > Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 "net: No more expensive
> > sock_hold()/sock_put() on each tx" used sk_wmem_alloc rather than the struct sock reference
> > count to track in-flight transmit-path packets.  However, it missed the __sock_put() calls
> > in ip_push_pending_frames() and ip6_push_pending_frames().  This results in too-small
> > reference counts when UDP or RAW sockets are used to send more than one MTU of data.  This
> > in turn could lead to struct sock being freed and reused while it is still part of an
> > active socket.
> >
> > A wide variety of socket symptoms may be fixed by this patch.  It also fixes one cause
> > of WARN_ON's in sk_del_node_init() and sk_nulls_del_node_init_rcu().
> >
> > Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
>
> Nice, but are you aware that the same patch was already posted and is waiting
> for David's approval?
>
> http://patchwork.ozlabs.org/patch/29618/

<sigh>  No, I wasn't.  It took me a while to track down where the
reference counts were going wrong, and during that time I wasn't
tracking netdev traffic.

At least it's fixed.

  --  John

Patch

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2470262..7d08210 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1243,7 +1243,6 @@  int ip_push_pending_frames(struct sock *sk)
 		skb->len += tmp_skb->len;
 		skb->data_len += tmp_skb->len;
 		skb->truesize += tmp_skb->truesize;
-		__sock_put(tmp_skb->sk);
 		tmp_skb->destructor = NULL;
 		tmp_skb->sk = NULL;
 	}
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7c76e3d..87f8419 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1484,7 +1484,6 @@  int ip6_push_pending_frames(struct sock *sk)
 		skb->len += tmp_skb->len;
 		skb->data_len += tmp_skb->len;
 		skb->truesize += tmp_skb->truesize;
-		__sock_put(tmp_skb->sk);
 		tmp_skb->destructor = NULL;
 		tmp_skb->sk = NULL;
 	}
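
The commit message says the problem shows up when more than one MTU of data is sent. As a rough illustration of when the merge loop runs at all, the arithmetic below estimates how many IPv4 fragments a UDP payload needs. It is a sketch under stated assumptions: a 20-byte IPv4 header with no options, an 8-byte UDP header, and one queued skb per fragment (which may not hold with offloads). The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_HDR_LEN 20u  /* assumption: IPv4 header without options */
#define UDP_HDR_LEN   8u

/* Number of IPv4 fragments needed to carry a UDP payload at a given MTU.
 * Every fragment except the last carries a multiple of 8 data bytes. */
static size_t udp_ipv4_fragments(size_t payload_len, size_t mtu)
{
    assert(mtu >= IPV4_HDR_LEN + 8);  /* room for at least 8 data bytes */

    size_t per_frag = (mtu - IPV4_HDR_LEN) & ~(size_t)7;
    size_t total = payload_len + UDP_HDR_LEN;

    return (total + per_frag - 1) / per_frag;  /* ceiling division */
}

/* Assumption: one queued skb per fragment, so every fragment after the
 * first goes through the merge loop once. */
static size_t merged_skbs(size_t payload_len, size_t mtu)
{
    return udp_ipv4_fragments(payload_len, mtu) - 1;
}
```

With a 1500-byte MTU, a 1472-byte payload fits in a single fragment and the loop body never runs, whereas a 4000-byte payload needs three fragments, so under these assumptions the removed `__sock_put()` would have executed twice for that one send.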