Patchwork BUG: IPv4: Attempt to release TCP socket in state 1

login
register
mail settings
Submitter Eric Dumazet
Date March 16, 2013, 5:36 p.m.
Message ID <1363455366.29475.66.camel@edumazet-glaptop>
Download mbox | patch
Permalink /patch/228231/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - March 16, 2013, 5:36 p.m.
On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:

> Thanks thats really useful, we might miss to increment socket refcount
> in a timer setup.
> 

Hmm, please add following debugging patch as well



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - March 16, 2013, 5:44 p.m.
On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> 
> > Thanks thats really useful, we might miss to increment socket refcount
> > in a timer setup.
> > 
> 
> Hmm, please add following debugging patch as well
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 14f6e9d..fe7c8a6 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -530,7 +530,9 @@ static inline void sock_hold(struct sock *sk)
>   */
>  static inline void __sock_put(struct sock *sk)
>  {
> -	atomic_dec(&sk->sk_refcnt);
> +	int newref = atomic_dec_return(&sk->sk_refcnt);
> +
> +	BUG_ON(newref <= 0);
>  }
>  
>  static inline bool sk_del_node_init(struct sock *sk)
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 786d97a..a445e15 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -739,7 +739,7 @@ void inet_csk_prepare_forced_close(struct sock *sk)
>  {
>  	/* sk_clone_lock locked the socket and set refcnt to 2 */
>  	bh_unlock_sock(sk);
> -	sock_put(sk);
> +	__sock_put(sk);
>  
>  	/* The below has to be done to allow calling inet_csk_destroy_sock */
>  	sock_set_flag(sk, SOCK_DEAD);
> @@ -835,13 +835,13 @@ void inet_csk_listen_stop(struct sock *sk)
>  			 * tcp_v4_destroy_sock().
>  			 */
>  			tcp_sk(child)->fastopen_rsk = NULL;
> -			sock_put(sk);
> +			__sock_put(sk);
>  		}
>  		inet_csk_destroy_sock(child);
>  
>  		bh_unlock_sock(child);
>  		local_bh_enable();
> -		sock_put(child);
> +		__sock_put(child);
>  

Please don't include the last line : this should stay as

 sock_put(child);



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
dormando - March 16, 2013, 8:16 p.m.
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> >
> > > Thanks thats really useful, we might miss to increment socket refcount
> > > in a timer setup.
> > >
> >
> > Hmm, please add following debugging patch as well
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 14f6e9d..fe7c8a6 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -530,7 +530,9 @@ static inline void sock_hold(struct sock *sk)
> >   */
> >  static inline void __sock_put(struct sock *sk)
> >  {
> > -	atomic_dec(&sk->sk_refcnt);
> > +	int newref = atomic_dec_return(&sk->sk_refcnt);
> > +
> > +	BUG_ON(newref <= 0);
> >  }
> >
> >  static inline bool sk_del_node_init(struct sock *sk)
> > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > index 786d97a..a445e15 100644
> > --- a/net/ipv4/inet_connection_sock.c
> > +++ b/net/ipv4/inet_connection_sock.c
> > @@ -739,7 +739,7 @@ void inet_csk_prepare_forced_close(struct sock *sk)
> >  {
> >  	/* sk_clone_lock locked the socket and set refcnt to 2 */
> >  	bh_unlock_sock(sk);
> > -	sock_put(sk);
> > +	__sock_put(sk);
> >
> >  	/* The below has to be done to allow calling inet_csk_destroy_sock */
> >  	sock_set_flag(sk, SOCK_DEAD);
> > @@ -835,13 +835,13 @@ void inet_csk_listen_stop(struct sock *sk)
> >  			 * tcp_v4_destroy_sock().
> >  			 */
> >  			tcp_sk(child)->fastopen_rsk = NULL;
> > -			sock_put(sk);
> > +			__sock_put(sk);
> >  		}
> >  		inet_csk_destroy_sock(child);
> >
> >  		bh_unlock_sock(child);
> >  		local_bh_enable();
> > -		sock_put(child);
> > +		__sock_put(child);
> >
>
> Please don't include the last line : this should stay as
>
>  sock_put(child);
>

thanks! Will take at least 24 hours to get the trace.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa - March 17, 2013, 6:39 a.m.
On Sat, Mar 16, 2013 at 10:36:06AM -0700, Eric Dumazet wrote:
> On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> 
> > Thanks thats really useful, we might miss to increment socket refcount
> > in a timer setup.
> > 
> 
> Hmm, please add following debugging patch as well
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 14f6e9d..fe7c8a6 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -530,7 +530,9 @@ static inline void sock_hold(struct sock *sk)
>   */
>  static inline void __sock_put(struct sock *sk)
>  {
> -	atomic_dec(&sk->sk_refcnt);
> +	int newref = atomic_dec_return(&sk->sk_refcnt);
> +
> +	BUG_ON(newref <= 0);
>  }

Couldn't it also be a free from sock_wfree where the wmem accounting went
wrong? It does not care about reference counts there.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa - March 17, 2013, 7:53 a.m.
On Sun, Mar 17, 2013 at 07:39:48AM +0100, Hannes Frederic Sowa wrote:
> On Sat, Mar 16, 2013 at 10:36:06AM -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> > 
> > > Thanks thats really useful, we might miss to increment socket refcount
> > > in a timer setup.
> > > 
> > 
> > Hmm, please add following debugging patch as well
> > 
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 14f6e9d..fe7c8a6 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -530,7 +530,9 @@ static inline void sock_hold(struct sock *sk)
> >   */
> >  static inline void __sock_put(struct sock *sk)
> >  {
> > -	atomic_dec(&sk->sk_refcnt);
> > +	int newref = atomic_dec_return(&sk->sk_refcnt);
> > +
> > +	BUG_ON(newref <= 0);
> >  }
> 
> Couldn't it also be a free from sock_wfree where the wmem accounting went
> wrong? It does not care about reference counts there.

nvm, it had to be in the stacktrace then.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
dormando - March 17, 2013, 9:21 a.m.
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote:
> > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote:
> >
> > > Thanks thats really useful, we might miss to increment socket refcount
> > > in a timer setup.
> > >
> >
> > Hmm, please add following debugging patch as well
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 14f6e9d..fe7c8a6 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -530,7 +530,9 @@ static inline void sock_hold(struct sock *sk)
> >   */
> >  static inline void __sock_put(struct sock *sk)
> >  {
> > -	atomic_dec(&sk->sk_refcnt);
> > +	int newref = atomic_dec_return(&sk->sk_refcnt);
> > +
> > +	BUG_ON(newref <= 0);
> >  }
> >
> >  static inline bool sk_del_node_init(struct sock *sk)
> > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > index 786d97a..a445e15 100644
> > --- a/net/ipv4/inet_connection_sock.c
> > +++ b/net/ipv4/inet_connection_sock.c
> > @@ -739,7 +739,7 @@ void inet_csk_prepare_forced_close(struct sock *sk)
> >  {
> >  	/* sk_clone_lock locked the socket and set refcnt to 2 */
> >  	bh_unlock_sock(sk);
> > -	sock_put(sk);
> > +	__sock_put(sk);
> >
> >  	/* The below has to be done to allow calling inet_csk_destroy_sock */
> >  	sock_set_flag(sk, SOCK_DEAD);
> > @@ -835,13 +835,13 @@ void inet_csk_listen_stop(struct sock *sk)
> >  			 * tcp_v4_destroy_sock().
> >  			 */
> >  			tcp_sk(child)->fastopen_rsk = NULL;
> > -			sock_put(sk);
> > +			__sock_put(sk);
> >  		}
> >  		inet_csk_destroy_sock(child);
> >
> >  		bh_unlock_sock(child);
> >  		local_bh_enable();
> > -		sock_put(child);
> > +		__sock_put(child);
> >
>
> Please don't include the last line : this should stay as
>
>  sock_put(child);

Hope you don't mind a screenshot:
http://www.dormando.me/p/3.8.2-trace-crash.jpg

(I put the patches on 3.8.2). box is on another continent so screenshot
via IPMI is what I get. If this isn't enough or isn't right I'll try
harder to get the trace logged, I guess?

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - March 17, 2013, 4:33 p.m.
On Sun, 2013-03-17 at 02:21 -0700, dormando wrote:

> Hope you don't mind a screenshot:
> http://www.dormando.me/p/3.8.2-trace-crash.jpg
> 
> (I put the patches on 3.8.2). box is on another continent so screenshot
> via IPMI is what I get. If this isn't enough or isn't right I'll try
> harder to get the trace logged, I guess?

Thanks a lot, this gives another useful input, no need for more traces
for the moment.

Could you send me the disassembly of tcp_release_cb() ?

(objdump -d vmlinux | filter to get tcp_release_cb() body)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - March 17, 2013, 4:52 p.m.
On Sun, 2013-03-17 at 09:33 -0700, Eric Dumazet wrote:
> On Sun, 2013-03-17 at 02:21 -0700, dormando wrote:
> 
> > Hope you don't mind a screenshot:
> > http://www.dormando.me/p/3.8.2-trace-crash.jpg
> > 
> > (I put the patches on 3.8.2). box is on another continent so screenshot
> > via IPMI is what I get. If this isn't enough or isn't right I'll try
> > harder to get the trace logged, I guess?
> 
> Thanks a lot, this gives another useful input, no need for more traces
> for the moment.
> 
> Could you send me the disassembly of tcp_release_cb() ?
> 
> (objdump -d vmlinux | filter to get tcp_release_cb() body)
> 
> 

It seems we can have a refcnt imbalance because of

atomic_set(&newsk->sk_refcnt, 2);

we do in sk_clone_lock()

Somehow, something is wrong, because at this point the socket should not
be found.

We perform a sock_hold() somewhere while the socket is already dead.

I'll send a patch asap



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index 14f6e9d..fe7c8a6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -530,7 +530,9 @@  static inline void sock_hold(struct sock *sk)
  */
 static inline void __sock_put(struct sock *sk)
 {
-	atomic_dec(&sk->sk_refcnt);
+	int newref = atomic_dec_return(&sk->sk_refcnt);
+
+	BUG_ON(newref <= 0);
 }
 
 static inline bool sk_del_node_init(struct sock *sk)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 786d97a..a445e15 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -739,7 +739,7 @@  void inet_csk_prepare_forced_close(struct sock *sk)
 {
 	/* sk_clone_lock locked the socket and set refcnt to 2 */
 	bh_unlock_sock(sk);
-	sock_put(sk);
+	__sock_put(sk);
 
 	/* The below has to be done to allow calling inet_csk_destroy_sock */
 	sock_set_flag(sk, SOCK_DEAD);
@@ -835,13 +835,13 @@  void inet_csk_listen_stop(struct sock *sk)
 			 * tcp_v4_destroy_sock().
 			 */
 			tcp_sk(child)->fastopen_rsk = NULL;
-			sock_put(sk);
+			__sock_put(sk);
 		}
 		inet_csk_destroy_sock(child);
 
 		bh_unlock_sock(child);
 		local_bh_enable();
-		sock_put(child);
+		__sock_put(child);
 
 		sk_acceptq_removed(sk);
 		__reqsk_free(req);