3.3.0, 3.4-rc1 reproducible tun Oops

Message ID 4F8D5FAD.10304@parallels.com
State RFC, archived
Delegated to: David Miller

Commit Message

Stanislav Kinsbursky April 17, 2012, 12:18 p.m. UTC
17.04.2012 06:08, Simon Kirby wrote:
> On Thu, Apr 05, 2012 at 04:41:04AM +0200, Eric Dumazet wrote:
>
>> Hmm, is it happening if you remove the nvidia module ?
>>
>> If yes, please try to add slub_debug=FZPU
>
> Finally got annoyed enough at this to bisect it. It doesn't happen every
> time and I got a bit confused, but I finally tracked it down to:
>
> 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d is the first bad commit
> commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d
> Author: Stanislav Kinsbursky<skinsbursky@parallels.com>
> Date:   Mon Mar 12 02:59:41 2012 +0000
>
>      tun: don't hold network namespace by tun sockets
>
>      v3: added previously removed sock_put() to the tun_release() callback, because
>      sk_release_kernel() doesn't drop the socket reference.
>
>      v2: sk_release_kernel() used for socket release. Dummy tun_release() is
>      required for sk_release_kernel() --->  sock_release() --->  sock->ops->release()
>      call.
>
>      TUN was designed to destroy it's socket on network namesapce shutdown. But this
>      will never happen for persistent device, because it's socket holds network
>      namespace.
>      This patch removes of holding network namespace by TUN socket and replaces it
>      by creating socket in init_net and then changing it's net it to desired one. On
>      shutdown socket is moved back to init_net prior to final put.
>
>      Signed-off-by: Stanislav Kinsbursky<skinsbursky@parallels.com>
>      Signed-off-by: David S. Miller<davem@davemloft.net>
>
> ...With this reverted on top of 3.4-rc3, I no longer see crashes when I
> keep making and breaking the SSH tunnel while running "vmstat 1" in an
> SSH session over a socket that is running through that tunnel.
>
> Simon-

Hi, Simon.
Could you please try applying the patch below on top of your tree (with
1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and check whether it fixes the problem:


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Simon Kirby April 17, 2012, 6:35 p.m. UTC | #1
On Tue, Apr 17, 2012 at 04:18:53PM +0400, Stanislav Kinsbursky wrote:

> 17.04.2012 06:08, Simon Kirby wrote:
> >On Thu, Apr 05, 2012 at 04:41:04AM +0200, Eric Dumazet wrote:
> >
> >>Hmm, is it happening if you remove the nvidia module ?
> >>
> >>If yes, please try to add slub_debug=FZPU
> >
> >Finally got annoyed enough at this to bisect it. It doesn't happen every
> >time and I got a bit confused, but I finally tracked it down to:
> >
> >1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d is the first bad commit
> >commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d
> >Author: Stanislav Kinsbursky<skinsbursky@parallels.com>
> >Date:   Mon Mar 12 02:59:41 2012 +0000
> >
> >     tun: don't hold network namespace by tun sockets
> >
> >     v3: added previously removed sock_put() to the tun_release() callback, because
> >     sk_release_kernel() doesn't drop the socket reference.
> >
> >     v2: sk_release_kernel() used for socket release. Dummy tun_release() is
> >     required for sk_release_kernel() --->  sock_release() --->  sock->ops->release()
> >     call.
> >
> >     TUN was designed to destroy it's socket on network namesapce shutdown. But this
> >     will never happen for persistent device, because it's socket holds network
> >     namespace.
> >     This patch removes of holding network namespace by TUN socket and replaces it
> >     by creating socket in init_net and then changing it's net it to desired one. On
> >     shutdown socket is moved back to init_net prior to final put.
> >
> >     Signed-off-by: Stanislav Kinsbursky<skinsbursky@parallels.com>
> >     Signed-off-by: David S. Miller<davem@davemloft.net>
> >
> >...With this reverted on top of 3.4-rc3, I no longer see crashes when I
> >keep making and breaking the SSH tunnel while running "vmstat 1" in an
> >SSH session over a socket that is running through that tunnel.
> >
> >Simon-
> 
> Hi, Simon.
> Could you please try to apply the patch below on top of your the
> tree (with 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and
> check does it fix the problem:
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index bb8c72c..1fc4622 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode
> *inode, struct file *file)
>  			if (dev->reg_state == NETREG_REGISTERED)
>  				unregister_netdevice(dev);
>  			rtnl_unlock();
> -		}
> +		} else
> +			sock_put(tun->socket.sk);
>  	}
> 
> -	tun = tfile->tun;
> -	if (tun)
> -		sock_put(tun->socket.sk);
> -
>  	put_net(tfile->net);
>  	kfree(tfile);

(Whitespace-damaged patch, applied manually)

Yes, I no longer see crashes with this applied. I haven't tried with
kmemleak or similar, but it seems to work.

Thanks,

Simon-
Stanislav Kinsbursky April 17, 2012, 6:49 p.m. UTC | #2
17.04.2012 22:35, Simon Kirby wrote:
> On Tue, Apr 17, 2012 at 04:18:53PM +0400, Stanislav Kinsbursky wrote:
>
>> 17.04.2012 06:08, Simon Kirby wrote:
>>> On Thu, Apr 05, 2012 at 04:41:04AM +0200, Eric Dumazet wrote:
>>>
>>>> Hmm, is it happening if you remove the nvidia module ?
>>>>
>>>> If yes, please try to add slub_debug=FZPU
>>> Finally got annoyed enough at this to bisect it. It doesn't happen every
>>> time and I got a bit confused, but I finally tracked it down to:
>>>
>>> 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d is the first bad commit
>>> commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d
>>> Author: Stanislav Kinsbursky<skinsbursky@parallels.com>
>>> Date:   Mon Mar 12 02:59:41 2012 +0000
>>>
>>>      tun: don't hold network namespace by tun sockets
>>>
>>>      v3: added previously removed sock_put() to the tun_release() callback, because
>>>      sk_release_kernel() doesn't drop the socket reference.
>>>
>>>      v2: sk_release_kernel() used for socket release. Dummy tun_release() is
>>>      required for sk_release_kernel() --->   sock_release() --->   sock->ops->release()
>>>      call.
>>>
>>>      TUN was designed to destroy it's socket on network namesapce shutdown. But this
>>>      will never happen for persistent device, because it's socket holds network
>>>      namespace.
>>>      This patch removes of holding network namespace by TUN socket and replaces it
>>>      by creating socket in init_net and then changing it's net it to desired one. On
>>>      shutdown socket is moved back to init_net prior to final put.
>>>
>>>      Signed-off-by: Stanislav Kinsbursky<skinsbursky@parallels.com>
>>>      Signed-off-by: David S. Miller<davem@davemloft.net>
>>>
>>> ...With this reverted on top of 3.4-rc3, I no longer see crashes when I
>>> keep making and breaking the SSH tunnel while running "vmstat 1" in an
>>> SSH session over a socket that is running through that tunnel.
>>>
>>> Simon-
>> Hi, Simon.
>> Could you please try to apply the patch below on top of your the
>> tree (with 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and
>> check does it fix the problem:
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index bb8c72c..1fc4622 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode
>> *inode, struct file *file)
>>   			if (dev->reg_state == NETREG_REGISTERED)
>>   				unregister_netdevice(dev);
>>   			rtnl_unlock();
>> -		}
>> +		} else
>> +			sock_put(tun->socket.sk);
>>   	}
>>
>> -	tun = tfile->tun;
>> -	if (tun)
>> -		sock_put(tun->socket.sk);
>> -
>>   	put_net(tfile->net);
>>   	kfree(tfile);
> (Whitespace-damaged patch, applied manually)
>
> Yes, I no longer see crashes with this applied. I haven't tried with
> kmemleak or similar, but it seems to work.

Sorry about the whitespace damage.
And thanks, Simon.
David Miller April 18, 2012, 2:38 a.m. UTC | #3
From: Stanislav Kinsbursky <skinsbursky@parallels.com>
Date: Tue, 17 Apr 2012 22:49:06 +0400

> Sorry for whitespaces.
> And thanks, Simon.

Please submit this fix formally, with Simon's Tested-by:
Stanislav Kinsbursky April 18, 2012, 11:32 a.m. UTC | #4
17.04.2012 22:35, Simon Kirby wrote:
> On Tue, Apr 17, 2012 at 04:18:53PM +0400, Stanislav Kinsbursky wrote:
>
>> 17.04.2012 06:08, Simon Kirby wrote:
>>> On Thu, Apr 05, 2012 at 04:41:04AM +0200, Eric Dumazet wrote:
>>>
>>>> Hmm, is it happening if you remove the nvidia module ?
>>>>
>>>> If yes, please try to add slub_debug=FZPU
>>>
>>> Finally got annoyed enough at this to bisect it. It doesn't happen every
>>> time and I got a bit confused, but I finally tracked it down to:
>>>
>>> 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d is the first bad commit
>>> commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d
>>> Author: Stanislav Kinsbursky<skinsbursky@parallels.com>
>>> Date:   Mon Mar 12 02:59:41 2012 +0000
>>>
>>>      tun: don't hold network namespace by tun sockets
>>>
>>>      v3: added previously removed sock_put() to the tun_release() callback, because
>>>      sk_release_kernel() doesn't drop the socket reference.
>>>
>>>      v2: sk_release_kernel() used for socket release. Dummy tun_release() is
>>>      required for sk_release_kernel() --->   sock_release() --->   sock->ops->release()
>>>      call.
>>>
>>>      TUN was designed to destroy it's socket on network namesapce shutdown. But this
>>>      will never happen for persistent device, because it's socket holds network
>>>      namespace.
>>>      This patch removes of holding network namespace by TUN socket and replaces it
>>>      by creating socket in init_net and then changing it's net it to desired one. On
>>>      shutdown socket is moved back to init_net prior to final put.
>>>
>>>      Signed-off-by: Stanislav Kinsbursky<skinsbursky@parallels.com>
>>>      Signed-off-by: David S. Miller<davem@davemloft.net>
>>>
>>> ...With this reverted on top of 3.4-rc3, I no longer see crashes when I
>>> keep making and breaking the SSH tunnel while running "vmstat 1" in an
>>> SSH session over a socket that is running through that tunnel.
>>>
>>> Simon-
>>
>> Hi, Simon.
>> Could you please try to apply the patch below on top of your the
>> tree (with 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and
>> check does it fix the problem:
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index bb8c72c..1fc4622 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode
>> *inode, struct file *file)
>>   			if (dev->reg_state == NETREG_REGISTERED)
>>   				unregister_netdevice(dev);
>>   			rtnl_unlock();
>> -		}
>> +		} else
>> +			sock_put(tun->socket.sk);
>>   	}
>>
>> -	tun = tfile->tun;
>> -	if (tun)
>> -		sock_put(tun->socket.sk);
>> -
>>   	put_net(tfile->net);
>>   	kfree(tfile);
>
> (Whitespace-damaged patch, applied manually)
>
> Yes, I no longer see crashes with this applied. I haven't tried with
> kmemleak or similar, but it seems to work.
>
> Thanks,
>

This bug looks like a double free, but I can't understand how it can happen...
Simon, it would be really great if you could describe in detail some simple way
to reproduce the bug.
Simon Kirby May 19, 2012, 1:07 a.m. UTC | #5
On Wed, Apr 18, 2012 at 03:32:27PM +0400, Stanislav Kinsbursky wrote:

> 17.04.2012 22:35, Simon Kirby wrote:
> >On Tue, Apr 17, 2012 at 04:18:53PM +0400, Stanislav Kinsbursky wrote:
> >>
> >>Hi, Simon.
> >>Could you please try to apply the patch below on top of your the
> >>tree (with 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and
> >>check does it fix the problem:
> >>
> >>diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> >>index bb8c72c..1fc4622 100644
> >>--- a/drivers/net/tun.c
> >>+++ b/drivers/net/tun.c
> >>@@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode
> >>*inode, struct file *file)
> >>  			if (dev->reg_state == NETREG_REGISTERED)
> >>  				unregister_netdevice(dev);
> >>  			rtnl_unlock();
> >>-		}
> >>+		} else
> >>+			sock_put(tun->socket.sk);
> >>  	}
> >>
> >>-	tun = tfile->tun;
> >>-	if (tun)
> >>-		sock_put(tun->socket.sk);
> >>-
> >>  	put_net(tfile->net);
> >>  	kfree(tfile);
> >
> >(Whitespace-damaged patch, applied manually)
> >
> >Yes, I no longer see crashes with this applied. I haven't tried with
> >kmemleak or similar, but it seems to work.
> >
> >Thanks,
> >
> 
> This bug looks like double free, but I can't understand how does this can happen...
> Simon, would be really great, if you'll describe in details some
> simple way, how to reproduce the bug.

Oh, sorry, I did not see this until now. I just noticed it was still
floating in my tree with no upstream changes yet, then found your email.
I still have not seen any issues since applying your patch.

I was definitely seeing the issue on 3.4-rc3. I can try and see if it
still occurs with your patch removed, if that would help.

Do you have a box on which you can set up an SSH tunnel? In my case, I
can reproduce it easily with three boxes. From home, I run ssh to my work
box to establish the layer 2 tunnel. This goes through a ProxyCommand to
jump through an entry box, but I don't think that should matter. I use a
cheap tunnel start script similar to this:

work_net=10.0.0.0/8
work_tun_ip=10.x.x.x
home_tun_ip=10.x.x.x
echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
ssh -w any:any <work box> "ifconfig tun0 $work_tun_ip pointopoint \
$home_tun_ip; echo 'ifconfig tun0 $home_tun_ip pointopoint $work_tun_ip \
&& ip route add $work_net via $work_tun_ip'; sleep 1d" | sh -v

...there's probably a better way, but it works. To reproduce, I log in
to a third box over this tunnel, and start a "vmstat 1", so that packets
keep coming back to the tunnel host. ^C on the SSH session will then
produce an Oops within a second.

With CONFIG_SLUB_DEBUG=y and booting with slub_debug=FZPU, I got the
Redzone overwritten notice. Without it, the box usually Oopses and
hangs immediately. Sometimes, I might have to reconnect the tunnel and
^C it once more. If I don't have that vmstat session open, it usually
doesn't crash.

Does this work for you?

Simon-
Stanislav Kinsbursky May 21, 2012, 2:51 p.m. UTC | #6
On 19.05.2012 05:07, Simon Kirby wrote:
> On Wed, Apr 18, 2012 at 03:32:27PM +0400, Stanislav Kinsbursky wrote:
>
>> 17.04.2012 22:35, Simon Kirby wrote:
>>> On Tue, Apr 17, 2012 at 04:18:53PM +0400, Stanislav Kinsbursky wrote:
>>>>
>>>> Hi, Simon.
>>>> Could you please try to apply the patch below on top of your the
>>>> tree (with 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d applied) and
>>>> check does it fix the problem:
>>>>
>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>> index bb8c72c..1fc4622 100644
>>>> --- a/drivers/net/tun.c
>>>> +++ b/drivers/net/tun.c
>>>> @@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode
>>>> *inode, struct file *file)
>>>>   			if (dev->reg_state == NETREG_REGISTERED)
>>>>   				unregister_netdevice(dev);
>>>>   			rtnl_unlock();
>>>> -		}
>>>> +		} else
>>>> +			sock_put(tun->socket.sk);
>>>>   	}
>>>>
>>>> -	tun = tfile->tun;
>>>> -	if (tun)
>>>> -		sock_put(tun->socket.sk);
>>>> -
>>>>   	put_net(tfile->net);
>>>>   	kfree(tfile);
>>>
>>> (Whitespace-damaged patch, applied manually)
>>>
>>> Yes, I no longer see crashes with this applied. I haven't tried with
>>> kmemleak or similar, but it seems to work.
>>>
>>> Thanks,
>>>
>>
>> This bug looks like double free, but I can't understand how does this can happen...
>> Simon, would be really great, if you'll describe in details some
>> simple way, how to reproduce the bug.
>
> Oh, sorry, I did not see this until now. I just noticed it was still
> floating in my tree with no upstream changes yet, then found your email.
> I still have not seen any issues since applying your patch.
>
> I was definitely seeing the issue on 3.4-rc3. I can try and see if it
> still occurs with your patch removed, if that would help.
>
> Do you have a box on which you can set up an SSH tunnel? In my case, I
> can reproduce it easily with three boxes. From home, I run ssh to my work
> box to establish the layer 2 tunnel. This goes through a ProxyCommand to
> jump through an entry box, but I don't think that should matter. I use a
> cheap tunnel start script similar to this:
>
> work_net=10.0.0.0/8
> work_tun_ip=10.x.x.x
> home_tun_ip=10.x.x.x
> echo 1>  /proc/sys/net/ipv4/conf/eth0/proxy_arp
> ssh -w any:any<work box>  "ifconfig tun0 $work_tun_ip pointopoint
> $home_tun_ip; echo 'ifconfig tun0 $home_tun_ip pointopoint $work_tun_ip
> &&  ip route add $work_net via $work_tun_ip'; sleep 1d" | sh -v
>
> ...there's probably a better way, but it works. To reproduce, I log in
> to a third box over this tunnel, and start a "vmstat 1", so that packets
> keep coming back to the tunnel host. ^C on the SSH session will then
> produce an Oops within a second.
>
> With CONFIG_SLUB_DEBUG=y and booting with slub_debug=FZPU, I got the
> Redzone overwritten notice. Without it, the box usually Oopses and
> hangs immediately. Sometimes, I might have to reconnect the tunnel and
> ^C it once more. If I don't have that vmstat session open, it usually
> doesn't crash.
>
> Does this work for you?
>

Hello, Simon.
Thanks for the details.
I still can't reproduce the issue.
Here is my configuration:
1) Three nodes: A, B and C.
2) A and B are connected by a tunnel (your script, slightly modified).
3) Packets from A to C are routed through the tunnel.
4) Node B runs a 3.4.0-rc2 based kernel; A and C run a RHEL6 kernel.

So, I log in to C from A over ssh, run "vmstat 1", and then cut off (^C) the tunnel
between A and B. The connection hung, but no panic or oops occurred.

Is this the same thing you did when the panic occurred?
Or am I doing something wrong?

> Simon-
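
In case it helps anyone build a smaller test, here is a minimal sketch of the file-descriptor lifecycle that "ssh -w" relies on: open /dev/net/tun, attach a non-persistent interface with TUNSETIFF, then close the descriptor, which is where the driver's release handler (tun_chr_close in the patch) runs. It is not claimed to reproduce the Oops on its own, since the report involves traffic still flowing through the tunnel at close time. The function name and the interface name passed to it are hypothetical, and creating the interface normally requires root (CAP_NET_ADMIN).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/*
 * Open /dev/net/tun, attach a non-persistent tun interface called `name`,
 * then close the descriptor again.
 * Returns 0 on success or a negative errno value on failure.
 */
static int tun_open_close_once(const char *name)
{
	struct ifreq ifr;
	int fd, err;

	assert(name != NULL);

	fd = open("/dev/net/tun", O_RDWR);
	if (fd < 0)
		return -errno;

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = IFF_TUN | IFF_NO_PI; /* layer 3 device, no packet info header */
	strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		err = -errno;
		close(fd);
		return err;
	}

	/*
	 * TUNSETPERSIST was never set, so closing the last descriptor
	 * also removes the interface.
	 */
	if (close(fd) < 0)
		return -errno;

	return 0;
}
```

Calling tun_open_close_once("tuntest0") in a loop while something sends packets to the interface would be one way to stress the close path; whether that is enough to trigger the crash is untested here.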

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index bb8c72c..1fc4622 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1540,13 +1540,10 @@ static int tun_chr_close(struct inode *inode, struct file *file)
  			if (dev->reg_state == NETREG_REGISTERED)
  				unregister_netdevice(dev);
  			rtnl_unlock();
-		}
+		} else
+			sock_put(tun->socket.sk);
  	}

-	tun = tfile->tun;
-	if (tun)
-		sock_put(tun->socket.sk);
-
  	put_net(tfile->net);
  	kfree(tfile);