diff mbox

[net] vhost_net: don't continue to call the recvmsg when meet errors

Message ID 1480507857-22976-1-git-send-email-wangyunjian@huawei.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Yunjian Wang Nov. 30, 2016, 12:10 p.m. UTC
When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jason Wang Nov. 30, 2016, 1:07 p.m. UTC | #1
On 2016年11月30日 20:10, Yunjian Wang wrote:
> When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>   drivers/vhost/net.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 5dc128a..edc470b 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>   			pr_debug("Discarded rx packet: "
>   				 " len %d, expected %zd\n", err, sock_len);
>   			vhost_discard_vq_desc(vq, headcount);
> +			/* Don't continue to do, when meet errors. */
> +			if (err < 0)
> +				goto out;
>   			continue;
>   		}
>   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */

Acked-by: Jason Wang <jasowang@redhat.com>

We may want to rename vhost_discard_vq_desc() in the future, since it 
does not discard the desc in fact.
Michael S. Tsirkin Nov. 30, 2016, 1:40 p.m. UTC | #2
On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> When we meet an error(err=-EBADFD) recvmsg,

How do you get EBADFD? Won't vhost_net_rx_peek_head_len
return 0 in this case, breaking the loop?

> the error handling in vhost
> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> 
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  drivers/vhost/net.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 5dc128a..edc470b 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>  			pr_debug("Discarded rx packet: "
>  				 " len %d, expected %zd\n", err, sock_len);
>  			vhost_discard_vq_desc(vq, headcount);
> +			/* Don't continue to do, when meet errors. */
> +			if (err < 0)
> +				goto out;

You might get e.g. EAGAIN and I think you need to retry
in this case.

>  			continue;
>  		}
>  		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> -- 
> 1.9.5.msysgit.1
>
Michael S. Tsirkin Nov. 30, 2016, 1:43 p.m. UTC | #3
On Wed, Nov 30, 2016 at 09:07:16PM +0800, Jason Wang wrote:
> 
> 
> On 2016年11月30日 20:10, Yunjian Wang wrote:
> > When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
> > handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> > 
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >   drivers/vhost/net.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index 5dc128a..edc470b 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> >   			pr_debug("Discarded rx packet: "
> >   				 " len %d, expected %zd\n", err, sock_len);
> >   			vhost_discard_vq_desc(vq, headcount);
> > +			/* Don't continue to do, when meet errors. */
> > +			if (err < 0)
> > +				goto out;
> >   			continue;
> >   		}
> >   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> 
> Acked-by: Jason Wang <jasowang@redhat.com>
> 
> We may want to rename vhost_discard_vq_desc() in the future, since it does
> not discard the desc in fact.

To me this looks a bit too aggressive. I would also like commit log
to explain better what is going on, to make sure we are
not just treating the symptoms.
Yunjian Wang Dec. 1, 2016, 2:48 a.m. UTC | #4
>-----Original Message-----
>From: Michael S. Tsirkin [mailto:mst@redhat.com] 
>Sent: Wednesday, November 30, 2016 9:41 PM
>To: wangyunjian
>Cc: jasowang@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
>Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>
>On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>> When we meet an error(err=-EBADFD) recvmsg,
>
>How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>return 0 in this case, breaking the loop?

We started many guest VMs while attaching/detaching some virtio-net nics for loop. 
The soft lockup might happened. The err is -EBADFD.

meesage log:
kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
call trace:
[<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
[<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
[<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
[<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
[<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
[<fffffff810a5c7f>]kthread+0xcf/0xe0
[<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
[<fffffff81648898>]ret_from_fork+0x58/0x90
[<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140

>> the error handling in vhost
>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>> 
>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>> ---
>>  drivers/vhost/net.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 5dc128a..edc470b 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>  			pr_debug("Discarded rx packet: "
>>  				 " len %d, expected %zd\n", err, sock_len);
>>  			vhost_discard_vq_desc(vq, headcount);
>> +			/* Don't continue to do, when meet errors. */
>> +			if (err < 0)
>> +				goto out;
>
>You might get e.g. EAGAIN and I think you need to retry
>in this case.
>
>>  			continue;
>>  		}
>>  		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>> -- 
>> 1.9.5.msysgit.1
>>
Jason Wang Dec. 1, 2016, 3:13 a.m. UTC | #5
On 2016年12月01日 10:48, wangyunjian wrote:
>> -----Original Message-----
>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>> Sent: Wednesday, November 30, 2016 9:41 PM
>> To: wangyunjian
>> Cc: jasowang@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
>> Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>
>> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>> When we meet an error(err=-EBADFD) recvmsg,
>> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>> return 0 in this case, breaking the loop?
> We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> The soft lockup might happened. The err is -EBADFD.

How did you do the attaching/detaching? AFAIK, the -EBADFD can only 
happens when you deleting tun device during vhost_net transmission.

>
> meesage log:
> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> call trace:
> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> [<fffffff810a5c7f>]kthread+0xcf/0xe0
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> [<fffffff81648898>]ret_from_fork+0x58/0x90
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
>
>>> the error handling in vhost
>>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>>
>>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>>> ---
>>>   drivers/vhost/net.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>> index 5dc128a..edc470b 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>>   			pr_debug("Discarded rx packet: "
>>>   				 " len %d, expected %zd\n", err, sock_len);
>>>   			vhost_discard_vq_desc(vq, headcount);
>>> +			/* Don't continue to do, when meet errors. */
>>> +			if (err < 0)
>>> +				goto out;
>> You might get e.g. EAGAIN and I think you need to retry
>> in this case.
>>
>>>   			continue;
>>>   		}
>>>   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>>> -- 
>>> 1.9.5.msysgit.1
>>>
Jason Wang Dec. 1, 2016, 3:15 a.m. UTC | #6
On 2016年11月30日 21:40, Michael S. Tsirkin wrote:
> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>> When we meet an error(err=-EBADFD) recvmsg,
> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> return 0 in this case, breaking the loop?
>
>> the error handling in vhost
>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>
>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>> ---
>>   drivers/vhost/net.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 5dc128a..edc470b 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>   			pr_debug("Discarded rx packet: "
>>   				 " len %d, expected %zd\n", err, sock_len);
>>   			vhost_discard_vq_desc(vq, headcount);
>> +			/* Don't continue to do, when meet errors. */
>> +			if (err < 0)
>> +				goto out;
> You might get e.g. EAGAIN and I think you need to retry
> in this case.

Yes.

>>   			continue;
>>   		}
>>   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>> -- 
>> 1.9.5.msysgit.1
>>
Michael S. Tsirkin Dec. 1, 2016, 3:21 a.m. UTC | #7
On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
> 
> >-----Original Message-----
> >From: Michael S. Tsirkin [mailto:mst@redhat.com] 
> >Sent: Wednesday, November 30, 2016 9:41 PM
> >To: wangyunjian
> >Cc: jasowang@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
> >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> >
> >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> >> When we meet an error(err=-EBADFD) recvmsg,
> >
> >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> >return 0 in this case, breaking the loop?
> 
> We started many guest VMs while attaching/detaching some virtio-net nics for loop. 
> The soft lockup might happened. The err is -EBADFD.
> 

OK, I'd like to figure out what happened here. why don't
we get 0 when we peek at the head?

EBADFD is from here:
        struct tun_struct *tun = __tun_get(tfile);
...
        if (!tun)
                return -EBADFD;

but then:
static int tun_peek_len(struct socket *sock)
{       

...

        struct tun_struct *tun;
...
        
        tun = __tun_get(tfile);
        if (!tun) 
                return 0;


so peek len should return 0.

then while will exit:
        while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
...


> meesage log:
> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> call trace:
> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> [<fffffff810a5c7f>]kthread+0xcf/0xe0
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> [<fffffff81648898>]ret_from_fork+0x58/0x90
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140

So somehow you keep seeing something in tun when we peek.
IMO this is the problem we should try to fix.
Could you try debugging the root cause here?

> >> the error handling in vhost
> >> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> >> 
> >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> >> ---
> >>  drivers/vhost/net.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >> 
> >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> >> index 5dc128a..edc470b 100644
> >> --- a/drivers/vhost/net.c
> >> +++ b/drivers/vhost/net.c
> >> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> >>  			pr_debug("Discarded rx packet: "
> >>  				 " len %d, expected %zd\n", err, sock_len);
> >>  			vhost_discard_vq_desc(vq, headcount);
> >> +			/* Don't continue to do, when meet errors. */
> >> +			if (err < 0)
> >> +				goto out;
> >
> >You might get e.g. EAGAIN and I think you need to retry
> >in this case.
> >
> >>  			continue;
> >>  		}
> >>  		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> >> -- 
> >> 1.9.5.msysgit.1
> >>
Jason Wang Dec. 1, 2016, 3:26 a.m. UTC | #8
On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>>> -----Original Message-----
>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>> Sent: Wednesday, November 30, 2016 9:41 PM
>>> To: wangyunjian
>>> Cc: jasowang@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
>>> Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>>
>>> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>>> When we meet an error(err=-EBADFD) recvmsg,
>>> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>>> return 0 in this case, breaking the loop?
>> We started many guest VMs while attaching/detaching some virtio-net nics for loop.
>> The soft lockup might happened. The err is -EBADFD.
>>
> OK, I'd like to figure out what happened here. why don't
> we get 0 when we peek at the head?
>
> EBADFD is from here:
>          struct tun_struct *tun = __tun_get(tfile);
> ...
>          if (!tun)
>                  return -EBADFD;
>
> but then:
> static int tun_peek_len(struct socket *sock)
> {
>
> ...
>
>          struct tun_struct *tun;
> ...
>          
>          tun = __tun_get(tfile);
>          if (!tun)
>                  return 0;
>
>
> so peek len should return 0.
>
> then while will exit:
>          while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
> ...
>

Consider this case: user do ip link del link tap0 before recvmsg() but 
after tun_peek_len() ?

>> meesage log:
>> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
>> call trace:
>> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
>> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
>> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
>> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
>> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
>> [<fffffff810a5c7f>]kthread+0xcf/0xe0
>> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
>> [<fffffff81648898>]ret_from_fork+0x58/0x90
>> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> So somehow you keep seeing something in tun when we peek.
> IMO this is the problem we should try to fix.
> Could you try debugging the root cause here?
>
>>>> the error handling in vhost
>>>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>>>
>>>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>>>> ---
>>>>   drivers/vhost/net.c | 3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>> index 5dc128a..edc470b 100644
>>>> --- a/drivers/vhost/net.c
>>>> +++ b/drivers/vhost/net.c
>>>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>>>   			pr_debug("Discarded rx packet: "
>>>>   				 " len %d, expected %zd\n", err, sock_len);
>>>>   			vhost_discard_vq_desc(vq, headcount);
>>>> +			/* Don't continue to do, when meet errors. */
>>>> +			if (err < 0)
>>>> +				goto out;
>>> You might get e.g. EAGAIN and I think you need to retry
>>> in this case.
>>>
>>>>   			continue;
>>>>   		}
>>>>   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>>>> -- 
>>>> 1.9.5.msysgit.1
>>>>
Michael S. Tsirkin Dec. 1, 2016, 3:27 a.m. UTC | #9
On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
> 
> 
> On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> > On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
> > > > -----Original Message-----
> > > > From: Michael S. Tsirkin [mailto:mst@redhat.com]
> > > > Sent: Wednesday, November 30, 2016 9:41 PM
> > > > To: wangyunjian
> > > > Cc: jasowang@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
> > > > Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> > > > 
> > > > On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> > > > > When we meet an error(err=-EBADFD) recvmsg,
> > > > How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> > > > return 0 in this case, breaking the loop?
> > > We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> > > The soft lockup might happened. The err is -EBADFD.
> > > 
> > OK, I'd like to figure out what happened here. why don't
> > we get 0 when we peek at the head?
> > 
> > EBADFD is from here:
> >          struct tun_struct *tun = __tun_get(tfile);
> > ...
> >          if (!tun)
> >                  return -EBADFD;
> > 
> > but then:
> > static int tun_peek_len(struct socket *sock)
> > {
> > 
> > ...
> > 
> >          struct tun_struct *tun;
> > ...
> >          tun = __tun_get(tfile);
> >          if (!tun)
> >                  return 0;
> > 
> > 
> > so peek len should return 0.
> > 
> > then while will exit:
> >          while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
> > ...
> > 
> 
> Consider this case: user do ip link del link tap0 before recvmsg() but after
> tun_peek_len() ?

Sure, this can happen, but I think we'll just exit on the next loop,
won't we?

> > > meesage log:
> > > kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> > > call trace:
> > > [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> > > [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> > > [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> > > [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> > > [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> > > [<fffffff810a5c7f>]kthread+0xcf/0xe0
> > > [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> > > [<fffffff81648898>]ret_from_fork+0x58/0x90
> > > [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> > So somehow you keep seeing something in tun when we peek.
> > IMO this is the problem we should try to fix.
> > Could you try debugging the root cause here?
> > 
> > > > > the error handling in vhost
> > > > > handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> > > > > 
> > > > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > > > ---
> > > > >   drivers/vhost/net.c | 3 +++
> > > > >   1 file changed, 3 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > > > index 5dc128a..edc470b 100644
> > > > > --- a/drivers/vhost/net.c
> > > > > +++ b/drivers/vhost/net.c
> > > > > @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> > > > >   			pr_debug("Discarded rx packet: "
> > > > >   				 " len %d, expected %zd\n", err, sock_len);
> > > > >   			vhost_discard_vq_desc(vq, headcount);
> > > > > +			/* Don't continue to do, when meet errors. */
> > > > > +			if (err < 0)
> > > > > +				goto out;
> > > > You might get e.g. EAGAIN and I think you need to retry
> > > > in this case.
> > > > 
> > > > >   			continue;
> > > > >   		}
> > > > >   		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> > > > > -- 
> > > > > 1.9.5.msysgit.1
> > > > >
Jason Wang Dec. 1, 2016, 3:37 a.m. UTC | #10
On 2016年12月01日 11:27, Michael S. Tsirkin wrote:
> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
>> >
>> >
>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>>>>> > > > >-----Original Message-----
>>>>> > > > >From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM
>>>>> > > > >To: wangyunjian
>>>>> > > > >Cc:jasowang@redhat.com;netdev@vger.kernel.org;linux-kernel@vger.kernel.org; caihe
>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>>>> > > > >
>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,
>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>>>>> > > > >return 0 in this case, breaking the loop?
>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.
>>>> > > >The soft lockup might happened. The err is -EBADFD.
>>>> > > >
>>> > >OK, I'd like to figure out what happened here. why don't
>>> > >we get 0 when we peek at the head?
>>> > >
>>> > >EBADFD is from here:
>>> > >          struct tun_struct *tun = __tun_get(tfile);
>>> > >...
>>> > >          if (!tun)
>>> > >                  return -EBADFD;
>>> > >
>>> > >but then:
>>> > >static int tun_peek_len(struct socket *sock)
>>> > >{
>>> > >
>>> > >...
>>> > >
>>> > >          struct tun_struct *tun;
>>> > >...
>>> > >          tun = __tun_get(tfile);
>>> > >          if (!tun)
>>> > >                  return 0;
>>> > >
>>> > >
>>> > >so peek len should return 0.
>>> > >
>>> > >then while will exit:
>>> > >          while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
>>> > >...
>>> > >
>> >
>> >Consider this case: user do ip link del link tap0 before recvmsg() but after
>> >tun_peek_len() ?
> Sure, this can happen, but I think we'll just exit on the next loop,
> won't we?
>

Right, this is the only case I can image for -EBADFD, let's wait for the 
author to the steps.
Yunjian Wang Dec. 1, 2016, 4:41 a.m. UTC | #11
>-----Original Message-----

>From: Jason Wang [mailto:jasowang@redhat.com] 

>Sent: Thursday, December 01, 2016 11:37 AM

>To: Michael S. Tsirkin

>Cc: wangyunjian; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe

>Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

>

>

>

>On 2016年12月01日 11:27, Michael S. Tsirkin wrote:

>> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:

>>> >

>>> >

>>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:

>>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:

>>>>>> > > > >-----Original Message-----

>>>>>> > > > >From: Michael S. Tsirkin [mailto:mst@redhat.com]

>>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM

>>>>>> > > > >To: wangyunjian

>>>>>> > > > >Cc:jasowang@redhat.com;netdev@vger.kernel.org;linux-kernel@

>>>>>> > > > >vger.kernel.org; caihe

>>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call 

>>>>>> > > > >the recvmsg when meet errors

>>>>>> > > > >

>>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:

>>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,

>>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len 

>>>>>> > > > >return 0 in this case, breaking the loop?

>>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.

>>>>> > > >The soft lockup might happened. The err is -EBADFD.

>>>>> > > >

>>>> > >OK, I'd like to figure out what happened here. why don't we get 0 

>>>> > >when we peek at the head?

>>>> > >

>>>> > >EBADFD is from here:

>>>> > >          struct tun_struct *tun = __tun_get(tfile); ...

>>>> > >          if (!tun)

>>>> > >                  return -EBADFD;

>>>> > >

>>>> > >but then:

>>>> > >static int tun_peek_len(struct socket *sock) {

>>>> > >

>>>> > >...

>>>> > >

>>>> > >          struct tun_struct *tun; ...

>>>> > >          tun = __tun_get(tfile);

>>>> > >          if (!tun)

>>>> > >                  return 0;

>>>> > >

>>>> > >

>>>> > >so peek len should return 0.

>>>> > >

>>>> > >then while will exit:

>>>> > >          while ((sock_len = vhost_net_rx_peek_head_len(net, 

>>>> > >sock->sk))) ...

>>>> > >

>>> >

>>> >Consider this case: user do ip link del link tap0 before recvmsg() 

>>> >but after

>>> >tun_peek_len() ?

>> Sure, this can happen, but I think we'll just exit on the next loop, 

>> won't we?

>>

>

>Right, this is the only case I can image for -EBADFD, let's wait for the author to the steps.

>


Thanks, I understand it don't happen in the latest kernel version.
My problem happened using kernel version 3.10.0-xx

The peek len willn't return 0.

static int peek_head_len(struct sock *sk)
{
	struct sk_buff *head;
	int len = 0;
	unsigned long flags;

	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
	head = skb_peek(&sk->sk_receive_queue);
	if (likely(head)) {
		len = head->len;
		if (skb_vlan_tag_present(head))
			len += VLAN_HLEN;
	}

	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
	return len;
}
Michael S. Tsirkin Dec. 1, 2016, 4:56 a.m. UTC | #12
On Thu, Dec 01, 2016 at 04:41:40AM +0000, wangyunjian wrote:
> >-----Original Message-----
> >From: Jason Wang [mailto:jasowang@redhat.com] 
> >Sent: Thursday, December 01, 2016 11:37 AM
> >To: Michael S. Tsirkin
> >Cc: wangyunjian; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; caihe
> >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> >
> >
> >
> >On 2016年12月01日 11:27, Michael S. Tsirkin wrote:
> >> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
> >>> >
> >>> >
> >>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> >>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
> >>>>>> > > > >-----Original Message-----
> >>>>>> > > > >From: Michael S. Tsirkin [mailto:mst@redhat.com]
> >>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM
> >>>>>> > > > >To: wangyunjian
> >>>>>> > > > >Cc:jasowang@redhat.com;netdev@vger.kernel.org;linux-kernel@
> >>>>>> > > > >vger.kernel.org; caihe
> >>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call 
> >>>>>> > > > >the recvmsg when meet errors
> >>>>>> > > > >
> >>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> >>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,
> >>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len 
> >>>>>> > > > >return 0 in this case, breaking the loop?
> >>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> >>>>> > > >The soft lockup might happened. The err is -EBADFD.
> >>>>> > > >
> >>>> > >OK, I'd like to figure out what happened here. why don't we get 0 
> >>>> > >when we peek at the head?
> >>>> > >
> >>>> > >EBADFD is from here:
> >>>> > >          struct tun_struct *tun = __tun_get(tfile); ...
> >>>> > >          if (!tun)
> >>>> > >                  return -EBADFD;
> >>>> > >
> >>>> > >but then:
> >>>> > >static int tun_peek_len(struct socket *sock) {
> >>>> > >
> >>>> > >...
> >>>> > >
> >>>> > >          struct tun_struct *tun; ...
> >>>> > >          tun = __tun_get(tfile);
> >>>> > >          if (!tun)
> >>>> > >                  return 0;
> >>>> > >
> >>>> > >
> >>>> > >so peek len should return 0.
> >>>> > >
> >>>> > >then while will exit:
> >>>> > >          while ((sock_len = vhost_net_rx_peek_head_len(net, 
> >>>> > >sock->sk))) ...
> >>>> > >
> >>> >
> >>> >Consider this case: user do ip link del link tap0 before recvmsg() 
> >>> >but after
> >>> >tun_peek_len() ?
> >> Sure, this can happen, but I think we'll just exit on the next loop, 
> >> won't we?
> >>
> >
> >Right, this is the only case I can image for -EBADFD, let's wait for the author to the steps.
> >
> 
> Thanks, I understand it don't happen in the latest kernel version.
> My problem happened using kernel version 3.10.0-xx
> The peek len willn't return 0.
> 
> static int peek_head_len(struct sock *sk)
> {
> 	struct sk_buff *head;
> 	int len = 0;
> 	unsigned long flags;
> 
> 	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
> 	head = skb_peek(&sk->sk_receive_queue);

Should return NULL, should it not?
Maybe sk_receive_queue was not purged on detach back then.


> 	if (likely(head)) {
> 		len = head->len;
> 		if (skb_vlan_tag_present(head))
> 			len += VLAN_HLEN;
> 	}
> 
> 	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
> 	return len;
> }
diff mbox

Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5dc128a..edc470b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -717,6 +717,9 @@  static void handle_rx(struct vhost_net *net)
 			pr_debug("Discarded rx packet: "
 				 " len %d, expected %zd\n", err, sock_len);
 			vhost_discard_vq_desc(vq, headcount);
+			/* Don't continue to do, when meet errors. */
+			if (err < 0)
+				goto out;
 			continue;
 		}
 		/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */