Message ID | 20120502034254.11782.27314.stgit@amd-6168-8-1.englab.nay.redhat.com
---|---
State | Awaiting Upstream, archived
Delegated to | David Miller
On Wed, 2012-05-02 at 11:42 +0800, Jason Wang wrote:
> We add used and signal guest in worker thread but did not poll the
> virtqueue during the zero copy callback. This may lead the missing of
> adding and signalling during zerocopy. Solve this by polling the
> virtqueue and let it wakeup the worker during callback.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/vhost.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 947f00d..7b75fdf 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1604,6 +1604,7 @@ void vhost_zerocopy_callback(void *arg)
>  	struct vhost_ubuf_ref *ubufs = ubuf->arg;
>  	struct vhost_virtqueue *vq = ubufs->vq;
>
> +	vhost_poll_queue(&vq->poll);
>  	/* set len = 1 to mark this desc buffers done DMA */
>  	vq->heads[ubuf->desc].len = VHOST_DMA_DONE_LEN;
>  	kref_put(&ubufs->kref, vhost_zerocopy_done_signal);

Doing so, we might have redundant vhost_poll_queue() calls. Do you know
in which scenario the add and signal might be missed during zerocopy?

Thanks
Shirley

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/16/2012 12:50 AM, Shirley Ma wrote:
> On Wed, 2012-05-02 at 11:42 +0800, Jason Wang wrote:
>> We add used and signal guest in worker thread but did not poll the
>> virtqueue during the zero copy callback. This may lead the missing of
>> adding and signalling during zerocopy. Solve this by polling the
>> virtqueue and let it wakeup the worker during callback.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  drivers/vhost/vhost.c |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 947f00d..7b75fdf 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -1604,6 +1604,7 @@ void vhost_zerocopy_callback(void *arg)
>>  	struct vhost_ubuf_ref *ubufs = ubuf->arg;
>>  	struct vhost_virtqueue *vq = ubufs->vq;
>>
>> +	vhost_poll_queue(&vq->poll);
>>  	/* set len = 1 to mark this desc buffers done DMA */
>>  	vq->heads[ubuf->desc].len = VHOST_DMA_DONE_LEN;
>>  	kref_put(&ubufs->kref, vhost_zerocopy_done_signal);
>
> Doing so, we might have redundant vhost_poll_queue(). Do you know in
> which scenario there might be missing of adding and signaling during
> zerocopy?

Yes. As we only do the signaling and adding during tx work, if there is
no tx work when the skb was sent, we may lose the opportunity to let the
guest know about the completion. It's easy to reproduce with a netperf
test.

Thanks
On Wed, 2012-05-16 at 10:58 +0800, Jason Wang wrote:
> > Doing so, we might have redundant vhost_poll_queue(). Do you know in
> > which scenario there might be missing of adding and signaling during
> > zerocopy?
>
> Yes, as we only do signaling and adding during tx work, if there's no
> tx work when the skb were sent, we may lose the opportunity to let
> guest know about the completion. It's easy to be reproduced with
> netperf test.

The reason the host signals the guest is to free guest tx buffers; if
there is no tx work, then it's not necessary to signal the guest unless
the guest runs out of memory. The pending buffers will be released when
the virtio_net device is gone.

What's the behavior of the netperf test when you hit this situation?

Thanks
Shirley
On Wed, May 16, 2012 at 08:10:27AM -0700, Shirley Ma wrote:
> On Wed, 2012-05-16 at 10:58 +0800, Jason Wang wrote:
> > Yes, as we only do signaling and adding during tx work, if there's
> > no tx work when the skb were sent, we may lose the opportunity to
> > let guest know about the completion. It's easy to be reproduced with
> > netperf test.
>
> The reason which host signals guest is to free guest tx buffers, if
> there is no tx work, then it's not necessary to signal the guest
> unless guest runs out of memory. The pending buffers will be released
> virtio_net device gone.
>
> What's the behavior of netperf test when you hit this situation?
>
> Thanks
> Shirley

IIRC guest networking seems to be lost.
On Wed, 2012-05-16 at 18:14 +0300, Michael S. Tsirkin wrote:
> On Wed, May 16, 2012 at 08:10:27AM -0700, Shirley Ma wrote:
> > The reason which host signals guest is to free guest tx buffers, if
> > there is no tx work, then it's not necessary to signal the guest
> > unless guest runs out of memory. The pending buffers will be
> > released virtio_net device gone.
> >
> > What's the behavior of netperf test when you hit this situation?
>
> IIRC guest networking seems to be lost.

It seems vhost_enable_notify() is missing somewhere else?

Thanks
Shirley
On Wed, May 16, 2012 at 10:32:05AM -0700, Shirley Ma wrote:
> On Wed, 2012-05-16 at 18:14 +0300, Michael S. Tsirkin wrote:
> > IIRC guest networking seems to be lost.
>
> It seems vhost_enable_notify is missing in somewhere else?
>
> Thanks
> Shirley

Dunno. I see virtio sending packets but they do not get to tun on the
host. Debugging.
On Wed, 2012-05-16 at 21:36 +0300, Michael S. Tsirkin wrote:
> On Wed, May 16, 2012 at 10:32:05AM -0700, Shirley Ma wrote:
> > It seems vhost_enable_notify is missing in somewhere else?
>
> Dunno. I see virtio sending packets but they do not get to tun on
> host. Debugging.

I looked at the code; an if (zerocopy) check is needed for the code
below:

+	if (zerocopy) {
		num_pends = likely(vq->upend_idx >= vq->done_idx) ?
			    (vq->upend_idx - vq->done_idx) :
			    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
		if (unlikely(num_pends > VHOST_MAX_PEND)) {
			tx_poll_start(net, sock); /* vhost_poll_queue? */
			set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
			break;
		}
+	}
	if (unlikely(vhost_enable_notify(&net->dev, vq))) {
		vhost_disable_notify(&net->dev, vq);
		continue;
	}
	break;

Second, is it possible that the problem comes from tx_poll_start()? In
some situations, vhost_poll_wakeup() is not being called?

Thanks
Shirley
On 05/17/2012 01:32 AM, Shirley Ma wrote:
> On Wed, 2012-05-16 at 18:14 +0300, Michael S. Tsirkin wrote:
>>> The reason which host signals guest is to free guest tx buffers, if
>>> there is no tx work, then it's not necessary to signal the guest
>>> unless guest runs out of memory. The pending buffers will be
>>> released virtio_net device gone.

Looks like we only free the skbs in .ndo_start_xmit().

>>> What's the behavior of netperf test when you hit this situation?
>> IIRC guest networking seems to be lost.
> It seems vhost_enable_notify is missing in somewhere else?

The problem is that we may stop the tx queue when there is not enough
capacity to place packets; at that moment we depend on the tx interrupt
to re-enable the tx queue. So if we don't poll the vhost during the
callback, the guest may lose the tx interrupt it needs to re-enable the
tx queue, which could stall the whole tx queue.

Thanks
On Thu, 2012-05-17 at 10:50 +0800, Jason Wang wrote:
> The problem is we may stop the tx queue when there no enough capacity
> to place packets, at this moment we depends on the tx interrupt to
> re-enable the tx queue. So if we didn't poll the vhost during
> callback, guest may lose the tx interrupt to re-enable the tx queue
> which could stall the whole tx queue.

VHOST_MAX_PEND should handle the capacity.

Hasn't the above situation been handled in the handle_tx() code?

	...
	if (unlikely(num_pends > VHOST_MAX_PEND)) {
		tx_poll_start(net, sock);
		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
		break;
	}
	...

Thanks
Shirley
On 05/17/2012 11:34 PM, Shirley Ma wrote:
> VHOST_MAX_PEND should handle the capacity.
>
> Hasn't the above situation been handled in handle_tx() code?
>
> 	...
> 	if (unlikely(num_pends > VHOST_MAX_PEND)) {
> 		tx_poll_start(net, sock);
> 		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
> 		break;
> 	}
> 	...

It may not help because:

- tx polling depends on skb_orphan(), which is often called by the
  device driver when it places the packet into the device's queue
  instead of when the packet is actually sent, so it is too early for
  vhost to be notified.
- it only works when the pending DMAs exceed VHOST_MAX_PEND; it's
  highly possible that the guest needs to be notified when the pending
  packets aren't that many.

So this piece of code may not help and could be removed, and we need to
poll the virtqueue during the zerocopy callback (though it could be
further optimized, that may not be easy).
On Fri, 2012-05-18 at 17:58 +0800, Jason Wang wrote:
> It may not help because:
>
> - tx polling depends on skb_orphan(), which is often called by the
>   device driver when it places the packet into the device's queue
>   instead of when the packet is actually sent, so it is too early for
>   vhost to be notified.

Then do you think it's better to replace it with vhost_poll_queue()
here instead?

> - it only works when the pending DMAs exceed VHOST_MAX_PEND; it's
>   highly possible that the guest needs to be notified when the pending
>   packets aren't that many.

In which situation does the guest need to be notified when there is no
tx, besides running out of buffers?

> So this piece of code may not help and could be removed, and we need
> to poll the virtqueue during the zerocopy callback (though it could be
> further optimized, that may not be easy).

Thanks
Shirley
On 05/17/2012 03:08 AM, Shirley Ma wrote:
> I looked at the code; an if (zerocopy) check is needed for the code
> below:
>
> +	if (zerocopy) {
> 		num_pends = likely(vq->upend_idx >= vq->done_idx) ?
> 			    (vq->upend_idx - vq->done_idx) :
> 			    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
> 		if (unlikely(num_pends > VHOST_MAX_PEND)) {
> 			tx_poll_start(net, sock);
> 			set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
> 			break;
> 		}
> +	}
> 	if (unlikely(vhost_enable_notify(&net->dev, vq))) {
> 		vhost_disable_notify(&net->dev, vq);
> 		continue;
> 	}
> 	break;

No objection to adding this, but it looks like it makes no difference
to me, as we won't touch upend_idx and done_idx when zerocopy is not
used.

> Second, is it possible that the problem comes from tx_poll_start()? In
> some situations, vhost_poll_wakeup() is not being called?
>
> Thanks
> Shirley
On 05/18/2012 11:29 PM, Shirley Ma wrote:
> On Fri, 2012-05-18 at 17:58 +0800, Jason Wang wrote:
>> It may not help because:
>>
>> - tx polling depends on skb_orphan(), which is often called by the
>>   device driver when it places the packet into the device's queue
>>   instead of when the packet is actually sent, so it is too early for
>>   vhost to be notified.
> Then do you think it's better to replace it with vhost_poll_queue()
> here instead?

Just like what this patch does: calling vhost_poll_queue() in
vhost_zerocopy_callback().

>> - it only works when the pending DMAs exceed VHOST_MAX_PEND; it's
>>   highly possible that the guest needs to be notified when the
>>   pending packets aren't that many.
> In which situation does the guest need to be notified when there is no
> tx, besides running out of buffers?

Consider the guest calling virtqueue_enable_cb_delayed(), which means
it only needs to be notified after 3/4 of its pending buffers (about
178 buffers, (256-MAX_SKB_FRAGS-2)*3/4) were sent by the host. So
vhost_net would notify the guest when about 60 buffers were pending.
Since tx polling is only enabled when the pending packets exceed
VHOST_MAX_PEND (128), the tx work would never be notified to run, and
the guest would never get the interrupt it expects to re-enable the
queue.

And just like what we've discussed, tx-polling-based adding and
signaling is too early for vhost.
On Mon, 2012-05-21 at 14:05 +0800, Jason Wang wrote:
> Just like what this patch does: calling vhost_poll_queue() in
> vhost_zerocopy_callback().
>
> Consider the guest calling virtqueue_enable_cb_delayed(), which means
> it only needs to be notified after 3/4 of its pending buffers (about
> 178 buffers, (256-MAX_SKB_FRAGS-2)*3/4) were sent by the host. So
> vhost_net would notify the guest when about 60 buffers were pending.
> Since tx polling is only enabled when the pending packets exceed
> VHOST_MAX_PEND (128), the tx work would never be notified to run, and
> the guest would never get the interrupt it expects to re-enable the
> queue.

So it seems we still need vhost_enable_notify() in handle_tx when there
is no tx in the zerocopy case.

Do you know which one is more expensive: the cost of vhost_poll_queue()
in each zerocopy callback, or calling vhost_enable_notify()?

Have you compared the results after removing the code below in
handle_tx()?

-	if (unlikely(num_pends > VHOST_MAX_PEND)) {
-		tx_poll_start(net, sock);
-		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
-		break;
-	}

> And just like what we've discussed, tx-polling-based adding and
> signaling is too early for vhost.

Thanks
Shirley
On Mon, 2012-05-21 at 08:42 -0700, Shirley Ma wrote:
> So it seems we still need vhost_enable_notify() in handle_tx when
> there is no tx in the zerocopy case.
>
> Do you know which one is more expensive: the cost of
> vhost_poll_queue() in each zerocopy callback, or calling
> vhost_enable_notify()?
>
> Have you compared the results after removing the code below in
> handle_tx()?
>
> -	if (unlikely(num_pends > VHOST_MAX_PEND)) {
> -		tx_poll_start(net, sock);
> -		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
> -		break;
> -	}

Then it could be too early for vhost to notify the guest anywhere in
handle_tx for zerocopy, so we might need to move any notification in
handle_tx for zerocopy to the vhost zerocopy callback instead.

Adding vhost_poll_queue() in the vhost zerocopy callback
unconditionally would consume unnecessary CPU.

We need to think about a better solution here.

Thanks
Shirley
On 05/21/2012 11:42 PM, Shirley Ma wrote: > On Mon, 2012-05-21 at 14:05 +0800, Jason Wang wrote: >>>> - tx polling depends on skb_orphan() which is often called by >> device >>>> driver when it place the packet into the queue of the devices >> instead >>>> of when the packets were sent. So it was too early for vhost to be >>>> notified. >>> Then do you think it's better to replace with vhost_poll_queue here >>> instead? >> Just like what does this patch do - calling vhost_poll_queue() in >> vhost_zerocopy_callback(). >>>> - it only works when the pending DMAs exceeds VHOST_MAX_PEND, it's >>>> highly possible that guest needs to be notified when the pending >>>> packets >>>> isn't so much. >>> In which situation the guest needs to be notified when there is no >> TX >>> besides buffers run out? >> Consider guest call virtqueue_enable_cb_delayed() which means it only >> need to be notified when 3/4 of pending buffers ( about 178 buffers >> (256-MAX_SKB_FRAGS-2)*3/4 ) were sent by host. So vhost_net would >> notify >> guest when about 60 buffers were pending. Since tx polling is only >> enabled when pending packets exceeds VHOST_MAX_PEND 128, so tx work >> would not be notified to run and guest would never get the interrupt >> it >> expected to re-enable the queue. > So it seems we still need vhost_enable_notify() in handle_tx when there > is no tx in zerocopy case. > > Do you know which one is more expensive: the cost of vhost_poll_queue() > in each zerocopy callback or calling vhost_enable_notify()? Didn't follow here, do you mean vhost_signal() here? > > Have you compared the results by removing below code in handle_tx()? > > - if (unlikely(num_pends> VHOST_MAX_PEND)) { > - tx_poll_start(net, sock); > - set_bit(SOCK_ASYNC_NOSPACE,&sock->flags); > - break; > - } I remember I've done some basic test when I send this patch, there's no much increasing of cpu utilization. Would double check this again. 
>> And just like what we've discussed, tx-polling-based adding and
>> signaling is too early for vhost.
> Thanks
> Shirley
On 05/22/2012 12:12 AM, Shirley Ma wrote:
> Then it could be too early for vhost to notify guest anywhere in
> handle_tx for zerocopy. Then we might need to move any notification in
> handle_tx for zerocopy to the vhost zerocopy callback instead.
>
> Adding vhost_poll_queue in vhost zerocopy callback unconditionally
> would consume unnecessary cpu.
>
> We need to think about a better solution here.

A possible idea is to call vhost_poll_queue() only when the packet at
used_event was sent: if there's no out-of-order completion, vhost could
signal the guest directly; if there is, let the pending packets before
used_event call vhost_poll_queue().

> Thanks
> Shirley
On Tue, 2012-05-22 at 18:05 +0800, Jason Wang wrote:
> On 05/21/2012 11:42 PM, Shirley Ma wrote:
> > On Mon, 2012-05-21 at 14:05 +0800, Jason Wang wrote:
> >>>> - tx polling depends on skb_orphan(), which is often called by
> >>>> the device driver when it places the packet into the device's
> >>>> queue instead of when the packet was sent. So it was too early
> >>>> for vhost to be notified.
> >>> Then do you think it's better to replace with vhost_poll_queue
> >>> here instead?
> >> Just like what this patch does - calling vhost_poll_queue() in
> >> vhost_zerocopy_callback().
> >>>> - it only works when the pending DMAs exceed VHOST_MAX_PEND;
> >>>> it's highly possible that the guest needs to be notified when
> >>>> the pending packets aren't that many.
> >>> In which situation does the guest need to be notified when there
> >>> is no TX besides buffers running out?
> >> Consider the guest calling virtqueue_enable_cb_delayed(), which
> >> means it only needs to be notified when 3/4 of the pending buffers
> >> (about 178 buffers, (256-MAX_SKB_FRAGS-2)*3/4) were sent by the
> >> host. So vhost_net would notify the guest when about 60 buffers
> >> were pending. Since tx polling is only enabled when pending
> >> packets exceed VHOST_MAX_PEND (128), tx work would not be notified
> >> to run and the guest would never get the interrupt it expected to
> >> re-enable the queue.
> > So it seems we still need vhost_enable_notify() in handle_tx when
> > there is no tx in the zerocopy case.
> >
> > Do you know which one is more expensive: the cost of
> > vhost_poll_queue() in each zerocopy callback or calling
> > vhost_enable_notify()?
>
> Didn't follow here - do you mean vhost_signal()?

I meant removing the code in handle_tx for zerocopy as below:

+	if (zcopy) {
		/* If more outstanding DMAs, queue the work.
		 * Handle upend_idx wrap around
		 */
		num_pends = likely(vq->upend_idx >= vq->done_idx) ?
			    (vq->upend_idx - vq->done_idx) :
			    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
+		/* zerocopy vhost_enable_notify is under zerocopy callback
+		 * since it could be too early to notify here */
+		break;
+	}
-	if (unlikely(num_pends > VHOST_MAX_PEND)) {
-		tx_poll_start(net, sock);
-		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
-		break;
-	}
	if (unlikely(vhost_enable_notify(&net->dev, vq))) {
		vhost_disable_notify(&net->dev, vq);
		continue;
	}
	break;

Thanks
Shirley
On 05/22/2012 11:55 PM, Shirley Ma wrote:
> On Tue, 2012-05-22 at 18:05 +0800, Jason Wang wrote:
>> On 05/21/2012 11:42 PM, Shirley Ma wrote:
>>> On Mon, 2012-05-21 at 14:05 +0800, Jason Wang wrote:
>>>>>> - tx polling depends on skb_orphan(), which is often called by
>>>>>> the device driver when it places the packet into the device's
>>>>>> queue instead of when the packet was sent. So it was too early
>>>>>> for vhost to be notified.
>>>>> Then do you think it's better to replace with vhost_poll_queue
>>>>> here instead?
>>>> Just like what this patch does - calling vhost_poll_queue() in
>>>> vhost_zerocopy_callback().
>>>>>> - it only works when the pending DMAs exceed VHOST_MAX_PEND;
>>>>>> it's highly possible that the guest needs to be notified when
>>>>>> the pending packets aren't that many.
>>>>> In which situation does the guest need to be notified when there
>>>>> is no TX besides buffers running out?
>>>> Consider the guest calling virtqueue_enable_cb_delayed(), which
>>>> means it only needs to be notified when 3/4 of the pending buffers
>>>> (about 178 buffers, (256-MAX_SKB_FRAGS-2)*3/4) were sent by the
>>>> host. So vhost_net would notify the guest when about 60 buffers
>>>> were pending. Since tx polling is only enabled when pending
>>>> packets exceed VHOST_MAX_PEND (128), tx work would not be notified
>>>> to run and the guest would never get the interrupt it expected to
>>>> re-enable the queue.
>>> So it seems we still need vhost_enable_notify() in handle_tx when
>>> there is no tx in the zerocopy case.
>>>
>>> Do you know which one is more expensive: the cost of
>>> vhost_poll_queue() in each zerocopy callback or calling
>>> vhost_enable_notify()?
>> Didn't follow here - do you mean vhost_signal()?
> I meant removing the code in handle_tx for zerocopy as below:
>
> +	if (zcopy) {
> 		/* If more outstanding DMAs, queue the work.
> 		 * Handle upend_idx wrap around
> 		 */
> 		num_pends = likely(vq->upend_idx >= vq->done_idx) ?
> 			    (vq->upend_idx - vq->done_idx) :
> 			    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
> +		/* zerocopy vhost_enable_notify is under zerocopy callback
> +		 * since it could be too early to notify here */
> +		break;
> +	}
> -	if (unlikely(num_pends > VHOST_MAX_PEND)) {
> -		tx_poll_start(net, sock);
> -		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
> -		break;
> -	}
> 	if (unlikely(vhost_enable_notify(&net->dev, vq))) {
> 		vhost_disable_notify(&net->dev, vq);
> 		continue;
> 	}
> 	break;

Didn't think this can work well, as notification from the guest would be
disabled forever.

> Thanks
> Shirley
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 947f00d..7b75fdf 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1604,6 +1604,7 @@ void vhost_zerocopy_callback(void *arg)
 	struct vhost_ubuf_ref *ubufs = ubuf->arg;
 	struct vhost_virtqueue *vq = ubufs->vq;
 
+	vhost_poll_queue(&vq->poll);
 	/* set len = 1 to mark this desc buffers done DMA */
 	vq->heads[ubuf->desc].len = VHOST_DMA_DONE_LEN;
 	kref_put(&ubufs->kref, vhost_zerocopy_done_signal);
We add used and signal the guest in the worker thread, but did not poll
the virtqueue during the zero copy callback. This may lead to missing
the add and signal during zerocopy. Solve this by polling the virtqueue
and letting it wake up the worker during the callback.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/vhost.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)