Message ID | 64d47c628b3fdc0ac156aed4be182933d8bcc0db.1304541919.git.mst@redhat.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > Add an API that tells the other side that callbacks > should be delayed until a lot of work has been done. > Implement using the new used_event feature. Since you're going to add a capacity query anyway, why not add the threshold argument here? Then the caller can choose how much space it needs. Maybe net and block will want different things... Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote: > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > Add an API that tells the other side that callbacks > > should be delayed until a lot of work has been done. > > Implement using the new used_event feature. > > Since you're going to add a capacity query anyway, why not add the > threshold argument here? I thought that if we keep the API kind of generic there might be more of a chance that future transports will be able to implement it. For example, with an old host we can't commit to a specific index. > > Then the caller can choose how much space it needs. Maybe net and block > will want different things... > > Cheers, > Rusty. Hmm, maybe. OK.
On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote: > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > Add an API that tells the other side that callbacks > > > should be delayed until a lot of work has been done. > > > Implement using the new used_event feature. > > > > Since you're going to add a capacity query anyway, why not add the > > threshold argument here? > > I thought that if we keep the API kind of generic > there might be more of a chance that future transports > will be able to implement it. For example, with an > old host we can't commit to a specific index. No, it's always a hint anyway: you can be notified before the threshold is reached. But best make it explicit I think. Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, May 16, 2011 at 04:43:21PM +0930, Rusty Russell wrote: > On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote: > > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > Add an API that tells the other side that callbacks > > > > should be delayed until a lot of work has been done. > > > > Implement using the new used_event feature. > > > > > > Since you're going to add a capacity query anyway, why not add the > > > threshold argument here? > > > > I thought that if we keep the API kind of generic > > there might be more of a chance that future transports > > will be able to implement it. For example, with an > > old host we can't commit to a specific index. > > No, it's always a hint anyway: you can be notified before the threshold > is reached. But best make it explicit I think. > > Cheers, > Rusty. I tried doing that and remembered the real reason I went for this API: capacity is limited by descriptor table space, not used ring space: each entry in the used ring frees up multiple entries in the descriptor ring. Thus the ring can't provide callback after capacity is N: capacity is only available after we get bufs. We could try and make the API pass in the number of freed bufs, however: - this is not really what virtio-net cares about (it cares about capacity) - if the driver passes a number > number of outstanding bufs, it will never get a callback. So to stay correct the driver will need to track number of outstanding requests. The simpler API avoids that. APIs are easy to change so I'm guessing it's not a major blocker: we can change later when e.g. block tries to pass in some kind of extra hint: we'll be smarter about how this API can change then. Right?
On Thu, 19 May 2011 10:24:12 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Mon, May 16, 2011 at 04:43:21PM +0930, Rusty Russell wrote: > > On Sun, 15 May 2011 15:48:18 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Mon, May 09, 2011 at 03:27:33PM +0930, Rusty Russell wrote: > > > > On Wed, 4 May 2011 23:52:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > Add an API that tells the other side that callbacks > > > > > should be delayed until a lot of work has been done. > > > > > Implement using the new used_event feature. > > > > > > > > Since you're going to add a capacity query anyway, why not add the > > > > threshold argument here? > > > > > > I thought that if we keep the API kind of generic > > > there might be more of a chance that future transports > > > will be able to implement it. For example, with an > > > old host we can't commit to a specific index. > > > > No, it's always a hint anyway: you can be notified before the threshold > > is reached. But best make it explicit I think. > > > > Cheers, > > Rusty. > > > I tried doing that and remembered the real reason I went for this API: > > capacity is limited by descriptor table space, not > used ring space: each entry in the used ring frees up multiple entries > in the descriptor ring. Thus the ring can't provide > callback after capacity is N: capacity is only available > after we get bufs. I think this indicates a problem, but I haven't reviewed your patches except very cursorily. I am slack... > We could try and make the API pass in the number of freed bufs, however: > - this is not really what virtio-net cares about (it cares about > capacity) Yes, max sg elements and max requests are separate, though in the current virtio implementation remaining sg <= remaining request slots. That's why we currently feed back remaining descriptors to the driver, not the number of request slots. This implies that the thresholds should refer to descriptor numbers (ie. wake me when there are this many descriptors freed), not the used ring at all. Which means we're barking up the wrong tree... I think this needs more thought. > - if the driver passes a number > number of outstanding bufs, it will > never get a callback. So to stay correct the driver will need to > track number of outstanding requests. The simpler API avoids that. I think the driver should simply say "wake me when you have this many descriptors free". And set it during initialization, rather than every time. The virtio_ring code should handle it from there. Perhaps that can be done with the current technique, where the virtio_ring makes an educated guess on when sufficient capacity will be available... Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 262dfe6..3a70d70 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -397,6 +397,33 @@ bool virtqueue_enable_cb(struct virtqueue *_vq) } EXPORT_SYMBOL_GPL(virtqueue_enable_cb); +bool virtqueue_enable_cb_delayed(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + int bufs; + + START_USE(vq); + + /* We optimistically turn back on interrupts, then check if there was + * more to do. */ + /* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to + * either clear the flags bit or point the event index at the next + * entry. Always do both to keep code simple. */ + vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT; + /* TODO: tune this threshold */ + bufs = (vq->vring.avail->idx - vq->last_used_idx) * 3 / 4; + vring_used_event(&vq->vring) = vq->last_used_idx + bufs; + virtio_mb(); + if (unlikely(vq->vring.used->idx - vq->last_used_idx > bufs)) { + END_USE(vq); + return false; + } + + END_USE(vq); + return true; +} +EXPORT_SYMBOL_GPL(virtqueue_enable_cb_delayed); + void *virtqueue_detach_unused_buf(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); diff --git a/include/linux/virtio.h b/include/linux/virtio.h index 718336b..5151fd1 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -51,6 +51,13 @@ struct virtqueue { * This re-enables callbacks; it returns "false" if there are pending * buffers in the queue, to detect a possible race between the driver * checking for more work, and enabling callbacks. + * virtqueue_enable_cb_delayed: restart callbacks after disable_cb. + * vq: the struct virtqueue we're talking about. + * This re-enables callbacks but hints to the other side to delay + * interrupts until most of the available buffers have been processed; + * it returns "false" if there are many pending buffers in the queue, + * to detect a possible race between the driver checking for more work, + * and enabling callbacks. * virtqueue_detach_unused_buf: detach first unused buffer * vq: the struct virtqueue we're talking about. * Returns NULL or the "data" token handed to add_buf @@ -86,6 +93,8 @@ void virtqueue_disable_cb(struct virtqueue *vq); bool virtqueue_enable_cb(struct virtqueue *vq); +bool virtqueue_enable_cb_delayed(struct virtqueue *vq); + void *virtqueue_detach_unused_buf(struct virtqueue *vq); /**
Add an API that tells the other side that callbacks should be delayed until a lot of work has been done. Implement using the new used_event feature. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> --- drivers/virtio/virtio_ring.c | 27 +++++++++++++++++++++++++++ include/linux/virtio.h | 9 +++++++++ 2 files changed, 36 insertions(+), 0 deletions(-)