diff mbox series

[v4,2/8] vhost-user.rst: Improve [GS]ET_VRING_BASE doc

Message ID 20231004125904.110781-3-hreitz@redhat.com
State New
Headers show
Series vhost-user: Back-end state migration | expand

Commit Message

Hanna Czenczek Oct. 4, 2023, 12:58 p.m. UTC
GET_VRING_BASE does not mention that it stops the respective ring.  Fix
that.

Furthermore, it is not fully clear what the "base offset" these
commands' documentation refers to is; an offset could be many things.
Be more precise and verbose about it, especially given that these
commands use different payload structures depending on whether the vring
is split or packed.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
 1 file changed, 62 insertions(+), 4 deletions(-)

Comments

Stefan Hajnoczi Oct. 5, 2023, 5:38 p.m. UTC | #1
On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
> GET_VRING_BASE does not mention that it stops the respective ring.  Fix
> that.
> 
> Furthermore, it is not fully clear what the "base offset" these
> commands' documentation refers to is; an offset could be many things.
> Be more precise and verbose about it, especially given that these
> commands use different payload structures depending on whether the vring
> is split or packed.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
>  1 file changed, 62 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 2f68e67a1a..50f5acebe5 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -108,6 +108,37 @@ A vring state description
>  
>  :num: a 32-bit number
>  
> +A vring descriptor index for split virtqueues
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> ++-------------+---------------------+
> +| vring index | index in avail ring |
> ++-------------+---------------------+
> +
> +:vring index: 32-bit index of the respective virtqueue
> +
> +:index in avail ring: 32-bit value, of which currently only the lower 16
> +  bits are used:
> +
> +  - Bits 0–15: Next descriptor index in the *Available Ring*

I think we need to say more to make this implementable just by reading
the spec:

  Index of the next *Available Ring* descriptor that the back-end will
  process. This is a free-running index that is not wrapped by the ring
  size.

Feel free to rephrase.

> +  - Bits 16–31: Reserved (set to zero)
> +
> +Vring descriptor indices for packed virtqueues
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> ++-------------+--------------------+
> +| vring index | descriptor indices |
> ++-------------+--------------------+
> +
> +:vring index: 32-bit index of the respective virtqueue
> +
> +:descriptor indices: 32-bit value:
> +
> +  - Bits 0–14: Index in the *Available Ring*

Same here.

> +  - Bit 15: Driver (Available) Ring Wrap Counter
> +  - Bits 16–30: Index in the *Used Ring*

Same here.

> +  - Bit 31: Device (Used) Ring Wrap Counter
> +
>  A vring address description
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>  
> @@ -1031,18 +1062,45 @@ Front-end message types
>  ``VHOST_USER_SET_VRING_BASE``
>    :id: 10
>    :equivalent ioctl: ``VHOST_SET_VRING_BASE``
> -  :request payload: vring state description
> +  :request payload: vring descriptor index/indices
>    :reply payload: N/A
>  
> -  Sets the base offset in the available vring.
> +  Sets the next index to use for descriptors in this vring:
> +
> +  * For a split virtqueue, sets only the next descriptor index in the
> +    *Available Ring*.  The device is supposed to read the next index in
> +    the *Used Ring* from the respective vring structure in guest memory.
> +
> +  * For a packed virtqueue, both indices are supplied, as they are not
> +    explicitly available in memory.
> +
> +  Consequently, the payload type is specific to the type of virt queue
> +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
> +  indices for packed virtqueues*).
>  
>  ``VHOST_USER_GET_VRING_BASE``
>    :id: 11
>    :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
>    :request payload: vring state description
> -  :reply payload: vring state description
> +  :reply payload: vring descriptor index/indices
> +
> +  Stops the vring and returns the current descriptor index or indices:
> +
> +    * For a split virtqueue, returns only the 16-bit next descriptor
> +      index in the *Available Ring*.  The index in the *Used Ring* is
> +      controlled by the guest driver and can be read from the vring

I find "is controlled by the guest driver" confusing. The device writes
the Used Ring index. The driver only reads it. The device is the active
party here.

The sentence can be shortened to omit the "controlled by the guest
driver" part.

> +      structure in memory, so is not covered.
> +
> +    * For a packed virtqueue, neither index is explicitly available to
> +      read from memory, so both indices (as maintained by the device) are
> +      returned.
> +
> +  Consequently, the payload type is specific to the type of virt queue
> +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
> +  indices for packed virtqueues*).
>  
> -  Get the available vring base offset.
> +  The request payload’s *num* field is currently reserved and must be
> +  set to 0.
>  
>  ``VHOST_USER_SET_VRING_KICK``
>    :id: 12
> -- 
> 2.41.0
>
Hanna Czenczek Oct. 6, 2023, 7:53 a.m. UTC | #2
On 05.10.23 19:38, Stefan Hajnoczi wrote:
> On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
>> GET_VRING_BASE does not mention that it stops the respective ring.  Fix
>> that.
>>
>> Furthermore, it is not fully clear what the "base offset" these
>> commands' documentation refers to is; an offset could be many things.
>> Be more precise and verbose about it, especially given that these
>> commands use different payload structures depending on whether the vring
>> is split or packed.
>>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>> ---
>>   docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
>>   1 file changed, 62 insertions(+), 4 deletions(-)
>>
>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>> index 2f68e67a1a..50f5acebe5 100644
>> --- a/docs/interop/vhost-user.rst
>> +++ b/docs/interop/vhost-user.rst
>> @@ -108,6 +108,37 @@ A vring state description
>>   
>>   :num: a 32-bit number
>>   
>> +A vring descriptor index for split virtqueues
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> ++-------------+---------------------+
>> +| vring index | index in avail ring |
>> ++-------------+---------------------+
>> +
>> +:vring index: 32-bit index of the respective virtqueue
>> +
>> +:index in avail ring: 32-bit value, of which currently only the lower 16
>> +  bits are used:
>> +
>> +  - Bits 0–15: Next descriptor index in the *Available Ring*
> I think we need to say more to make this implementable just by reading
> the spec:
>
>    Index of the next *Available Ring* descriptor that the back-end will
>    process. This is a free-running index that is not wrapped by the ring
>    size.

Sure, thanks.

> Feel free to rephrase.
>
>> +  - Bits 16–31: Reserved (set to zero)
>> +
>> +Vring descriptor indices for packed virtqueues
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> ++-------------+--------------------+
>> +| vring index | descriptor indices |
>> ++-------------+--------------------+
>> +
>> +:vring index: 32-bit index of the respective virtqueue
>> +
>> +:descriptor indices: 32-bit value:
>> +
>> +  - Bits 0–14: Index in the *Available Ring*
> Same here.
>
>> +  - Bit 15: Driver (Available) Ring Wrap Counter
>> +  - Bits 16–30: Index in the *Used Ring*
> Same here.
>
>> +  - Bit 31: Device (Used) Ring Wrap Counter
>> +
>>   A vring address description
>>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>   
>> @@ -1031,18 +1062,45 @@ Front-end message types
>>   ``VHOST_USER_SET_VRING_BASE``
>>     :id: 10
>>     :equivalent ioctl: ``VHOST_SET_VRING_BASE``
>> -  :request payload: vring state description
>> +  :request payload: vring descriptor index/indices
>>     :reply payload: N/A
>>   
>> -  Sets the base offset in the available vring.
>> +  Sets the next index to use for descriptors in this vring:
>> +
>> +  * For a split virtqueue, sets only the next descriptor index in the
>> +    *Available Ring*.  The device is supposed to read the next index in
>> +    the *Used Ring* from the respective vring structure in guest memory.
>> +
>> +  * For a packed virtqueue, both indices are supplied, as they are not
>> +    explicitly available in memory.
>> +
>> +  Consequently, the payload type is specific to the type of virt queue
>> +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
>> +  indices for packed virtqueues*).
>>   
>>   ``VHOST_USER_GET_VRING_BASE``
>>     :id: 11
>>     :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
>>     :request payload: vring state description
>> -  :reply payload: vring state description
>> +  :reply payload: vring descriptor index/indices
>> +
>> +  Stops the vring and returns the current descriptor index or indices:
>> +
>> +    * For a split virtqueue, returns only the 16-bit next descriptor
>> +      index in the *Available Ring*.  The index in the *Used Ring* is
>> +      controlled by the guest driver and can be read from the vring
> I find "is controlled by the guest driver" confusing. The device writes
> the Used Ring index. The driver only reads it. The device is the active
> party here.

Er, good point.  That breaks the whole reasoning.  Then I don’t 
understand why we do get/set the available ring index and not the used 
ring index.  Do you know why?

> The sentence can be shortened to omit the "controlled by the guest
> driver" part.

I don’t want to shorten it, because I would like to know why we don’t 
get/set both indices for split virtqueues, too.

Hanna

>> +      structure in memory, so is not covered.
>> +
>> +    * For a packed virtqueue, neither index is explicitly available to
>> +      read from memory, so both indices (as maintained by the device) are
>> +      returned.
>> +
>> +  Consequently, the payload type is specific to the type of virt queue
>> +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
>> +  indices for packed virtqueues*).
>>   
>> -  Get the available vring base offset.
>> +  The request payload’s *num* field is currently reserved and must be
>> +  set to 0.
>>   
>>   ``VHOST_USER_SET_VRING_KICK``
>>     :id: 12
>> -- 
>> 2.41.0
>>
>>
>> _______________________________________________
>> Virtio-fs mailing list
>> Virtio-fs@redhat.com
>> https://listman.redhat.com/mailman/listinfo/virtio-fs
Michael S. Tsirkin Oct. 6, 2023, 8:49 a.m. UTC | #3
On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:
> On 05.10.23 19:38, Stefan Hajnoczi wrote:
> > On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
> > > GET_VRING_BASE does not mention that it stops the respective ring.  Fix
> > > that.
> > > 
> > > Furthermore, it is not fully clear what the "base offset" these
> > > commands' documentation refers to is; an offset could be many things.
> > > Be more precise and verbose about it, especially given that these
> > > commands use different payload structures depending on whether the vring
> > > is split or packed.
> > > 
> > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > ---
> > >   docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
> > >   1 file changed, 62 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > index 2f68e67a1a..50f5acebe5 100644
> > > --- a/docs/interop/vhost-user.rst
> > > +++ b/docs/interop/vhost-user.rst
> > > @@ -108,6 +108,37 @@ A vring state description
> > >   :num: a 32-bit number
> > > +A vring descriptor index for split virtqueues
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > ++-------------+---------------------+
> > > +| vring index | index in avail ring |
> > > ++-------------+---------------------+
> > > +
> > > +:vring index: 32-bit index of the respective virtqueue
> > > +
> > > +:index in avail ring: 32-bit value, of which currently only the lower 16
> > > +  bits are used:
> > > +
> > > +  - Bits 0–15: Next descriptor index in the *Available Ring*
> > I think we need to say more to make this implementable just by reading
> > the spec:
> > 
> >    Index of the next *Available Ring* descriptor that the back-end will
> >    process. This is a free-running index that is not wrapped by the ring
> >    size.
> 
> Sure, thanks.
> 
> > Feel free to rephrase.
> > 
> > > +  - Bits 16–31: Reserved (set to zero)
> > > +
> > > +Vring descriptor indices for packed virtqueues
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > ++-------------+--------------------+
> > > +| vring index | descriptor indices |
> > > ++-------------+--------------------+
> > > +
> > > +:vring index: 32-bit index of the respective virtqueue
> > > +
> > > +:descriptor indices: 32-bit value:
> > > +
> > > +  - Bits 0–14: Index in the *Available Ring*
> > Same here.
> > 
> > > +  - Bit 15: Driver (Available) Ring Wrap Counter
> > > +  - Bits 16–30: Index in the *Used Ring*
> > Same here.
> > 
> > > +  - Bit 31: Device (Used) Ring Wrap Counter
> > > +
> > >   A vring address description
> > >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > @@ -1031,18 +1062,45 @@ Front-end message types
> > >   ``VHOST_USER_SET_VRING_BASE``
> > >     :id: 10
> > >     :equivalent ioctl: ``VHOST_SET_VRING_BASE``
> > > -  :request payload: vring state description
> > > +  :request payload: vring descriptor index/indices
> > >     :reply payload: N/A
> > > -  Sets the base offset in the available vring.
> > > +  Sets the next index to use for descriptors in this vring:
> > > +
> > > +  * For a split virtqueue, sets only the next descriptor index in the
> > > +    *Available Ring*.  The device is supposed to read the next index in
> > > +    the *Used Ring* from the respective vring structure in guest memory.
> > > +
> > > +  * For a packed virtqueue, both indices are supplied, as they are not
> > > +    explicitly available in memory.
> > > +
> > > +  Consequently, the payload type is specific to the type of virt queue
> > > +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
> > > +  indices for packed virtqueues*).
> > >   ``VHOST_USER_GET_VRING_BASE``
> > >     :id: 11
> > >     :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
> > >     :request payload: vring state description
> > > -  :reply payload: vring state description
> > > +  :reply payload: vring descriptor index/indices
> > > +
> > > +  Stops the vring and returns the current descriptor index or indices:
> > > +
> > > +    * For a split virtqueue, returns only the 16-bit next descriptor
> > > +      index in the *Available Ring*.  The index in the *Used Ring* is
> > > +      controlled by the guest driver and can be read from the vring
> > I find "is controlled by the guest driver" confusing. The device writes
> > the Used Ring index. The driver only reads it. The device is the active
> > party here.
> 
> Er, good point.  That breaks the whole reasoning.  Then I don’t understand
> why we do get/set the available ring index and not the used ring index.  Do
> you know why?

It's simple. used ring index in memory is controlled by the device and
reflects device state. device can just read it back to restore.
available ring index in memory is controlled by driver and does
not reflect device state.

> > The sentence can be shortened to omit the "controlled by the guest
> > driver" part.
> 
> I don’t want to shorten it, because I would like to know why we don’t
> get/set both indices for split virtqueues, too.
> 
> Hanna
> 
> > > +      structure in memory, so is not covered.
> > > +
> > > +    * For a packed virtqueue, neither index is explicitly available to
> > > +      read from memory, so both indices (as maintained by the device) are
> > > +      returned.
> > > +
> > > +  Consequently, the payload type is specific to the type of virt queue
> > > +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
> > > +  indices for packed virtqueues*).
> > > -  Get the available vring base offset.
> > > +  The request payload’s *num* field is currently reserved and must be
> > > +  set to 0.
> > >   ``VHOST_USER_SET_VRING_KICK``
> > >     :id: 12
> > > -- 
> > > 2.41.0
> > > 
> > > 
> > > _______________________________________________
> > > Virtio-fs mailing list
> > > Virtio-fs@redhat.com
> > > https://listman.redhat.com/mailman/listinfo/virtio-fs
Hanna Czenczek Oct. 6, 2023, 1:55 p.m. UTC | #4
On 06.10.23 10:49, Michael S. Tsirkin wrote:
> On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:
>> On 05.10.23 19:38, Stefan Hajnoczi wrote:
>>> On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
>>>> GET_VRING_BASE does not mention that it stops the respective ring.  Fix
>>>> that.
>>>>
>>>> Furthermore, it is not fully clear what the "base offset" these
>>>> commands' documentation refers to is; an offset could be many things.
>>>> Be more precise and verbose about it, especially given that these
>>>> commands use different payload structures depending on whether the vring
>>>> is split or packed.
>>>>
>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>> ---
>>>>    docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
>>>>    1 file changed, 62 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>>>> index 2f68e67a1a..50f5acebe5 100644
>>>> --- a/docs/interop/vhost-user.rst
>>>> +++ b/docs/interop/vhost-user.rst
>>>> @@ -108,6 +108,37 @@ A vring state description
>>>>    :num: a 32-bit number
>>>> +A vring descriptor index for split virtqueues
>>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> +
>>>> ++-------------+---------------------+
>>>> +| vring index | index in avail ring |
>>>> ++-------------+---------------------+
>>>> +
>>>> +:vring index: 32-bit index of the respective virtqueue
>>>> +
>>>> +:index in avail ring: 32-bit value, of which currently only the lower 16
>>>> +  bits are used:
>>>> +
>>>> +  - Bits 0–15: Next descriptor index in the *Available Ring*
>>> I think we need to say more to make this implementable just by reading
>>> the spec:
>>>
>>>     Index of the next *Available Ring* descriptor that the back-end will
>>>     process. This is a free-running index that is not wrapped by the ring
>>>     size.
>> Sure, thanks.
>>
>>> Feel free to rephrase.
>>>
>>>> +  - Bits 16–31: Reserved (set to zero)
>>>> +
>>>> +Vring descriptor indices for packed virtqueues
>>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> +
>>>> ++-------------+--------------------+
>>>> +| vring index | descriptor indices |
>>>> ++-------------+--------------------+
>>>> +
>>>> +:vring index: 32-bit index of the respective virtqueue
>>>> +
>>>> +:descriptor indices: 32-bit value:
>>>> +
>>>> +  - Bits 0–14: Index in the *Available Ring*
>>> Same here.
>>>
>>>> +  - Bit 15: Driver (Available) Ring Wrap Counter
>>>> +  - Bits 16–30: Index in the *Used Ring*
>>> Same here.
>>>
>>>> +  - Bit 31: Device (Used) Ring Wrap Counter
>>>> +
>>>>    A vring address description
>>>>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> @@ -1031,18 +1062,45 @@ Front-end message types
>>>>    ``VHOST_USER_SET_VRING_BASE``
>>>>      :id: 10
>>>>      :equivalent ioctl: ``VHOST_SET_VRING_BASE``
>>>> -  :request payload: vring state description
>>>> +  :request payload: vring descriptor index/indices
>>>>      :reply payload: N/A
>>>> -  Sets the base offset in the available vring.
>>>> +  Sets the next index to use for descriptors in this vring:
>>>> +
>>>> +  * For a split virtqueue, sets only the next descriptor index in the
>>>> +    *Available Ring*.  The device is supposed to read the next index in
>>>> +    the *Used Ring* from the respective vring structure in guest memory.
>>>> +
>>>> +  * For a packed virtqueue, both indices are supplied, as they are not
>>>> +    explicitly available in memory.
>>>> +
>>>> +  Consequently, the payload type is specific to the type of virt queue
>>>> +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
>>>> +  indices for packed virtqueues*).
>>>>    ``VHOST_USER_GET_VRING_BASE``
>>>>      :id: 11
>>>>      :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
>>>>      :request payload: vring state description
>>>> -  :reply payload: vring state description
>>>> +  :reply payload: vring descriptor index/indices
>>>> +
>>>> +  Stops the vring and returns the current descriptor index or indices:
>>>> +
>>>> +    * For a split virtqueue, returns only the 16-bit next descriptor
>>>> +      index in the *Available Ring*.  The index in the *Used Ring* is
>>>> +      controlled by the guest driver and can be read from the vring
>>> I find "is controlled by the guest driver" confusing. The device writes
>>> the Used Ring index. The driver only reads it. The device is the active
>>> party here.
>> Er, good point.  That breaks the whole reasoning.  Then I don’t understand
>> why we do get/set the available ring index and not the used ring index.  Do
>> you know why?
> It's simple. used ring index in memory is controlled by the device and
> reflects device state.

Exactly, it’s device state, that’s why I thought the front-end needs to 
ensure its read and restored around the reset we currently have in 
vhost_dev_stop()/start().

> device can just read it back to restore.

I find it strange that the device is supposed to read its own state from 
memory.

> available ring index in memory is controlled by driver and does
> not reflect device state.

Why can’t the device read the available index from memory?  That value 
is put into memory by the driver precisely so the device can read it 
from there.

Hanna
Hanna Czenczek Oct. 6, 2023, 1:58 p.m. UTC | #5
On 06.10.23 15:55, Hanna Czenczek wrote:
> On 06.10.23 10:49, Michael S. Tsirkin wrote:
>> On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:
>>> On 05.10.23 19:38, Stefan Hajnoczi wrote:
>>>> On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:

[...]

>>>>    ``VHOST_USER_GET_VRING_BASE``
>>>>      :id: 11
>>>>      :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
>>>>      :request payload: vring state description
>>>> -  :reply payload: vring state description
>>>> +  :reply payload: vring descriptor index/indices
>>>> +
>>>> +  Stops the vring and returns the current descriptor index or 
>>>> indices:
>>>> +
>>>> +    * For a split virtqueue, returns only the 16-bit next descriptor
>>>> +      index in the *Available Ring*.  The index in the *Used Ring* is
>>>> +      controlled by the guest driver and can be read from the vring
>>>> I find "is controlled by the guest driver" confusing. The device 
>>>> writes
>>>> the Used Ring index. The driver only reads it. The device is the 
>>>> active
>>>> party here.
>>> Er, good point.  That breaks the whole reasoning.  Then I don’t 
>>> understand
>>> why we do get/set the available ring index and not the used ring 
>>> index.  Do
>>> you know why?
>> It's simple. used ring index in memory is controlled by the device and
>> reflects device state.
>
> Exactly, it’s device state, that’s why I thought the front-end needs 
> to ensure its read and restored around the reset we currently have in 
> vhost_dev_stop()/start().
>
>> device can just read it back to restore.
>
> I find it strange that the device is supposed to read its own state 
> from memory.
>
>> available ring index in memory is controlled by driver and does
>> not reflect device state.
>
> Why can’t the device read the available index from memory?  That value 
> is put into memory by the driver precisely so the device can read it 
> from there.

Ah, wait, is the idea that the device may have an internal available 
index counter that reflects what descriptor it has already fetched? I.e. 
this index will lag behind the one in memory, and the difference are new 
descriptors that the device still needs to read? If that internal 
counter is the index that’s get/set here, then yes, that makes a lot of 
sense.

Hanna
Michael S. Tsirkin Oct. 7, 2023, 9:27 p.m. UTC | #6
On Fri, Oct 06, 2023 at 03:55:56PM +0200, Hanna Czenczek wrote:
> On 06.10.23 10:49, Michael S. Tsirkin wrote:
> > On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:
> > > On 05.10.23 19:38, Stefan Hajnoczi wrote:
> > > > On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
> > > > > GET_VRING_BASE does not mention that it stops the respective ring.  Fix
> > > > > that.
> > > > > 
> > > > > Furthermore, it is not fully clear what the "base offset" these
> > > > > commands' documentation refers to is; an offset could be many things.
> > > > > Be more precise and verbose about it, especially given that these
> > > > > commands use different payload structures depending on whether the vring
> > > > > is split or packed.
> > > > > 
> > > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > > > ---
> > > > >    docs/interop/vhost-user.rst | 66 ++++++++++++++++++++++++++++++++++---
> > > > >    1 file changed, 62 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > > > index 2f68e67a1a..50f5acebe5 100644
> > > > > --- a/docs/interop/vhost-user.rst
> > > > > +++ b/docs/interop/vhost-user.rst
> > > > > @@ -108,6 +108,37 @@ A vring state description
> > > > >    :num: a 32-bit number
> > > > > +A vring descriptor index for split virtqueues
> > > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > +
> > > > > ++-------------+---------------------+
> > > > > +| vring index | index in avail ring |
> > > > > ++-------------+---------------------+
> > > > > +
> > > > > +:vring index: 32-bit index of the respective virtqueue
> > > > > +
> > > > > +:index in avail ring: 32-bit value, of which currently only the lower 16
> > > > > +  bits are used:
> > > > > +
> > > > > +  - Bits 0–15: Next descriptor index in the *Available Ring*
> > > > I think we need to say more to make this implementable just by reading
> > > > the spec:
> > > > 
> > > >     Index of the next *Available Ring* descriptor that the back-end will
> > > >     process. This is a free-running index that is not wrapped by the ring
> > > >     size.
> > > Sure, thanks.
> > > 
> > > > Feel free to rephrase.
> > > > 
> > > > > +  - Bits 16–31: Reserved (set to zero)
> > > > > +
> > > > > +Vring descriptor indices for packed virtqueues
> > > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > +
> > > > > ++-------------+--------------------+
> > > > > +| vring index | descriptor indices |
> > > > > ++-------------+--------------------+
> > > > > +
> > > > > +:vring index: 32-bit index of the respective virtqueue
> > > > > +
> > > > > +:descriptor indices: 32-bit value:
> > > > > +
> > > > > +  - Bits 0–14: Index in the *Available Ring*
> > > > Same here.
> > > > 
> > > > > +  - Bit 15: Driver (Available) Ring Wrap Counter
> > > > > +  - Bits 16–30: Index in the *Used Ring*
> > > > Same here.
> > > > 
> > > > > +  - Bit 31: Device (Used) Ring Wrap Counter
> > > > > +
> > > > >    A vring address description
> > > > >    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > @@ -1031,18 +1062,45 @@ Front-end message types
> > > > >    ``VHOST_USER_SET_VRING_BASE``
> > > > >      :id: 10
> > > > >      :equivalent ioctl: ``VHOST_SET_VRING_BASE``
> > > > > -  :request payload: vring state description
> > > > > +  :request payload: vring descriptor index/indices
> > > > >      :reply payload: N/A
> > > > > -  Sets the base offset in the available vring.
> > > > > +  Sets the next index to use for descriptors in this vring:
> > > > > +
> > > > > +  * For a split virtqueue, sets only the next descriptor index in the
> > > > > +    *Available Ring*.  The device is supposed to read the next index in
> > > > > +    the *Used Ring* from the respective vring structure in guest memory.
> > > > > +
> > > > > +  * For a packed virtqueue, both indices are supplied, as they are not
> > > > > +    explicitly available in memory.
> > > > > +
> > > > > +  Consequently, the payload type is specific to the type of virt queue
> > > > > +  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
> > > > > +  indices for packed virtqueues*).
> > > > >    ``VHOST_USER_GET_VRING_BASE``
> > > > >      :id: 11
> > > > >      :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
> > > > >      :request payload: vring state description
> > > > > -  :reply payload: vring state description
> > > > > +  :reply payload: vring descriptor index/indices
> > > > > +
> > > > > +  Stops the vring and returns the current descriptor index or indices:
> > > > > +
> > > > > +    * For a split virtqueue, returns only the 16-bit next descriptor
> > > > > +      index in the *Available Ring*.  The index in the *Used Ring* is
> > > > > +      controlled by the guest driver and can be read from the vring
> > > > I find "is controlled by the guest driver" confusing. The device writes
> > > > the Used Ring index. The driver only reads it. The device is the active
> > > > party here.
> > > Er, good point.  That breaks the whole reasoning.  Then I don’t understand
> > > why we do get/set the available ring index and not the used ring index.  Do
> > > you know why?
> > It's simple. used ring index in memory is controlled by the device and
> > reflects device state.
> 
> Exactly, it’s device state, that’s why I thought the front-end needs to
> ensure its read and restored around the reset we currently have in
> vhost_dev_stop()/start().
> 
> > device can just read it back to restore.
> 
> I find it strange that the device is supposed to read its own state from
> memory.

/me shrugs. It puts it there, why not read it back. Duplicating state
is not usually a good idea - leads to bugs.

> > available ring index in memory is controlled by driver and does
> > not reflect device state.
> 
> Why can’t the device read the available index from memory?  That value is
> put into memory by the driver precisely so the device can read it from
> there.
> 
> Hanna

Consider an example of RX ring for net device. buffers might be
available but device does not use them until packets arrive.  what I
think you could say is that actually just the used index should be
sufficient. So I think main thing GET_BASE does is stop the ring. As for
the value returned, we can if we want to validate that it matches used
ring index.
Michael S. Tsirkin Oct. 7, 2023, 9:29 p.m. UTC | #7
On Fri, Oct 06, 2023 at 03:58:44PM +0200, Hanna Czenczek wrote:
> On 06.10.23 15:55, Hanna Czenczek wrote:
> > On 06.10.23 10:49, Michael S. Tsirkin wrote:
> > > On Fri, Oct 06, 2023 at 09:53:53AM +0200, Hanna Czenczek wrote:
> > > > On 05.10.23 19:38, Stefan Hajnoczi wrote:
> > > > > On Wed, Oct 04, 2023 at 02:58:58PM +0200, Hanna Czenczek wrote:
> 
> [...]
> 
> > > > >    ``VHOST_USER_GET_VRING_BASE``
> > > > >      :id: 11
> > > > >      :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
> > > > >      :request payload: vring state description
> > > > > -  :reply payload: vring state description
> > > > > +  :reply payload: vring descriptor index/indices
> > > > > +
> > > > > +  Stops the vring and returns the current descriptor index
> > > > > or indices:
> > > > > +
> > > > > +    * For a split virtqueue, returns only the 16-bit next descriptor
> > > > > +      index in the *Available Ring*.  The index in the *Used Ring* is
> > > > > +      controlled by the guest driver and can be read from the vring
> > > > > I find "is controlled by the guest driver" confusing. The
> > > > > device writes
> > > > > the Used Ring index. The driver only reads it. The device is
> > > > > the active
> > > > > party here.
> > > > Er, good point.  That breaks the whole reasoning.  Then I don’t
> > > > understand
> > > > why we do get/set the available ring index and not the used ring
> > > > index.  Do
> > > > you know why?
> > > It's simple. used ring index in memory is controlled by the device and
> > > reflects device state.
> > 
> > Exactly, it’s device state, that’s why I thought the front-end needs to
> > ensure its read and restored around the reset we currently have in
> > vhost_dev_stop()/start().
> > 
> > > device can just read it back to restore.
> > 
> > I find it strange that the device is supposed to read its own state from
> > memory.
> > 
> > > available ring index in memory is controlled by driver and does
> > > not reflect device state.
> > 
> > Why can’t the device read the available index from memory?  That value
> > is put into memory by the driver precisely so the device can read it
> > from there.
> 
> Ah, wait, is the idea that the device may have an internal available index
> counter that reflects what descriptor it has already fetched? I.e. this
> index will lag behind the one in memory, and the difference are new
> descriptors that the device still needs to read? If that internal counter is
> the index that’s get/set here, then yes, that makes a lot of sense.
> 
> Hanna

Exactly. And this gets eventually written out as used index.
diff mbox series

Patch

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 2f68e67a1a..50f5acebe5 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -108,6 +108,37 @@  A vring state description
 
 :num: a 32-bit number
 
+A vring descriptor index for split virtqueues
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
++-------------+---------------------+
+| vring index | index in avail ring |
++-------------+---------------------+
+
+:vring index: 32-bit index of the respective virtqueue
+
+:index in avail ring: 32-bit value, of which currently only the lower 16
+  bits are used:
+
+  - Bits 0–15: Next descriptor index in the *Available Ring*
+  - Bits 16–31: Reserved (set to zero)
+
+Vring descriptor indices for packed virtqueues
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
++-------------+--------------------+
+| vring index | descriptor indices |
++-------------+--------------------+
+
+:vring index: 32-bit index of the respective virtqueue
+
+:descriptor indices: 32-bit value:
+
+  - Bits 0–14: Index in the *Available Ring*
+  - Bit 15: Driver (Available) Ring Wrap Counter
+  - Bits 16–30: Index in the *Used Ring*
+  - Bit 31: Device (Used) Ring Wrap Counter
+
 A vring address description
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -1031,18 +1062,45 @@  Front-end message types
 ``VHOST_USER_SET_VRING_BASE``
   :id: 10
   :equivalent ioctl: ``VHOST_SET_VRING_BASE``
-  :request payload: vring state description
+  :request payload: vring descriptor index/indices
   :reply payload: N/A
 
-  Sets the base offset in the available vring.
+  Sets the next index to use for descriptors in this vring:
+
+  * For a split virtqueue, sets only the next descriptor index in the
+    *Available Ring*.  The device is supposed to read the next index in
+    the *Used Ring* from the respective vring structure in guest memory.
+
+  * For a packed virtqueue, both indices are supplied, as they are not
+    explicitly available in memory.
+
+  Consequently, the payload type is specific to the type of virt queue
+  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
+  indices for packed virtqueues*).
 
 ``VHOST_USER_GET_VRING_BASE``
   :id: 11
   :equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
   :request payload: vring state description
-  :reply payload: vring state description
+  :reply payload: vring descriptor index/indices
+
+  Stops the vring and returns the current descriptor index or indices:
+
+    * For a split virtqueue, returns only the 16-bit next descriptor
+      index in the *Available Ring*.  The index in the *Used Ring* is
+      controlled by the guest driver and can be read from the vring
+      structure in memory, so is not covered.
+
+    * For a packed virtqueue, neither index is explicitly available to
+      read from memory, so both indices (as maintained by the device) are
+      returned.
+
+  Consequently, the payload type is specific to the type of virt queue
+  (*a vring descriptor index for split virtqueues* vs. *vring descriptor
+  indices for packed virtqueues*).
 
-  Get the available vring base offset.
+  The request payload’s *num* field is currently reserved and must be
+  set to 0.
 
 ``VHOST_USER_SET_VRING_KICK``
   :id: 12