diff mbox

[RFC] virtio: add features qdev property

Message ID 20091213204341.GA25823@redhat.com
State New
Headers show

Commit Message

Michael S. Tsirkin Dec. 13, 2009, 8:43 p.m. UTC
Add features property to virtio. This makes it
possible to e.g. define machine without indirect
buffer support, which is required for 0.10
compatibility. or without hardware checksum
support, which is required for 0.11 compatibility.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Here's what I came up with for solving
the problem of differences between features
in virtio between 0.12 and 0.11 (applies
on to of guest_features patch).
Comments? Gerd, what do you think?


 hw/virtio-pci.c |   29 +++++++++++++++++++++++++++--
 hw/virtio.h     |    1 +
 2 files changed, 28 insertions(+), 2 deletions(-)

Comments

Gerd Hoffmann Dec. 14, 2009, 9:41 a.m. UTC | #1
On 12/13/09 21:43, Michael S. Tsirkin wrote:
> Add features property to virtio. This makes it
> possible to e.g. define machine without indirect
> buffer support, which is required for 0.10
> compatibility. or without hardware checksum
> support, which is required for 0.11 compatibility.

I'd suggest to add flags for the individual features to the drivers 
which actually use it instead, so you'll have

   -device virtio-net-pci,hw-checksum=0

and

   -device virtio-blk-pci,indirect-buffers=0

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 9:42 a.m. UTC | #2
On Mon, Dec 14, 2009 at 10:41:26AM +0100, Gerd Hoffmann wrote:
> On 12/13/09 21:43, Michael S. Tsirkin wrote:
>> Add features property to virtio. This makes it
>> possible to e.g. define machine without indirect
>> buffer support, which is required for 0.10
>> compatibility. or without hardware checksum
>> support, which is required for 0.11 compatibility.
>
> I'd suggest to add flags for the individual features to the drivers  
> which actually use it instead, so you'll have
>
>   -device virtio-net-pci,hw-checksum=0
>
> and
>
>   -device virtio-blk-pci,indirect-buffers=0
>
> cheers,
>   Gerd

Hmm. I hoped to avoid it, there are lots of features so it's a lot of
work and in practice, this will most likely be set by machine
description ...
Gerd Hoffmann Dec. 14, 2009, 10:24 a.m. UTC | #3
On 12/14/09 10:42, Michael S. Tsirkin wrote:
> On Mon, Dec 14, 2009 at 10:41:26AM +0100, Gerd Hoffmann wrote:
>> On 12/13/09 21:43, Michael S. Tsirkin wrote:
>>> Add features property to virtio. This makes it
>>> possible to e.g. define machine without indirect
>>> buffer support, which is required for 0.10
>>> compatibility. or without hardware checksum
>>> support, which is required for 0.11 compatibility.
>>
>> I'd suggest to add flags for the individual features to the drivers
>> which actually use it instead, so you'll have
>>
>>    -device virtio-net-pci,hw-checksum=0
>>
>> and
>>
>>    -device virtio-blk-pci,indirect-buffers=0
>>
>> cheers,
>>    Gerd
>
> Hmm. I hoped to avoid it, there are lots of features so it's a lot of
> work and in practice, this will most likely be set by machine
> description ...

MSI-X aka vectors property is already done this way, so I'd tend to 
continue this way.  It is also more user friendly.  Sure, these are most 
likely not used on a daily base by users, but being able to turn off -- 
say -- indirect buffers for testing and/or bug hunting reasons without 
having to construct magic hex numbers from virtio header files would be 
nice.

Can you give a list of features?  The patch description sounded like it 
is just the two listed above ...

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 11:10 a.m. UTC | #4
On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
> On 12/14/09 10:42, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 10:41:26AM +0100, Gerd Hoffmann wrote:
>>> On 12/13/09 21:43, Michael S. Tsirkin wrote:
>>>> Add features property to virtio. This makes it
>>>> possible to e.g. define machine without indirect
>>>> buffer support, which is required for 0.10
>>>> compatibility. or without hardware checksum
>>>> support, which is required for 0.11 compatibility.
>>>
>>> I'd suggest to add flags for the individual features to the drivers
>>> which actually use it instead, so you'll have
>>>
>>>    -device virtio-net-pci,hw-checksum=0
>>>
>>> and
>>>
>>>    -device virtio-blk-pci,indirect-buffers=0
>>>
>>> cheers,
>>>    Gerd
>>
>> Hmm. I hoped to avoid it, there are lots of features so it's a lot of
>> work and in practice, this will most likely be set by machine
>> description ...
>
> MSI-X aka vectors property is already done this way, so I'd tend to  
> continue this way.  It is also more user friendly.  Sure, these are most  
> likely not used on a daily base by users, but being able to turn off --  
> say -- indirect buffers for testing and/or bug hunting reasons without  
> having to construct magic hex numbers from virtio header files would be  
> nice.
>
> Can you give a list of features?  The patch description sounded like it  
> is just the two listed above ...
>
> cheers,
>   Gerd

Heh, it's a long list.
transport features (common to all):
	VIRTIO_F_NOTIFY_ON_EMPTY;
	VIRTIO_RING_F_INDIRECT_DESC;
	VIRTIO_F_BAD_FEATURE; <- not sure we need a flag for this

for net:
    uint32_t features = (1 << VIRTIO_NET_F_MAC) |
                        (1 << VIRTIO_NET_F_MRG_RXBUF) |
                        (1 << VIRTIO_NET_F_STATUS) |
                        (1 << VIRTIO_NET_F_CTRL_VQ) |
                        (1 << VIRTIO_NET_F_CTRL_RX) |
                        (1 << VIRTIO_NET_F_CTRL_VLAN) |
                        (1 << VIRTIO_NET_F_CTRL_RX_EXTRA);

    if (peer_has_vnet_hdr(n)) {
        tap_using_vnet_hdr(n->nic->nc.peer, 1);

        features |= (1 << VIRTIO_NET_F_CSUM);
        features |= (1 << VIRTIO_NET_F_HOST_TSO4);
        features |= (1 << VIRTIO_NET_F_HOST_TSO6);
        features |= (1 << VIRTIO_NET_F_HOST_ECN);

        features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
        features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
        features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
        features |= (1 << VIRTIO_NET_F_GUEST_ECN);

        if (peer_has_ufo(n)) {
            features |= (1 << VIRTIO_NET_F_GUEST_UFO);
            features |= (1 << VIRTIO_NET_F_HOST_UFO);
        }

for block:

   features |= (1 << VIRTIO_BLK_F_SEG_MAX);
    features |= (1 << VIRTIO_BLK_F_GEOMETRY);

    if (bdrv_enable_write_cache(s->bs))
        features |= (1 << VIRTIO_BLK_F_WCACHE);
#ifdef __linux__
    features |= (1 << VIRTIO_BLK_F_SCSI);
#endif
    if (strcmp(s->serial_str, "0"))
        features |= 1 << VIRTIO_BLK_F_IDENTIFY;

    if (bdrv_is_read_only(s->bs))
        features |= 1 << VIRTIO_BLK_F_RO;

I could try and group features, but this way we
loose in flexibility ...

How about I name properties exactly like virtio macros?  e.g.
VIRTIO_BLK_F_IDENTIFY etc?  This way maybe I can use preprocessor magic
to reduce duplication ...
Also, I'd like these things to be saved in bits and not add a ton
of fields in device. Ideas how to do this?
Gerd Hoffmann Dec. 14, 2009, 11:37 a.m. UTC | #5
On 12/14/09 12:10, Michael S. Tsirkin wrote:
> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
> for block:
>      if (strcmp(s->serial_str, "0"))
>          features |= 1<<  VIRTIO_BLK_F_IDENTIFY;
>
>      if (bdrv_is_read_only(s->bs))
>          features |= 1<<  VIRTIO_BLK_F_RO;

Sure you want these be configurable?

> Also, I'd like these things to be saved in bits and not add a ton
> of fields in device. Ideas how to do this?

I guess you only want disable features?

You could have a bitmap property then, which accepts names for the bits. 
  It would need a table like this ...

    char *bitnames[] = {
	[ VIRTIO_BLK_F_IDENTIFY ] = "blk-identify",
	[ VIRTIO_BLK_F_RO       [ = "blk-ro",
	[ ... ]
    };

Then the property parser would accepts strings such as 'bit1|bit2' and 
you can have

   -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'

The driver will just do 'vdev->host_features &= ~disable'.

cheers,
   Gerd
Alexander Graf Dec. 14, 2009, 11:50 a.m. UTC | #6
Am 14.12.2009 um 12:10 schrieb "Michael S. Tsirkin" <mst@redhat.com>:

> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
>> On 12/14/09 10:42, Michael S. Tsirkin wrote:
>>> On Mon, Dec 14, 2009 at 10:41:26AM +0100, Gerd Hoffmann wrote:
>>>> On 12/13/09 21:43, Michael S. Tsirkin wrote:
>>>>> Add features property to virtio. This makes it
>>>>> possible to e.g. define machine without indirect
>>>>> buffer support, which is required for 0.10
>>>>> compatibility. or without hardware checksum
>>>>> support, which is required for 0.11 compatibility.
>>>>
>>>> I'd suggest to add flags for the individual features to the drivers
>>>> which actually use it instead, so you'll have
>>>>
>>>>   -device virtio-net-pci,hw-checksum=0
>>>>
>>>> and
>>>>
>>>>   -device virtio-blk-pci,indirect-buffers=0
>>>>
>>>> cheers,
>>>>   Gerd
>>>
>>> Hmm. I hoped to avoid it, there are lots of features so it's a lot  
>>> of
>>> work and in practice, this will most likely be set by machine
>>> description ...
>>
>> MSI-X aka vectors property is already done this way, so I'd tend to
>> continue this way.  It is also more user friendly.  Sure, these are  
>> most
>> likely not used on a daily base by users, but being able to turn  
>> off --
>> say -- indirect buffers for testing and/or bug hunting reasons  
>> without
>> having to construct magic hex numbers from virtio header files  
>> would be
>> nice.
>>
>> Can you give a list of features?  The patch description sounded  
>> like it
>> is just the two listed above ...
>>
>> cheers,
>>  Gerd
>
> Heh, it's a long list.
> transport features (common to all):
>    VIRTIO_F_NOTIFY_ON_EMPTY;
>    VIRTIO_RING_F_INDIRECT_DESC;
>    VIRTIO_F_BAD_FEATURE; <- not sure we need a flag for this
>
> for net:
>    uint32_t features = (1 << VIRTIO_NET_F_MAC) |
>                        (1 << VIRTIO_NET_F_MRG_RXBUF) |
>                        (1 << VIRTIO_NET_F_STATUS) |
>                        (1 << VIRTIO_NET_F_CTRL_VQ) |
>                        (1 << VIRTIO_NET_F_CTRL_RX) |
>                        (1 << VIRTIO_NET_F_CTRL_VLAN) |
>                        (1 << VIRTIO_NET_F_CTRL_RX_EXTRA);
>
>    if (peer_has_vnet_hdr(n)) {
>        tap_using_vnet_hdr(n->nic->nc.peer, 1);
>
>        features |= (1 << VIRTIO_NET_F_CSUM);
>        features |= (1 << VIRTIO_NET_F_HOST_TSO4);
>        features |= (1 << VIRTIO_NET_F_HOST_TSO6);
>        features |= (1 << VIRTIO_NET_F_HOST_ECN);
>
>        features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
>        features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
>        features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
>        features |= (1 << VIRTIO_NET_F_GUEST_ECN);
>
>        if (peer_has_ufo(n)) {
>            features |= (1 << VIRTIO_NET_F_GUEST_UFO);
>            features |= (1 << VIRTIO_NET_F_HOST_UFO);
>        }
>
> for block:
>
>   features |= (1 << VIRTIO_BLK_F_SEG_MAX);
>    features |= (1 << VIRTIO_BLK_F_GEOMETRY);
>
>    if (bdrv_enable_write_cache(s->bs))
>        features |= (1 << VIRTIO_BLK_F_WCACHE);
> #ifdef __linux__
>    features |= (1 << VIRTIO_BLK_F_SCSI);
> #endif
>    if (strcmp(s->serial_str, "0"))
>        features |= 1 << VIRTIO_BLK_F_IDENTIFY;
>
>    if (bdrv_is_read_only(s->bs))
>        features |= 1 << VIRTIO_BLK_F_RO;
>
> I could try and group features, but this way we
> loose in flexibility ...
>
> How about I name properties exactly like virtio macros?  e.g.
> VIRTIO_BLK_F_IDENTIFY etc?  This way maybe I can use preprocessor  
> magic
> to reduce duplication ...

It would even be great to only have a single point to add a feature  
bit to.

So instead of

#define VIRTIO_BLK_F_IDENTIFY 1

You'd do

VIRTIO_FEATURE(BLK, IDENTIFY, 1)

which would add the feature including a lower case representation of  
the same to an array you could use to evaluate features from.

By having magic like this we'd guarantee that new features will also  
get exposed.

> Also, I'd like these things to be saved in bits and not add a ton
> of fields in device. Ideas how to do this?

Why?

But if you really want to because you want to save ram, just use int x: 
1 style representations.

If you want to automatically construct a bitmap to compare to out of  
the cmdline given feature bits, use the array I described above to  
generate the bitmap in the init function.

Alex
>
Michael S. Tsirkin Dec. 14, 2009, 1:15 p.m. UTC | #7
On Mon, Dec 14, 2009 at 12:37:29PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 12:10, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
>> for block:
>>      if (strcmp(s->serial_str, "0"))
>>          features |= 1<<  VIRTIO_BLK_F_IDENTIFY;
>>
>>      if (bdrv_is_read_only(s->bs))
>>          features |= 1<<  VIRTIO_BLK_F_RO;
>
> Sure you want these be configurable?
>
>> Also, I'd like these things to be saved in bits and not add a ton
>> of fields in device. Ideas how to do this?
>
> I guess you only want disable features?
>
> You could have a bitmap property then, which accepts names for the bits.  
>  It would need a table like this ...
>
>    char *bitnames[] = {
> 	[ VIRTIO_BLK_F_IDENTIFY ] = "blk-identify",
> 	[ VIRTIO_BLK_F_RO       [ = "blk-ro",
> 	[ ... ]
>    };
>
> Then the property parser would accepts strings such as 'bit1|bit2' and  
> you can have
>
>   -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
>
> The driver will just do 'vdev->host_features &= ~disable'.
>
> cheers,
>   Gerd


Excellent.
Markus Armbruster Dec. 14, 2009, 1:30 p.m. UTC | #8
Gerd Hoffmann <kraxel@redhat.com> writes:

> On 12/14/09 12:10, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
>> for block:
>>      if (strcmp(s->serial_str, "0"))
>>          features |= 1<<  VIRTIO_BLK_F_IDENTIFY;
>>
>>      if (bdrv_is_read_only(s->bs))
>>          features |= 1<<  VIRTIO_BLK_F_RO;
>
> Sure you want these be configurable?
>
>> Also, I'd like these things to be saved in bits and not add a ton
>> of fields in device. Ideas how to do this?
>
> I guess you only want disable features?
>
> You could have a bitmap property then, which accepts names for the
> bits. It would need a table like this ...
>
>    char *bitnames[] = {
> 	[ VIRTIO_BLK_F_IDENTIFY ] = "blk-identify",
> 	[ VIRTIO_BLK_F_RO       [ = "blk-ro",
> 	[ ... ]
>    };
>
> Then the property parser would accepts strings such as 'bit1|bit2' and
> you can have
>
>   -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
>
> The driver will just do 'vdev->host_features &= ~disable'.

Use of '|' in option argument syntax is user-hostile, because it
requires quoting in the shell.  What about '+'?
Michael S. Tsirkin Dec. 14, 2009, 1:59 p.m. UTC | #9
On Mon, Dec 14, 2009 at 02:30:19PM +0100, Markus Armbruster wrote:
> Gerd Hoffmann <kraxel@redhat.com> writes:
> 
> > On 12/14/09 12:10, Michael S. Tsirkin wrote:
> >> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
> >> for block:
> >>      if (strcmp(s->serial_str, "0"))
> >>          features |= 1<<  VIRTIO_BLK_F_IDENTIFY;
> >>
> >>      if (bdrv_is_read_only(s->bs))
> >>          features |= 1<<  VIRTIO_BLK_F_RO;
> >
> > Sure you want these be configurable?
> >
> >> Also, I'd like these things to be saved in bits and not add a ton
> >> of fields in device. Ideas how to do this?
> >
> > I guess you only want disable features?

It's not an easy quesiton.
If we do it as disable bits, then we get incompatible
machines when running on different hosts.
Would it be better to fail instead?

> > You could have a bitmap property then, which accepts names for the
> > bits. It would need a table like this ...
> >
> >    char *bitnames[] = {
> > 	[ VIRTIO_BLK_F_IDENTIFY ] = "blk-identify",
> > 	[ VIRTIO_BLK_F_RO       [ = "blk-ro",
> > 	[ ... ]
> >    };
> >
> > Then the property parser would accepts strings such as 'bit1|bit2' and
> > you can have
> >
> >   -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
> >
> > The driver will just do 'vdev->host_features &= ~disable'.
> 
> Use of '|' in option argument syntax is user-hostile, because it
> requires quoting in the shell.  What about '+'?

I am guessing that '|' parsing is already there,
but yes + would be a nice touch.
Gerd Hoffmann Dec. 14, 2009, 2:56 p.m. UTC | #10
On 12/14/09 14:30, Markus Armbruster wrote:
>> Then the property parser would accepts strings such as 'bit1|bit2' and
>> you can have
>>
>>    -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
>>
>> The driver will just do 'vdev->host_features&= ~disable'.
>
> Use of '|' in option argument syntax is user-hostile, because it
> requires quoting in the shell.  What about '+'?

I don't care that much.  I've picked '|' because it is 'or' in many 
programming languanges and thus sort of intuitive (at least to me). 
Using '+' is fine with me too.

cheers,
   Gerd
Gerd Hoffmann Dec. 14, 2009, 3:01 p.m. UTC | #11
On 12/14/09 14:59, Michael S. Tsirkin wrote:
> It's not an easy quesiton.
> If we do it as disable bits, then we get incompatible
> machines when running on different hosts.

In case that one host supports feature which the other doesn't and the 
feature isn't masked out?  Well, management failure I'd say.  The whole 
point of the compat machine types (and properties used there) is to make 
sure the vm's are compatible even in case the hosts run different 
software versions.

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 4:23 p.m. UTC | #12
On Mon, Dec 14, 2009 at 04:01:33PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 14:59, Michael S. Tsirkin wrote:
>> It's not an easy quesiton.
>> If we do it as disable bits, then we get incompatible
>> machines when running on different hosts.
>
> In case that one host supports feature which the other doesn't and the  
> feature isn't masked out?  Well, management failure I'd say.  The whole  
> point of the compat machine types (and properties used there) is to make  
> sure the vm's are compatible even in case the hosts run different  
> software versions.
>
> cheers,
>   Gerd


So how do you do this?
Assume we have -disable_hw_csum.
We want new machine type to have it off, right?
But now you run qemu on host which does
not support hw_csum. With your suggestion
it will not enable hw csum?
Gerd Hoffmann Dec. 14, 2009, 5:18 p.m. UTC | #13
On 12/14/09 17:23, Michael S. Tsirkin wrote:
> On Mon, Dec 14, 2009 at 04:01:33PM +0100, Gerd Hoffmann wrote:
> So how do you do this?
> Assume we have -disable_hw_csum.
> We want new machine type to have it off, right?
> But now you run qemu on host which does
> not support hw_csum. With your suggestion
> it will not enable hw csum?

I have trouble getting the setup you are talking about ...

Sounds like hw_csum could be enabled/disabled depending on the hardware 
capabilities on the host.  Is that correct?

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 7:17 p.m. UTC | #14
On Mon, Dec 14, 2009 at 06:18:29PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 17:23, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 04:01:33PM +0100, Gerd Hoffmann wrote:
>> So how do you do this?
>> Assume we have -disable_hw_csum.
>> We want new machine type to have it off, right?
>> But now you run qemu on host which does
>> not support hw_csum. With your suggestion
>> it will not enable hw csum?
>
> I have trouble getting the setup you are talking about ...
>
> Sounds like hw_csum could be enabled/disabled depending on the hardware  
> capabilities on the host.  Is that correct?
>
> cheers,
>   Gerd

This currently depends on version of tun driver in the host.
Michael S. Tsirkin Dec. 14, 2009, 7:20 p.m. UTC | #15
On Mon, Dec 14, 2009 at 12:37:29PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 12:10, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 11:24:41AM +0100, Gerd Hoffmann wrote:
>> for block:
>>      if (strcmp(s->serial_str, "0"))
>>          features |= 1<<  VIRTIO_BLK_F_IDENTIFY;
>>
>>      if (bdrv_is_read_only(s->bs))
>>          features |= 1<<  VIRTIO_BLK_F_RO;
>
> Sure you want these be configurable?
>
>> Also, I'd like these things to be saved in bits and not add a ton
>> of fields in device. Ideas how to do this?
>
> I guess you only want disable features?
>
> You could have a bitmap property then, which accepts names for the bits.  
>  It would need a table like this ...
>
>    char *bitnames[] = {
> 	[ VIRTIO_BLK_F_IDENTIFY ] = "blk-identify",
> 	[ VIRTIO_BLK_F_RO       [ = "blk-ro",
> 	[ ... ]
>    };
>
> Then the property parser would accepts strings such as 'bit1|bit2' and  
> you can have
>
>   -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
>
> The driver will just do 'vdev->host_features &= ~disable'.
>
> cheers,
>   Gerd

Is there an example of an existing property that is like this?
Gerd Hoffmann Dec. 14, 2009, 8:40 p.m. UTC | #16
On 12/14/09 20:17, Michael S. Tsirkin wrote:
> On Mon, Dec 14, 2009 at 06:18:29PM +0100, Gerd Hoffmann wrote:
>> On 12/14/09 17:23, Michael S. Tsirkin wrote:
>>> On Mon, Dec 14, 2009 at 04:01:33PM +0100, Gerd Hoffmann wrote:
>>> So how do you do this?
>>> Assume we have -disable_hw_csum.
>>> We want new machine type to have it off, right?
>>> But now you run qemu on host which does
>>> not support hw_csum. With your suggestion
>>> it will not enable hw csum?
>>
>> I have trouble getting the setup you are talking about ...
>>
>> Sounds like hw_csum could be enabled/disabled depending on the hardware
>> capabilities on the host.  Is that correct?
>>
>> cheers,
>>    Gerd
>
> This currently depends on version of tun driver in the host.

So this has nothing to do with -M pc-0.11 backward compatibility, right? 
  You'll hit this when migrating 0.12 -> 0.12 with different host 
kernels, right?  And the common bit here is that both issues can be 
handled by configuring virtio feature bits via properties, right?

What was the question again?

cheers,
   Gerd
Gerd Hoffmann Dec. 14, 2009, 8:42 p.m. UTC | #17
>>
>>    -device 'virtio-blk-pci,disable=blk-identify|ring-indirect'
>>
>> The driver will just do 'vdev->host_features&= ~disable'.
>>
>> cheers,
>>    Gerd
>
> Is there an example of an existing property that is like this?

No.  A new property type is needed for this.

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 8:43 p.m. UTC | #18
On Mon, Dec 14, 2009 at 09:40:28PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 20:17, Michael S. Tsirkin wrote:
>> On Mon, Dec 14, 2009 at 06:18:29PM +0100, Gerd Hoffmann wrote:
>>> On 12/14/09 17:23, Michael S. Tsirkin wrote:
>>>> On Mon, Dec 14, 2009 at 04:01:33PM +0100, Gerd Hoffmann wrote:
>>>> So how do you do this?
>>>> Assume we have -disable_hw_csum.
>>>> We want new machine type to have it off, right?
>>>> But now you run qemu on host which does
>>>> not support hw_csum. With your suggestion
>>>> it will not enable hw csum?
>>>
>>> I have trouble getting the setup you are talking about ...
>>>
>>> Sounds like hw_csum could be enabled/disabled depending on the hardware
>>> capabilities on the host.  Is that correct?
>>>
>>> cheers,
>>>    Gerd
>>
>> This currently depends on version of tun driver in the host.
>
> So this has nothing to do with -M pc-0.11 backward compatibility, right?  

Yes, this does. With 0.11 you must not expose this even
if supported by host. Otherwise you wnt be able to migrate to 0.11.

>  You'll hit this when migrating 0.12 -> 0.12 with different host  
> kernels, right?  And the common bit here is that both issues can be  
> handled by configuring virtio feature bits via properties, right?

Yes.

> What was the question again?
>
> cheers,
>   Gerd


What do we put in e.g. 0.11 compat? Any features we enable
there might not be supported by host.
Gerd Hoffmann Dec. 14, 2009, 9:12 p.m. UTC | #19
On 12/14/09 21:43, Michael S. Tsirkin wrote:
> What do we put in e.g. 0.11 compat? Any features we enable
> there might not be supported by host.

compat properties as usual?

Sorry, I still fail to see your problem.

You'll have a 'disable' bitmap.  Fill it via 'features=<bitmap-prop>'. 
Or using separate properties for each feature.

qemu goes figure host_features.  When done it masks out the features 
disabled via properties.  The result is used as final feature bitmap.

-M pc-0.11 will disable hw_checksum
management software can do it too via -device virtio-net-pci,... if some 
machines in the pool don't support it.  And of course qemu will disable 
it on its own in case the host kernel doesn't support it.

cheers,
   Gerd
Michael S. Tsirkin Dec. 14, 2009, 9:14 p.m. UTC | #20
On Mon, Dec 14, 2009 at 10:12:14PM +0100, Gerd Hoffmann wrote:
> On 12/14/09 21:43, Michael S. Tsirkin wrote:
>> What do we put in e.g. 0.11 compat? Any features we enable
>> there might not be supported by host.
>
> compat properties as usual?
>
> Sorry, I still fail to see your problem.
>
> You'll have a 'disable' bitmap.  Fill it via 'features=<bitmap-prop>'.  
> Or using separate properties for each feature.
> qemu goes figure host_features.  When done it masks out the features  
> disabled via properties.  The result is used as final feature bitmap.
>
> -M pc-0.11 will disable hw_checksum
> management software can do it too via -device virtio-net-pci,... if some  
> machines in the pool don't support it.  And of course qemu will disable  
> it on its own in case the host kernel doesn't support it.
>
> cheers,
>   Gerd

management is always behind :).  Imagine a new and improved qemu. I run
old management.  I expect to see old properties, but old management can
not disable new ones :)
Gerd Hoffmann Dec. 14, 2009, 9:24 p.m. UTC | #21
On 12/14/09 22:14, Michael S. Tsirkin wrote:
> On Mon, Dec 14, 2009 at 10:12:14PM +0100, Gerd Hoffmann wrote:
>> On 12/14/09 21:43, Michael S. Tsirkin wrote:
>>> What do we put in e.g. 0.11 compat? Any features we enable
>>> there might not be supported by host.
>>
>> compat properties as usual?
>>
>> Sorry, I still fail to see your problem.
>>
>> You'll have a 'disable' bitmap.  Fill it via 'features=<bitmap-prop>'.
>> Or using separate properties for each feature.
>> qemu goes figure host_features.  When done it masks out the features
>> disabled via properties.  The result is used as final feature bitmap.
>>
>> -M pc-0.11 will disable hw_checksum
>> management software can do it too via -device virtio-net-pci,... if some
>> machines in the pool don't support it.  And of course qemu will disable
>> it on its own in case the host kernel doesn't support it.
>>
>> cheers,
>>    Gerd
>
> management is always behind :).  Imagine a new and improved qemu. I run
> old management.  I expect to see old properties, but old management can
> not disable new ones :)

If management can't deal with 0.12 features it should use -M pc-0.11

cheers,
   Gerd
diff mbox

Patch

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 80bc645..43b02b6 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -90,6 +90,7 @@  typedef struct {
     uint32_t addr;
     uint32_t class_code;
     uint32_t nvectors;
+    uint32_t host_features;
     DriveInfo *dinfo;
     NICConf nic;
 } VirtIOPCIProxy;
@@ -235,8 +236,7 @@  static uint32_t virtio_ioport_read(VirtIOPCIProxy *proxy, uint32_t addr)
 
     switch (addr) {
     case VIRTIO_PCI_HOST_FEATURES:
-        ret = vdev->get_features(vdev);
-        ret |= vdev->binding->get_features(proxy);
+        ret = vdev->host_features;
         break;
     case VIRTIO_PCI_GUEST_FEATURES:
         ret = vdev->guest_features;
@@ -398,6 +398,8 @@  static const VirtIOBindings virtio_pci_bindings = {
     .get_features = virtio_pci_get_features,
 };
 
+#define VIRTIO_PCI_NO_FEATURES 0xffffffff
+
 static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
                             uint16_t vendor, uint16_t device,
                             uint16_t class_code, uint8_t pif)
@@ -442,6 +444,18 @@  static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
                            virtio_map);
 
     virtio_bind_device(vdev, &virtio_pci_bindings, proxy);
+    if (proxy->host_features == VIRTIO_PCI_NO_FEATURES)
+        proxy->host_features = vdev->get_features(vdev) |
+                vdev->binding->get_features(proxy);
+    else if (proxy->host_features & ~vdev->get_features(vdev) &
+             ~vdev->binding->get_features(proxy)) {
+        fprintf(stderr, "Requested host features 0x%x, "
+                "but supported features are transport:0x%x device:0x%x\n",
+                proxy->host_features,
+                vdev->binding->get_features(proxy),
+                vdev->get_features(vdev));
+        exit(1);
+    }
 }
 
 static int virtio_blk_init_pci(PCIDevice *pci_dev)
@@ -561,6 +575,8 @@  static PCIDeviceInfo virtio_info[] = {
             DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
             DEFINE_PROP_DRIVE("drive", VirtIOPCIProxy, dinfo),
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
+            DEFINE_PROP_HEX32("features", VirtIOPCIProxy, host_features,
+                              VIRTIO_PCI_NO_FEATURES),
             DEFINE_PROP_END_OF_LIST(),
         },
         .qdev.reset = virtio_pci_reset,
@@ -571,6 +587,8 @@  static PCIDeviceInfo virtio_info[] = {
         .exit       = virtio_net_exit_pci,
         .qdev.props = (Property[]) {
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
+            DEFINE_PROP_HEX32("features", VirtIOPCIProxy, host_features,
+                              0xffffffff),
             DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
             DEFINE_PROP_END_OF_LIST(),
         },
@@ -582,6 +600,8 @@  static PCIDeviceInfo virtio_info[] = {
         .exit      = virtio_exit_pci,
         .qdev.props = (Property[]) {
             DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
+            DEFINE_PROP_HEX32("features", VirtIOPCIProxy, host_features,
+                              VIRTIO_PCI_NO_FEATURES),
             DEFINE_PROP_END_OF_LIST(),
         },
         .qdev.reset = virtio_pci_reset,
@@ -590,6 +610,11 @@  static PCIDeviceInfo virtio_info[] = {
         .qdev.size = sizeof(VirtIOPCIProxy),
         .init      = virtio_balloon_init_pci,
         .exit      = virtio_exit_pci,
+        .qdev.props = (Property[]) {
+            DEFINE_PROP_HEX32("features", VirtIOPCIProxy, host_features,
+                              VIRTIO_PCI_NO_FEATURES),
+            DEFINE_PROP_END_OF_LIST(),
+        },
         .qdev.reset = virtio_pci_reset,
     },{
         /* end of list */
diff --git a/hw/virtio.h b/hw/virtio.h
index 85ef171..73f784f 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -101,6 +101,7 @@  struct VirtIODevice
     uint8_t isr;
     uint16_t queue_sel;
     uint32_t guest_features;
+    uint32_t host_features;
     size_t config_len;
     void *config;
     uint16_t config_vector;