
[v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask

Message ID 20240117132039.332273-1-eric.auger@redhat.com
State New
Series [v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask

Commit Message

Eric Auger Jan. 17, 2024, 1:20 p.m. UTC
We used to set the default page_size_mask to qemu_target_page_mask(),
but with VFIO assignment it makes more sense to use the actual host
page mask instead.

So from now on qemu_real_host_page_mask() will be used as the default.
To stay migration-compatible with older code, we bump the vmstate
version_id to 3, and if an older incoming v2 stream is detected we
restore the previous default value.

The new default is well suited to configs where host and guest have
the same page size. This fixes hotplugging VFIO devices on a
64kB guest and a 64kB host. This test case used to fail and even
crash qemu with hw_error("vfio: DMA mapping failed, unable to
continue") in VFIO common code. Indeed the hot-attached VFIO
device would call memory_region_iommu_set_page_size_mask() with a 64kB
mask after the granule had already been frozen to 4kB at machine init
done. Now this works. However the new default will prevent a 4kB guest
on a 64kB host, because the granule will be set to 64kB, which is
larger than the guest page size. In that situation, the virtio-iommu
driver fails in viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".

The current limitation of a global granule in the virtio-iommu
should be removed and turned into a per-domain granule. But
until we get that upgrade, this new default is probably
better, because I don't think anyone is currently interested in
running a 4kB page size guest with virtio-iommu on a 64kB host,
whereas supporting a 64kB guest on a 64kB host with virtio-iommu and
VFIO looks like a more important feature.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

---

v1 -> v2:
- fixed 2 typos in the commit msg and added Jean's R-b and T-b
---
 hw/virtio/virtio-iommu.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Michael S. Tsirkin Feb. 13, 2024, 9:43 a.m. UTC | #1
On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
> We used to set default page_size_mask to qemu_target_page_mask() but
> with VFIO assignment it makes more sense to use the actual host page mask
> instead.
> 
> So from now on qemu_real_host_page_mask() will be used as a default.
> To be able to migrate older code, we increase the vmstat version_id
> to 3 and if an older incoming v2 stream is detected we set the previous
> default value.
> 
> The new default is well adapted to configs where host and guest have
> the same page size. This allows to fix hotplugging VFIO devices on a
> 64kB guest and a 64kB host. This test case has been failing before
> and even crashing qemu with hw_error("vfio: DMA mapping failed,
> unable to continue") in VFIO common). Indeed the hot-attached VFIO
> device would call memory_region_iommu_set_page_size_mask with 64kB
> mask whereas after the granule was frozen to 4kB on machine init done.
> Now this works. However the new default will prevent 4kB guest on
> 64kB host because the granule will be set to 64kB which would be
> larger than the guest page size. In that situation, the virtio-iommu
> driver fails on viommu_domain_finalise() with
> "granule 0x10000 larger than system page size 0x1000".
> 
> The current limitation of global granule in the virtio-iommu
> should be removed and turned into per domain granule. But
> until we get this upgraded, this new default is probably
> better because I don't think anyone is currently interested in
> running a 4kB page size guest with virtio-iommu on a 64kB host.
> However supporting 64kB guest on 64kB host with virtio-iommu and
> VFIO looks a more important feature.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

What about migration compatibility? In particular, cross-version one?
Don't we need compat machinery for this?

> ---
> 
> v1 -> v2:
> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
> ---
>  hw/virtio/virtio-iommu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 8a4bd933c6..ec2ba11d1d 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>       * in vfio realize
>       */
>      s->config.bypass = s->boot_bypass;
> -    s->config.page_size_mask = qemu_target_page_mask();
> +    s->config.page_size_mask = qemu_real_host_page_mask();
>      s->config.input_range.end = UINT64_MAX;
>      s->config.domain_range.end = UINT32_MAX;
>      s->config.probe_size = VIOMMU_PROBE_SIZE;
> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>       * still correct.
>       */
>      virtio_iommu_switch_address_space_all(s);
> +    if (version_id <= 2) {
> +        s->config.page_size_mask = qemu_target_page_mask();
> +    }
>      return 0;
>  }
>  
>  static const VMStateDescription vmstate_virtio_iommu_device = {
>      .name = "virtio-iommu-device",
>      .minimum_version_id = 2,
> -    .version_id = 2,
> +    .version_id = 3,
>      .post_load = iommu_post_load,
>      .fields = (const VMStateField[]) {
>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
> -- 
> 2.41.0
Eric Auger Feb. 13, 2024, 10:32 a.m. UTC | #2
Hi Michael,

On 2/13/24 10:43, Michael S. Tsirkin wrote:
> On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
>> We used to set default page_size_mask to qemu_target_page_mask() but
>> with VFIO assignment it makes more sense to use the actual host page mask
>> instead.
>>
>> So from now on qemu_real_host_page_mask() will be used as a default.
>> To be able to migrate older code, we increase the vmstat version_id
>> to 3 and if an older incoming v2 stream is detected we set the previous
>> default value.
>>
>> The new default is well adapted to configs where host and guest have
>> the same page size. This allows to fix hotplugging VFIO devices on a
>> 64kB guest and a 64kB host. This test case has been failing before
>> and even crashing qemu with hw_error("vfio: DMA mapping failed,
>> unable to continue") in VFIO common). Indeed the hot-attached VFIO
>> device would call memory_region_iommu_set_page_size_mask with 64kB
>> mask whereas after the granule was frozen to 4kB on machine init done.
>> Now this works. However the new default will prevent 4kB guest on
>> 64kB host because the granule will be set to 64kB which would be
>> larger than the guest page size. In that situation, the virtio-iommu
>> driver fails on viommu_domain_finalise() with
>> "granule 0x10000 larger than system page size 0x1000".
>>
>> The current limitation of global granule in the virtio-iommu
>> should be removed and turned into per domain granule. But
>> until we get this upgraded, this new default is probably
>> better because I don't think anyone is currently interested in
>> running a 4kB page size guest with virtio-iommu on a 64kB host.
>> However supporting 64kB guest on 64kB host with virtio-iommu and
>> VFIO looks a more important feature.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> What about migration compatibility? In particular, cross-version one?
> Don't we need compat machinery for this?
See below
>
>> ---
>>
>> v1 -> v2:
>> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
>> ---
>>  hw/virtio/virtio-iommu.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 8a4bd933c6..ec2ba11d1d 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>       * in vfio realize
>>       */
>>      s->config.bypass = s->boot_bypass;
>> -    s->config.page_size_mask = qemu_target_page_mask();
>> +    s->config.page_size_mask = qemu_real_host_page_mask();
>>      s->config.input_range.end = UINT64_MAX;
>>      s->config.domain_range.end = UINT32_MAX;
>>      s->config.probe_size = VIOMMU_PROBE_SIZE;
>> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>>       * still correct.
>>       */
>>      virtio_iommu_switch_address_space_all(s);
>> +    if (version_id <= 2) {
>> +        s->config.page_size_mask = qemu_target_page_mask();
I tested migration from v2 -> v3 and the above code overrides the
new default with the old one.

Do you have any other concern?

Thanks

Eric
>> +    }
>>      return 0;
>>  }
>>  
>>  static const VMStateDescription vmstate_virtio_iommu_device = {
>>      .name = "virtio-iommu-device",
>>      .minimum_version_id = 2,
>> -    .version_id = 2,
>> +    .version_id = 3,
>>      .post_load = iommu_post_load,
>>      .fields = (const VMStateField[]) {
>>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
>> -- 
>> 2.41.0
Michael S. Tsirkin Feb. 13, 2024, 11:07 a.m. UTC | #3
On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
> Hi Michael,
> 
> On 2/13/24 10:43, Michael S. Tsirkin wrote:
> > On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
> >> We used to set default page_size_mask to qemu_target_page_mask() but
> >> with VFIO assignment it makes more sense to use the actual host page mask
> >> instead.
> >>
> >> So from now on qemu_real_host_page_mask() will be used as a default.
> >> To be able to migrate older code, we increase the vmstat version_id
> >> to 3 and if an older incoming v2 stream is detected we set the previous
> >> default value.
> >>
> >> The new default is well adapted to configs where host and guest have
> >> the same page size. This allows to fix hotplugging VFIO devices on a
> >> 64kB guest and a 64kB host. This test case has been failing before
> >> and even crashing qemu with hw_error("vfio: DMA mapping failed,
> >> unable to continue") in VFIO common). Indeed the hot-attached VFIO
> >> device would call memory_region_iommu_set_page_size_mask with 64kB
> >> mask whereas after the granule was frozen to 4kB on machine init done.
> >> Now this works. However the new default will prevent 4kB guest on
> >> 64kB host because the granule will be set to 64kB which would be
> >> larger than the guest page size. In that situation, the virtio-iommu
> >> driver fails on viommu_domain_finalise() with
> >> "granule 0x10000 larger than system page size 0x1000".
> >>
> >> The current limitation of global granule in the virtio-iommu
> >> should be removed and turned into per domain granule. But
> >> until we get this upgraded, this new default is probably
> >> better because I don't think anyone is currently interested in
> >> running a 4kB page size guest with virtio-iommu on a 64kB host.
> >> However supporting 64kB guest on 64kB host with virtio-iommu and
> >> VFIO looks a more important feature.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > What about migration compatibility? In particular, cross-version one?
> > Don't we need compat machinery for this?
> See below
> >
> >> ---
> >>
> >> v1 -> v2:
> >> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
> >> ---
> >>  hw/virtio/virtio-iommu.c | 7 +++++--
> >>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> >> index 8a4bd933c6..ec2ba11d1d 100644
> >> --- a/hw/virtio/virtio-iommu.c
> >> +++ b/hw/virtio/virtio-iommu.c
> >> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> >>       * in vfio realize
> >>       */
> >>      s->config.bypass = s->boot_bypass;
> >> -    s->config.page_size_mask = qemu_target_page_mask();
> >> +    s->config.page_size_mask = qemu_real_host_page_mask();
> >>      s->config.input_range.end = UINT64_MAX;
> >>      s->config.domain_range.end = UINT32_MAX;
> >>      s->config.probe_size = VIOMMU_PROBE_SIZE;
> >> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
> >>       * still correct.
> >>       */
> >>      virtio_iommu_switch_address_space_all(s);
> >> +    if (version_id <= 2) {
> >> +        s->config.page_size_mask = qemu_target_page_mask();
> I tested migration from v2 -> v3 and the above code is overriding the
> new default by the older one.
> 
> Do you have an other concern?
> 
> Thanks
> 
> Eric



> >> +    }
> >>      return 0;
> >>  }
> >>  
> >>  static const VMStateDescription vmstate_virtio_iommu_device = {
> >>      .name = "virtio-iommu-device",
> >>      .minimum_version_id = 2,
> >> -    .version_id = 2,
> >> +    .version_id = 3,
> >>      .post_load = iommu_post_load,
> >>      .fields = (const VMStateField[]) {
> >>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,

Oh, I missed the version change. But then migration to an older version
is completely broken, isn't it? Old qemu cannot handle version_id 3 at
all.


Generally, compat machinery is nicer than the old version hacks.


> >> -- 
> >> 2.41.0
Michael S. Tsirkin Feb. 13, 2024, 11:09 a.m. UTC | #4
On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
> Do you have an other concern?

I also worry a bit about migrating between hosts with different
page sizes. Not with kvm, I am guessing, but with tcg it does work, I
think? Is this just for vfio and vdpa? Can we limit this to those
setups, maybe?
Eric Auger Feb. 13, 2024, 11:19 a.m. UTC | #5
On 2/13/24 12:07, Michael S. Tsirkin wrote:
> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
>> Hi Michael,
>>
>> On 2/13/24 10:43, Michael S. Tsirkin wrote:
>>> On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
>>>> We used to set default page_size_mask to qemu_target_page_mask() but
>>>> with VFIO assignment it makes more sense to use the actual host page mask
>>>> instead.
>>>>
>>>> So from now on qemu_real_host_page_mask() will be used as a default.
>>>> To be able to migrate older code, we increase the vmstat version_id
>>>> to 3 and if an older incoming v2 stream is detected we set the previous
>>>> default value.
>>>>
>>>> The new default is well adapted to configs where host and guest have
>>>> the same page size. This allows to fix hotplugging VFIO devices on a
>>>> 64kB guest and a 64kB host. This test case has been failing before
>>>> and even crashing qemu with hw_error("vfio: DMA mapping failed,
>>>> unable to continue") in VFIO common). Indeed the hot-attached VFIO
>>>> device would call memory_region_iommu_set_page_size_mask with 64kB
>>>> mask whereas after the granule was frozen to 4kB on machine init done.
>>>> Now this works. However the new default will prevent 4kB guest on
>>>> 64kB host because the granule will be set to 64kB which would be
>>>> larger than the guest page size. In that situation, the virtio-iommu
>>>> driver fails on viommu_domain_finalise() with
>>>> "granule 0x10000 larger than system page size 0x1000".
>>>>
>>>> The current limitation of global granule in the virtio-iommu
>>>> should be removed and turned into per domain granule. But
>>>> until we get this upgraded, this new default is probably
>>>> better because I don't think anyone is currently interested in
>>>> running a 4kB page size guest with virtio-iommu on a 64kB host.
>>>> However supporting 64kB guest on 64kB host with virtio-iommu and
>>>> VFIO looks a more important feature.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> What about migration compatibility? In particular, cross-version one?
>>> Don't we need compat machinery for this?
>> See below
>>>> ---
>>>>
>>>> v1 -> v2:
>>>> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
>>>> ---
>>>>  hw/virtio/virtio-iommu.c | 7 +++++--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>> index 8a4bd933c6..ec2ba11d1d 100644
>>>> --- a/hw/virtio/virtio-iommu.c
>>>> +++ b/hw/virtio/virtio-iommu.c
>>>> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>>>       * in vfio realize
>>>>       */
>>>>      s->config.bypass = s->boot_bypass;
>>>> -    s->config.page_size_mask = qemu_target_page_mask();
>>>> +    s->config.page_size_mask = qemu_real_host_page_mask();
>>>>      s->config.input_range.end = UINT64_MAX;
>>>>      s->config.domain_range.end = UINT32_MAX;
>>>>      s->config.probe_size = VIOMMU_PROBE_SIZE;
>>>> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>>>>       * still correct.
>>>>       */
>>>>      virtio_iommu_switch_address_space_all(s);
>>>> +    if (version_id <= 2) {
>>>> +        s->config.page_size_mask = qemu_target_page_mask();
>> I tested migration from v2 -> v3 and the above code is overriding the
>> new default by the older one.
>>
>> Do you have an other concern?
>>
>> Thanks
>>
>> Eric
>
>
>>>> +    }
>>>>      return 0;
>>>>  }
>>>>  
>>>>  static const VMStateDescription vmstate_virtio_iommu_device = {
>>>>      .name = "virtio-iommu-device",
>>>>      .minimum_version_id = 2,
>>>> -    .version_id = 2,
>>>> +    .version_id = 3,
>>>>      .post_load = iommu_post_load,
>>>>      .fields = (const VMStateField[]) {
>>>>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
> Oh I missed the version change. But then migration  to older version
> is completely broken isn't it? Old qemu can not handle version_id 3 at
> all.
Indeed, I considered that migrating backward was not that important. Do
you consider this mandatory?

Eric
>
> Generally, compat machinery is nicer than the old version hacks.
>
>
>>>> -- 
>>>> 2.41.0
>
Eric Auger Feb. 13, 2024, 11:24 a.m. UTC | #6
Hi Michael,
On 2/13/24 12:09, Michael S. Tsirkin wrote:
> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
>> Do you have an other concern?
> I also worry a bit about migrating between hosts with different
> page sizes. Not with kvm I am guessing but with tcg it does work I think?
I have never tried but is it a valid use case? Adding Peter in CC.
> Is this just for vfio and vdpa? Can we limit this to these setups
> maybe?
I am afraid we would learn the actual use case too late. If the VFIO
device is hotplugged, we have already started working with a 4kB granule.

The other way would be to introduce a min_granule option, as done for
aw-bits. But that is heavier.

Thanks

Eric
>
Michael S. Tsirkin Feb. 13, 2024, 11:39 a.m. UTC | #7
On Tue, Feb 13, 2024 at 12:19:16PM +0100, Eric Auger wrote:
> 
> 
> On 2/13/24 12:07, Michael S. Tsirkin wrote:
> > On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
> >> Hi Michael,
> >>
> >> On 2/13/24 10:43, Michael S. Tsirkin wrote:
> >>> On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
> >>>> We used to set default page_size_mask to qemu_target_page_mask() but
> >>>> with VFIO assignment it makes more sense to use the actual host page mask
> >>>> instead.
> >>>>
> >>>> So from now on qemu_real_host_page_mask() will be used as a default.
> >>>> To be able to migrate older code, we increase the vmstat version_id
> >>>> to 3 and if an older incoming v2 stream is detected we set the previous
> >>>> default value.
> >>>>
> >>>> The new default is well adapted to configs where host and guest have
> >>>> the same page size. This allows to fix hotplugging VFIO devices on a
> >>>> 64kB guest and a 64kB host. This test case has been failing before
> >>>> and even crashing qemu with hw_error("vfio: DMA mapping failed,
> >>>> unable to continue") in VFIO common). Indeed the hot-attached VFIO
> >>>> device would call memory_region_iommu_set_page_size_mask with 64kB
> >>>> mask whereas after the granule was frozen to 4kB on machine init done.
> >>>> Now this works. However the new default will prevent 4kB guest on
> >>>> 64kB host because the granule will be set to 64kB which would be
> >>>> larger than the guest page size. In that situation, the virtio-iommu
> >>>> driver fails on viommu_domain_finalise() with
> >>>> "granule 0x10000 larger than system page size 0x1000".
> >>>>
> >>>> The current limitation of global granule in the virtio-iommu
> >>>> should be removed and turned into per domain granule. But
> >>>> until we get this upgraded, this new default is probably
> >>>> better because I don't think anyone is currently interested in
> >>>> running a 4kB page size guest with virtio-iommu on a 64kB host.
> >>>> However supporting 64kB guest on 64kB host with virtio-iommu and
> >>>> VFIO looks a more important feature.
> >>>>
> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>> What about migration compatibility? In particular, cross-version one?
> >>> Don't we need compat machinery for this?
> >> See below
> >>>> ---
> >>>>
> >>>> v1 -> v2:
> >>>> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
> >>>> ---
> >>>>  hw/virtio/virtio-iommu.c | 7 +++++--
> >>>>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> >>>> index 8a4bd933c6..ec2ba11d1d 100644
> >>>> --- a/hw/virtio/virtio-iommu.c
> >>>> +++ b/hw/virtio/virtio-iommu.c
> >>>> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> >>>>       * in vfio realize
> >>>>       */
> >>>>      s->config.bypass = s->boot_bypass;
> >>>> -    s->config.page_size_mask = qemu_target_page_mask();
> >>>> +    s->config.page_size_mask = qemu_real_host_page_mask();
> >>>>      s->config.input_range.end = UINT64_MAX;
> >>>>      s->config.domain_range.end = UINT32_MAX;
> >>>>      s->config.probe_size = VIOMMU_PROBE_SIZE;
> >>>> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
> >>>>       * still correct.
> >>>>       */
> >>>>      virtio_iommu_switch_address_space_all(s);
> >>>> +    if (version_id <= 2) {
> >>>> +        s->config.page_size_mask = qemu_target_page_mask();
> >> I tested migration from v2 -> v3 and the above code is overriding the
> >> new default by the older one.
> >>
> >> Do you have an other concern?
> >>
> >> Thanks
> >>
> >> Eric
> >
> >
> >>>> +    }
> >>>>      return 0;
> >>>>  }
> >>>>  
> >>>>  static const VMStateDescription vmstate_virtio_iommu_device = {
> >>>>      .name = "virtio-iommu-device",
> >>>>      .minimum_version_id = 2,
> >>>> -    .version_id = 2,
> >>>> +    .version_id = 3,
> >>>>      .post_load = iommu_post_load,
> >>>>      .fields = (const VMStateField[]) {
> >>>>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
> > Oh I missed the version change. But then migration  to older version
> > is completely broken isn't it? Old qemu can not handle version_id 3 at
> > all.
> Indeed, I considered migrating backyard was not that much important. Do
> you consider this is mandated?
> 
> Eric

Generally yes. We only ship downstream, but e.g. any RHEL major version
migrates to the same major version in either direction, as people have
clusters mixing different versions. It's easier to maintain that
guarantee upstream than to break it upstream and try to fix it
downstream.


> >
> > Generally, compat machinery is nicer than the old version hacks.
> >
> >
> >>>> -- 
> >>>> 2.41.0
> >
Eric Auger Feb. 13, 2024, 11:49 a.m. UTC | #8
On 2/13/24 12:39, Michael S. Tsirkin wrote:
> On Tue, Feb 13, 2024 at 12:19:16PM +0100, Eric Auger wrote:
>>
>> On 2/13/24 12:07, Michael S. Tsirkin wrote:
>>> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
>>>> Hi Michael,
>>>>
>>>> On 2/13/24 10:43, Michael S. Tsirkin wrote:
>>>>> On Wed, Jan 17, 2024 at 02:20:39PM +0100, Eric Auger wrote:
>>>>>> We used to set default page_size_mask to qemu_target_page_mask() but
>>>>>> with VFIO assignment it makes more sense to use the actual host page mask
>>>>>> instead.
>>>>>>
>>>>>> So from now on qemu_real_host_page_mask() will be used as a default.
>>>>>> To be able to migrate older code, we increase the vmstat version_id
>>>>>> to 3 and if an older incoming v2 stream is detected we set the previous
>>>>>> default value.
>>>>>>
>>>>>> The new default is well adapted to configs where host and guest have
>>>>>> the same page size. This allows to fix hotplugging VFIO devices on a
>>>>>> 64kB guest and a 64kB host. This test case has been failing before
>>>>>> and even crashing qemu with hw_error("vfio: DMA mapping failed,
>>>>>> unable to continue") in VFIO common). Indeed the hot-attached VFIO
>>>>>> device would call memory_region_iommu_set_page_size_mask with 64kB
>>>>>> mask whereas after the granule was frozen to 4kB on machine init done.
>>>>>> Now this works. However the new default will prevent 4kB guest on
>>>>>> 64kB host because the granule will be set to 64kB which would be
>>>>>> larger than the guest page size. In that situation, the virtio-iommu
>>>>>> driver fails on viommu_domain_finalise() with
>>>>>> "granule 0x10000 larger than system page size 0x1000".
>>>>>>
>>>>>> The current limitation of global granule in the virtio-iommu
>>>>>> should be removed and turned into per domain granule. But
>>>>>> until we get this upgraded, this new default is probably
>>>>>> better because I don't think anyone is currently interested in
>>>>>> running a 4kB page size guest with virtio-iommu on a 64kB host.
>>>>>> However supporting 64kB guest on 64kB host with virtio-iommu and
>>>>>> VFIO looks a more important feature.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>>>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>>> What about migration compatibility? In particular, cross-version one?
>>>>> Don't we need compat machinery for this?
>>>> See below
>>>>>> ---
>>>>>>
>>>>>> v1 -> v2:
>>>>>> - fixed 2 typos in the commit msg and added Jean's R-b and T-b
>>>>>> ---
>>>>>>  hw/virtio/virtio-iommu.c | 7 +++++--
>>>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>>>> index 8a4bd933c6..ec2ba11d1d 100644
>>>>>> --- a/hw/virtio/virtio-iommu.c
>>>>>> +++ b/hw/virtio/virtio-iommu.c
>>>>>> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>>>>>       * in vfio realize
>>>>>>       */
>>>>>>      s->config.bypass = s->boot_bypass;
>>>>>> -    s->config.page_size_mask = qemu_target_page_mask();
>>>>>> +    s->config.page_size_mask = qemu_real_host_page_mask();
>>>>>>      s->config.input_range.end = UINT64_MAX;
>>>>>>      s->config.domain_range.end = UINT32_MAX;
>>>>>>      s->config.probe_size = VIOMMU_PROBE_SIZE;
>>>>>> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>>>>>>       * still correct.
>>>>>>       */
>>>>>>      virtio_iommu_switch_address_space_all(s);
>>>>>> +    if (version_id <= 2) {
>>>>>> +        s->config.page_size_mask = qemu_target_page_mask();
>>>> I tested migration from v2 -> v3 and the above code is overriding the
>>>> new default by the older one.
>>>>
>>>> Do you have an other concern?
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>
>>>>>> +    }
>>>>>>      return 0;
>>>>>>  }
>>>>>>  
>>>>>>  static const VMStateDescription vmstate_virtio_iommu_device = {
>>>>>>      .name = "virtio-iommu-device",
>>>>>>      .minimum_version_id = 2,
>>>>>> -    .version_id = 2,
>>>>>> +    .version_id = 3,
>>>>>>      .post_load = iommu_post_load,
>>>>>>      .fields = (const VMStateField[]) {
>>>>>>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
>>> Oh I missed the version change. But then migration  to older version
>>> is completely broken isn't it? Old qemu can not handle version_id 3 at
>>> all.
>> Indeed, I considered migrating backyard was not that much important. Do
>> you consider this is mandated?
>>
>> Eric
> Generally yes. We only ship downstream but e.g. any RHEL major version
> migrates to same major version in any direction as people have clusters
> mixing different versions. It's easier to maintain that guarantee
> upstream than break it upstream and try to fix it downstream.

OK then I will revisit the implementation

thanks

Eric
>
>
>>> Generally, compat machinery is nicer than the old version hacks.
>>>
>>>
>>>>>> -- 
>>>>>> 2.41.0
Michael S. Tsirkin Feb. 13, 2024, noon UTC | #9
On Tue, Feb 13, 2024 at 12:24:22PM +0100, Eric Auger wrote:
> Hi Michael,
> On 2/13/24 12:09, Michael S. Tsirkin wrote:
> > On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
> >> Do you have an other concern?
> > I also worry a bit about migrating between hosts with different
> > page sizes. Not with kvm I am guessing but with tcg it does work I think?
> I have never tried but is it a valid use case? Adding Peter in CC.
> > Is this just for vfio and vdpa? Can we limit this to these setups
> > maybe?
> I am afraid we know the actual use case too later. If the VFIO device is
> hotplugged we have started working with 4kB granule.
> 
> The other way is to introduce a min_granule option as done for aw-bits.
> But it is heavier.
> 
> Thanks
> 
> Eric
> >

Let's say, if you are changing the default, then we definitely want
a way to get the compatible behaviour for tcg.
So the compat machinery should be user-accessible too and documented.
Eric Auger Feb. 21, 2024, 10:41 a.m. UTC | #10
Hi,

On 2/13/24 13:00, Michael S. Tsirkin wrote:
> On Tue, Feb 13, 2024 at 12:24:22PM +0100, Eric Auger wrote:
>> Hi Michael,
>> On 2/13/24 12:09, Michael S. Tsirkin wrote:
>>> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
>>>> Do you have an other concern?
>>> I also worry a bit about migrating between hosts with different
>>> page sizes. Not with kvm I am guessing but with tcg it does work I think?
>> I have never tried but is it a valid use case? Adding Peter in CC.
>>> Is this just for vfio and vdpa? Can we limit this to these setups
>>> maybe?
>> I am afraid we know the actual use case too later. If the VFIO device is
>> hotplugged we have started working with 4kB granule.
>>
>> The other way is to introduce a min_granule option as done for aw-bits.
>> But it is heavier.
>>
>> Thanks
>>
>> Eric
> Let's say, if you are changing the default then we definitely want
> a way to get the cmpatible behaviour for tcg.
> So the compat machinery should be user-accessible too and documented.

I guess I need to add a new option to guarantee machine compat.

I was thinking about an enum GranuleMode property taking the following
values: 4KB, 64KB, host.
Jean, do you think there is a rationale for offering something richer?

Obviously being able to set the exact page_size_mask plus a host mode
would be better, but that does not really fit into any standard property
type.

Thanks

Eric
>
Jean-Philippe Brucker Feb. 21, 2024, 11:31 a.m. UTC | #11
On Wed, Feb 21, 2024 at 11:41:57AM +0100, Eric Auger wrote:
> Hi,
> 
> On 2/13/24 13:00, Michael S. Tsirkin wrote:
> > On Tue, Feb 13, 2024 at 12:24:22PM +0100, Eric Auger wrote:
> >> Hi Michael,
> >> On 2/13/24 12:09, Michael S. Tsirkin wrote:
> >>> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
> >>>> Do you have an other concern?
> >>> I also worry a bit about migrating between hosts with different
> >>> page sizes. Not with kvm I am guessing but with tcg it does work I think?
> >> I have never tried but is it a valid use case? Adding Peter in CC.
> >>> Is this just for vfio and vdpa? Can we limit this to these setups
> >>> maybe?
> >> I am afraid we would know the actual use case too late. If the VFIO device is
> >> hotplugged, we have already started working with a 4kB granule.
> >>
> >> The other way is to introduce a min_granule option as done for aw-bits.
> >> But it is heavier.
> >>
> >> Thanks
> >>
> >> Eric
> > Let's say, if you are changing the default then we definitely want
> > a way to get the compatible behaviour for TCG.
> > So the compat machinery should be user-accessible too and documented.
> 
> I guess I need to add a new option to guarantee the machine compat.
> 
> I was thinking about an enum GranuleMode property taking the following
> values: 4KB, 64KB, host.
> Jean, do you think there is a rationale for offering something richer?

16KB seems to be gaining popularity, so we should include that (I think
it's the only granule supported by the Apple IOMMU?). Hopefully that
will be enough.

Thanks,
Jean

> 
> Obviously being able to set the exact page_size_mask + host mode would
> be better, but this does not really fit into any standard property type.
> 
> Thanks
> 
> Eric
> >
>
Eric Auger Feb. 21, 2024, 12:45 p.m. UTC | #12
On 2/21/24 12:31, Jean-Philippe Brucker wrote:
> On Wed, Feb 21, 2024 at 11:41:57AM +0100, Eric Auger wrote:
>> Hi,
>>
>> On 2/13/24 13:00, Michael S. Tsirkin wrote:
>>> On Tue, Feb 13, 2024 at 12:24:22PM +0100, Eric Auger wrote:
>>>> Hi Michael,
>>>> On 2/13/24 12:09, Michael S. Tsirkin wrote:
>>>>> On Tue, Feb 13, 2024 at 11:32:13AM +0100, Eric Auger wrote:
>>>>>> Do you have any other concern?
>>>>> I also worry a bit about migrating between hosts with different
>>>>> page sizes. Not with KVM, I am guessing, but with TCG it does work, I think?
>>>> I have never tried but is it a valid use case? Adding Peter in CC.
>>>>> Is this just for vfio and vdpa? Can we limit this to these setups
>>>>> maybe?
>>>> I am afraid we would know the actual use case too late. If the VFIO device is
>>>> hotplugged, we have already started working with a 4kB granule.
>>>>
>>>> The other way is to introduce a min_granule option as done for aw-bits.
>>>> But it is heavier.
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>> Let's say, if you are changing the default then we definitely want
>>> a way to get the compatible behaviour for TCG.
>>> So the compat machinery should be user-accessible too and documented.
>> I guess I need to add a new option to guarantee the machine compat.
>>
>> I was thinking about an enum GranuleMode property taking the following
>> values: 4KB, 64KB, host.
>> Jean, do you think there is a rationale for offering something richer?
> 16KB seems to be gaining popularity, so we should include that (I think
> it's the only granule supported by the Apple IOMMU?). Hopefully that
> will be enough.
Thank you for your prompt reply. I do agree.

Nevertheless I am going to prototype passing the whole page size mask, I
think. That could be useful for testing some weird configs.

Thanks

Eric
>
> Thanks,
> Jean
>
>> Obviously being able to set the exact page_size_mask + host mode would
>> be better, but this does not really fit into any standard property type.
>>
>> Thanks
>>
>> Eric

Patch

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 8a4bd933c6..ec2ba11d1d 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1313,7 +1313,7 @@  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
      * in vfio realize
      */
     s->config.bypass = s->boot_bypass;
-    s->config.page_size_mask = qemu_target_page_mask();
+    s->config.page_size_mask = qemu_real_host_page_mask();
     s->config.input_range.end = UINT64_MAX;
     s->config.domain_range.end = UINT32_MAX;
     s->config.probe_size = VIOMMU_PROBE_SIZE;
@@ -1491,13 +1491,16 @@  static int iommu_post_load(void *opaque, int version_id)
      * still correct.
      */
     virtio_iommu_switch_address_space_all(s);
+    if (version_id <= 2) {
+        s->config.page_size_mask = qemu_target_page_mask();
+    }
     return 0;
 }
 
 static const VMStateDescription vmstate_virtio_iommu_device = {
     .name = "virtio-iommu-device",
     .minimum_version_id = 2,
-    .version_id = 2,
+    .version_id = 3,
     .post_load = iommu_post_load,
     .fields = (const VMStateField[]) {
         VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,