virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask

Message ID 20231221134505.100916-1-eric.auger@redhat.com
State New
Series virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask

Commit Message

Eric Auger Dec. 21, 2023, 1:45 p.m. UTC
We used to set the default page_size_mask to qemu_target_page_mask(),
but with VFIO assignment it makes more sense to use the actual host
page mask instead.

So from now on qemu_real_host_page_mask() is used as the default.
To stay migration compatible with older QEMU, we bump the vmstate
version_id to 3, and if an older incoming v2 stream is detected we
restore the previous default value.

The new default is well suited to configs where host and guest have
the same page size. In particular it fixes hotplugging VFIO devices
with a 64kB guest on a 64kB host. That test case used to fail and
could even crash QEMU with hw_error("vfio: DMA mapping failed,
unable to continue") in VFIO common code: the hot-attached VFIO
device would call memory_region_iommu_set_page_size_mask() with a
64kB mask even though the granule had already been frozen to 4kB at
machine init done time. Now this works. However, the new default
prevents running a 4kB guest on a 64kB host, because the granule is
set to 64kB, larger than the guest page size. In that situation the
virtio-iommu driver fails viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".

The current global-granule limitation in virtio-iommu should
eventually be lifted in favour of a per-domain granule. Until that
is implemented, this new default is probably the better choice: I
don't think anyone is currently interested in running a 4kB page
size guest with virtio-iommu on a 64kB host, whereas supporting a
64kB guest on a 64kB host with virtio-iommu and VFIO looks like the
more important use case.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/virtio-iommu.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
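For context, page_size_mask is a bitmap of supported mapping sizes
whose lowest set bit determines the granule, and both defaults at
stake here are plain page masks of the form ~(page_size - 1). Below
is a minimal user-space sketch of the two defaults; the helper names
are illustrative stand-ins, not QEMU's actual implementations.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Illustrative stand-ins for QEMU's helpers, not the real code. */

    /* Old default: mask derived from the target page size. On arm64
     * the target page size is 4kB, the smallest granule the
     * architecture supports. */
    static uint64_t target_page_mask_sketch(void)
    {
        return ~((uint64_t)0x1000 - 1);   /* 0xfffffffffffff000 */
    }

    /* New default: mask derived from the page size the host really
     * runs with. */
    static uint64_t real_host_page_mask_sketch(void)
    {
        uint64_t size = (uint64_t)sysconf(_SC_PAGESIZE);
        return ~(size - 1);               /* 0xffffffffffff0000 on 64kB */
    }

    int main(void)
    {
        printf("target page mask: 0x%016" PRIx64 "\n",
               target_page_mask_sketch());
        printf("host page mask:   0x%016" PRIx64 "\n",
               real_host_page_mask_sketch());
        return 0;
    }

On a 64kB host the second mask freezes the granule to 64kB, which
matches what VFIO can actually map.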

Comments

Eric Auger Jan. 15, 2024, 11:04 a.m. UTC | #1
Hi,

On 12/21/23 14:45, Eric Auger wrote:
> We used to set default page_size_mask to qemu_target_page_mask() but
> with VFIO assignment it makes more sense to use the actual host page mask
> instead.
>
> So from now on qemu_real_host_page_mask() will be used as a default.
> To be able to migrate older code, we increase the vmstat version_id
> to 3 and if an older incoming v2 stream is detected we set the previous
> default value.
>
> The new default is well adapted to configs where host and guest have
> the same page size. This allows to fix hotplugging VFIO devices on a
> 64kB guest and a 64kB host. This test case has been failing before
> and even crashing qemu with hw_error("vfio: DMA mapping failed,
> unable to continue") in VFIO common). Indeed the hot-attached VFIO
> device would call memory_region_iommu_set_page_size_mask with 64kB
> mask whereas after the granule was frozen to 4kB on machine init done.
> Now this works. However the new default will prevent 4kB guest on
> 64kB host because the granule will be set to 64kB which would be
> larger than the guest page size. In that situation, the virtio-iommu
> driver fails the viommu_domain_finalise() with
> "granule 0x10000 larger than system page zie 0x1000".
>
> The current limitation of global granule in the virtio-iommu
> should be removed and turned into per domain granule. But
> until we get this upgraded, this new default is probably
> better because I don't think anyone is currently interested in
> running a 4kB page size guest with virtio-iommu on a 64kB host.
> However supporting 64kB guest on 64kB host with virtio-iommu and
> VFIO looks a more important feature.

Gentle ping

Thanks

Eric
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/virtio/virtio-iommu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 9d463efc52..b77e3644ea 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>       * in vfio realize
>       */
>      s->config.bypass = s->boot_bypass;
> -    s->config.page_size_mask = qemu_target_page_mask();
> +    s->config.page_size_mask = qemu_real_host_page_mask();
>      s->config.input_range.end = UINT64_MAX;
>      s->config.domain_range.end = UINT32_MAX;
>      s->config.probe_size = VIOMMU_PROBE_SIZE;
> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>       * still correct.
>       */
>      virtio_iommu_switch_address_space_all(s);
> +    if (version_id <= 2) {
> +        s->config.page_size_mask = qemu_target_page_mask();
> +    }
>      return 0;
>  }
>  
>  static const VMStateDescription vmstate_virtio_iommu_device = {
>      .name = "virtio-iommu-device",
>      .minimum_version_id = 2,
> -    .version_id = 2,
> +    .version_id = 3,
>      .post_load = iommu_post_load,
>      .fields = (VMStateField[]) {
>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
Jean-Philippe Brucker Jan. 16, 2024, 12:53 p.m. UTC | #2
On Thu, Dec 21, 2023 at 08:45:05AM -0500, Eric Auger wrote:
> We used to set default page_size_mask to qemu_target_page_mask() but
> with VFIO assignment it makes more sense to use the actual host page mask
> instead.
> 
> So from now on qemu_real_host_page_mask() will be used as a default.
> To be able to migrate older code, we increase the vmstat version_id
> to 3 and if an older incoming v2 stream is detected we set the previous
> default value.
> 
> The new default is well adapted to configs where host and guest have
> the same page size. This allows to fix hotplugging VFIO devices on a
> 64kB guest and a 64kB host. This test case has been failing before
> and even crashing qemu with hw_error("vfio: DMA mapping failed,
> unable to continue") in VFIO common). Indeed the hot-attached VFIO
> device would call memory_region_iommu_set_page_size_mask with 64kB
> mask whereas after the granule was frozen to 4kB on machine init done.

I guess TARGET_PAGE_MASK is always 4kB on arm64 CPUs, since it's the
smallest supported and the guest configures its page size at runtime.
Even if QEMU's software IOMMU can deal with any page size, VFIO can't, so
passing the host page size seems more accurate than forcing a value of
4kB.

> Now this works. However the new default will prevent 4kB guest on
> 64kB host because the granule will be set to 64kB which would be
> larger than the guest page size. In that situation, the virtio-iommu
> driver fails the viommu_domain_finalise() with
> "granule 0x10000 larger than system page zie 0x1000".

"size"
(it could matter if someone searches for this message later)

> 
> The current limitation of global granule in the virtio-iommu
> should be removed and turned into per domain granule. But
> until we get this upgraded, this new default is probably
> better because I don't think anyone is currently interested in
> running a 4kB page size guest with virtio-iommu on a 64kB host.
> However supporting 64kB guest on 64kB host with virtio-iommu and
> VFIO looks a more important feature.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

So to summarize the configurations that work for hotplug (tested with QEMU
system emulation with SMMU + QEMU VMM with virtio-iommu):

 Host | Guest | virtio-net | IGB passthrough
  4k  | 4k    | Y          | Y
  64k | 64k   | Y          | N -> Y (fixed by this patch)
  64k | 4k    | Y -> N     | N
  4k  | 64k   | Y          | Y

The change is a reasonable trade-off in my opinion. It fixes the more common
64k on 64k case, and for 4k on 64k, the error is now contained to the
guest and made clear ("granule 0x10000 larger than system page size
0x1000") instead of crashing the VMM. A guest OS now discovers that the
host needs DMA buffers aligned on 64k and could actually support this case
(but Linux won't because it can't control the origin of all DMA buffers).
Later, support for page tables will enable 4k on 64k for all devices.

Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
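
For reference, the guest-side failure quoted above comes from the
driver comparing the device's granule (the lowest set bit of
page_size_mask) against its own page size. Here is a simplified
user-space sketch of that check, paraphrasing the logic of Linux's
viommu_domain_finalise() rather than reproducing the exact kernel
code:

    #include <stdint.h>
    #include <stdio.h>

    /* Granule check, paraphrased from the guest driver logic. */
    static int check_granule(uint64_t page_size_mask,
                             uint64_t guest_page_size)
    {
        /* Lowest set bit of the mask = smallest supported page size. */
        uint64_t granule = page_size_mask & -page_size_mask;

        if (granule > guest_page_size) {
            fprintf(stderr,
                    "granule 0x%llx larger than system page size 0x%llx\n",
                    (unsigned long long)granule,
                    (unsigned long long)guest_page_size);
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        /* 64kB granule vs 4kB guest pages: the failing combination. */
        return check_granule(~((uint64_t)0x10000 - 1), 0x1000) ? 1 : 0;
    }

With a 64kB mask and 4kB guest pages this prints exactly the error
from the commit message, which is why the failure is now contained
to the guest instead of crashing the VMM.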

> ---
>  hw/virtio/virtio-iommu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 9d463efc52..b77e3644ea 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>       * in vfio realize
>       */
>      s->config.bypass = s->boot_bypass;
> -    s->config.page_size_mask = qemu_target_page_mask();
> +    s->config.page_size_mask = qemu_real_host_page_mask();
>      s->config.input_range.end = UINT64_MAX;
>      s->config.domain_range.end = UINT32_MAX;
>      s->config.probe_size = VIOMMU_PROBE_SIZE;
> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>       * still correct.
>       */
>      virtio_iommu_switch_address_space_all(s);
> +    if (version_id <= 2) {
> +        s->config.page_size_mask = qemu_target_page_mask();
> +    }
>      return 0;
>  }
>  
>  static const VMStateDescription vmstate_virtio_iommu_device = {
>      .name = "virtio-iommu-device",
>      .minimum_version_id = 2,
> -    .version_id = 2,
> +    .version_id = 3,
>      .post_load = iommu_post_load,
>      .fields = (VMStateField[]) {
>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
> -- 
> 2.27.0
>
Eric Auger Jan. 16, 2024, 2:44 p.m. UTC | #3
Hi Jean,

On 1/16/24 13:53, Jean-Philippe Brucker wrote:
> On Thu, Dec 21, 2023 at 08:45:05AM -0500, Eric Auger wrote:
>> We used to set default page_size_mask to qemu_target_page_mask() but
>> with VFIO assignment it makes more sense to use the actual host page mask
>> instead.
>>
>> So from now on qemu_real_host_page_mask() will be used as a default.
>> To be able to migrate older code, we increase the vmstat version_id
>> to 3 and if an older incoming v2 stream is detected we set the previous
>> default value.
>>
>> The new default is well adapted to configs where host and guest have
>> the same page size. This allows to fix hotplugging VFIO devices on a
>> 64kB guest and a 64kB host. This test case has been failing before
>> and even crashing qemu with hw_error("vfio: DMA mapping failed,
>> unable to continue") in VFIO common). Indeed the hot-attached VFIO
>> device would call memory_region_iommu_set_page_size_mask with 64kB
>> mask whereas after the granule was frozen to 4kB on machine init done.
> I guess TARGET_PAGE_MASK is always 4kB on arm64 CPUs, since it's the
> smallest supported and the guest configures its page size at runtime.
correct
> Even if QEMU's software IOMMU can deal with any page size, VFIO can't so
> passing the host page size seems more accurate than forcing a value of
> 4kB.
that's my guess too
>
>> Now this works. However the new default will prevent 4kB guest on
>> 64kB host because the granule will be set to 64kB which would be
>> larger than the guest page size. In that situation, the virtio-iommu
>> driver fails the viommu_domain_finalise() with
>> "granule 0x10000 larger than system page zie 0x1000".
> "size"
> (it could matter if someone searches for this message later)
sure I will correct it
>
>> The current limitation of global granule in the virtio-iommu
>> should be removed and turned into per domain granule. But
>> until we get this upgraded, this new default is probably
>> better because I don't think anyone is currently interested in
>> running a 4kB page size guest with virtio-iommu on a 64kB host.
>> However supporting 64kB guest on 64kB host with virtio-iommu and
>> VFIO looks a more important feature.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> So to summarize the configurations that work for hotplug (tested with QEMU
> system emulation with SMMU + QEMU VMM with virtio-iommu):
>
>  Host | Guest | virtio-net | IGB passthrough
>   4k  | 4k    | Y          | Y
>   64k | 64k   | Y          | N -> Y (fixed by this patch)
>   64k | 4k    | Y -> N     | N
>   4k  | 64k   | Y          | Y
That table is correct
>
> The change is a reasonable trade-off in my opinion. It fixes the more common
> 64k on 64k case, and for 4k on 64k, the error is now contained to the
> guest and made clear ("granule 0x10000 larger than system page size
> 0x1000") instead of crashing the VMM. A guest OS now discovers that the
> host needs DMA buffers aligned on 64k and could actually support this case
> (but Linux won't because it can't control the origin of all DMA buffers).
> Later, support for page tables will enable 4k on 64k for all devices.
>
> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Many thanks!

Eric
>
>> ---
>>  hw/virtio/virtio-iommu.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 9d463efc52..b77e3644ea 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -1313,7 +1313,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>       * in vfio realize
>>       */
>>      s->config.bypass = s->boot_bypass;
>> -    s->config.page_size_mask = qemu_target_page_mask();
>> +    s->config.page_size_mask = qemu_real_host_page_mask();
>>      s->config.input_range.end = UINT64_MAX;
>>      s->config.domain_range.end = UINT32_MAX;
>>      s->config.probe_size = VIOMMU_PROBE_SIZE;
>> @@ -1491,13 +1491,16 @@ static int iommu_post_load(void *opaque, int version_id)
>>       * still correct.
>>       */
>>      virtio_iommu_switch_address_space_all(s);
>> +    if (version_id <= 2) {
>> +        s->config.page_size_mask = qemu_target_page_mask();
>> +    }
>>      return 0;
>>  }
>>  
>>  static const VMStateDescription vmstate_virtio_iommu_device = {
>>      .name = "virtio-iommu-device",
>>      .minimum_version_id = 2,
>> -    .version_id = 2,
>> +    .version_id = 3,
>>      .post_load = iommu_post_load,
>>      .fields = (VMStateField[]) {
>>          VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
>> -- 
>> 2.27.0
>>

Patch

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 9d463efc52..b77e3644ea 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1313,7 +1313,7 @@  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
      * in vfio realize
      */
     s->config.bypass = s->boot_bypass;
-    s->config.page_size_mask = qemu_target_page_mask();
+    s->config.page_size_mask = qemu_real_host_page_mask();
     s->config.input_range.end = UINT64_MAX;
     s->config.domain_range.end = UINT32_MAX;
     s->config.probe_size = VIOMMU_PROBE_SIZE;
@@ -1491,13 +1491,16 @@  static int iommu_post_load(void *opaque, int version_id)
      * still correct.
      */
     virtio_iommu_switch_address_space_all(s);
+    if (version_id <= 2) {
+        s->config.page_size_mask = qemu_target_page_mask();
+    }
     return 0;
 }
 
 static const VMStateDescription vmstate_virtio_iommu_device = {
     .name = "virtio-iommu-device",
     .minimum_version_id = 2,
-    .version_id = 2,
+    .version_id = 3,
     .post_load = iommu_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_GTREE_DIRECT_KEY_V(domains, VirtIOIOMMU, 2,
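
As a usage note on the compatibility handling above: QEMU passes the
version_id found in the incoming stream to post_load, so a stream
produced by an older QEMU arrives with version_id 2 and the old
default is restored. Below is a simplified sketch of that flow;
illustrative only, not QEMU's actual vmstate code.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t page_size_mask;
    } IOMMUConfigSketch;

    enum { VERSION_MIN = 2, VERSION_CUR = 3 };

    static int post_load_sketch(IOMMUConfigSketch *cfg, int version_id)
    {
        /* v2 senders used the old target-page-mask default (4kB). */
        if (version_id <= 2) {
            cfg->page_size_mask = ~((uint64_t)0x1000 - 1);
        }
        return 0;
    }

    static int load_state_sketch(IOMMUConfigSketch *cfg, int stream_version)
    {
        /* Reject streams we cannot parse. */
        if (stream_version > VERSION_CUR || stream_version < VERSION_MIN) {
            return -1;
        }
        /* ... field loading elided ... */
        return post_load_sketch(cfg, stream_version);
    }

    int main(void)
    {
        IOMMUConfigSketch cfg = {
            .page_size_mask = ~((uint64_t)0x10000 - 1),
        };

        load_state_sketch(&cfg, 2);   /* incoming v2 stream */
        printf("restored mask: 0x%llx\n",
               (unsigned long long)cfg.page_size_mask);
        return 0;
    }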