diff mbox series

hw/intc/arm_gic_kvm: Don't assume kernel can provide a GICv2

Message ID 20200225182435.1131-1-peter.maydell@linaro.org
State New
Headers show
Series hw/intc/arm_gic_kvm: Don't assume kernel can provide a GICv2 | expand

Commit Message

Peter Maydell Feb. 25, 2020, 6:24 p.m. UTC
In our KVM GICv2 realize function, we try to cope with old kernels
that don't provide the device control API (KVM_CAP_DEVICE_CTRL): we
try to use the device control, and if that fails we fall back to
assuming that the kernel has the old style KVM_CREATE_IRQCHIP and
that it will provide a GICv2.

This doesn't cater for the possibility of a kernel and hardware which
only provide a GICv3, which is very common now.  On that setup we
will abort() later on in kvm_arm_pmu_set_irq() when we try to wire up
an interrupt to the GIC we failed to create:

qemu-system-aarch64: PMU: KVM_SET_DEVICE_ATTR: Invalid argument
qemu-system-aarch64: failed to set irq for PMU
Aborted

If the kernel advertises KVM_CAP_DEVICE_CTRL we should trust it if it
says it can't create a GICv2, rather than assuming it has one.  We
can then produce a more helpful error message including a hint about
the most probable reason for the failure.

If the kernel doesn't advertise KVM_CAP_DEVICE_CTRL then it is truly
ancient by this point but we might as well still fall back to a
KVM_CREATE_IRQCHIP GICv2.

With this patch then the user misconfiguration which previously
caused an abort now prints:
qemu-system-aarch64: Initialization of device kvm-arm-gic failed: error creating in-kernel VGIC: No such device
Perhaps the host CPU does not support GICv2?

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I spent a while wondering if the PMU code was broken before Marc
put me on the right track about what was going wrong (ie that
I hadn't put "-machine gic-version=host" on the commandline).

 hw/intc/arm_gic_kvm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Philippe Mathieu-Daudé Feb. 25, 2020, 6:42 p.m. UTC | #1
On 2/25/20 7:24 PM, Peter Maydell wrote:
> In our KVM GICv2 realize function, we try to cope with old kernels
> that don't provide the device control API (KVM_CAP_DEVICE_CTRL): we
> try to use the device control, and if that fails we fall back to
> assuming that the kernel has the old style KVM_CREATE_IRQCHIP and
> that it will provide a GICv2.
> 
> This doesn't cater for the possibility of a kernel and hardware which
> only provide a GICv3, which is very common now.  On that setup we
> will abort() later on in kvm_arm_pmu_set_irq() when we try to wire up
> an interrupt to the GIC we failed to create:
> 
> qemu-system-aarch64: PMU: KVM_SET_DEVICE_ATTR: Invalid argument
> qemu-system-aarch64: failed to set irq for PMU
> Aborted
> 
> If the kernel advertises KVM_CAP_DEVICE_CTRL we should trust it if it
> says it can't create a GICv2, rather than assuming it has one.  We
> can then produce a more helpful error message including a hint about
> the most probable reason for the failure.
> 
> If the kernel doesn't advertise KVM_CAP_DEVICE_CTRL then it is truly
> ancient by this point but we might as well still fall back to a
> KVM_CREATE_IRQCHIP GICv2.
> 
> With this patch then the user misconfiguration which previously
> caused an abort now prints:
> qemu-system-aarch64: Initialization of device kvm-arm-gic failed: error creating in-kernel VGIC: No such device
> Perhaps the host CPU does not support GICv2?
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I spent a while wondering if the PMU code was broken before Marc
> put me on the right track about what was going wrong (ie that
> I hadn't put "-machine gic-version=host" on the commandline).
> 
>   hw/intc/arm_gic_kvm.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
> 
> diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
> index 9deb15e7e69..d7df423a7a3 100644
> --- a/hw/intc/arm_gic_kvm.c
> +++ b/hw/intc/arm_gic_kvm.c
> @@ -551,7 +551,16 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
>                                 KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true,
>                                 &error_abort);
>           }
> +    } else if (kvm_check_extension(kvm_state, KVM_CAP_DEVICE_CTRL)) {
> +        error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
> +        error_append_hint(errp,
> +                          "Perhaps the host CPU does not support GICv2?\n");
>       } else if (ret != -ENODEV && ret != -ENOTSUP) {
> +        /*
> +         * Very ancient kernel without KVM_CAP_DEVICE_CTRL: assume that
> +         * ENODEV or ENOTSUP mean "can't create GICv2 with KVM_CREATE_DEVICE",
> +         * and that we will get a GICv2 via KVM_CREATE_IRQCHIP.
> +         */
>           error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
>           return;
>       }
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Andrew Jones Feb. 26, 2020, 8:52 a.m. UTC | #2
On Tue, Feb 25, 2020 at 06:24:35PM +0000, Peter Maydell wrote:
> In our KVM GICv2 realize function, we try to cope with old kernels
> that don't provide the device control API (KVM_CAP_DEVICE_CTRL): we
> try to use the device control, and if that fails we fall back to
> assuming that the kernel has the old style KVM_CREATE_IRQCHIP and
> that it will provide a GICv2.
> 
> This doesn't cater for the possibility of a kernel and hardware which
> only provide a GICv3, which is very common now.  On that setup we
> will abort() later on in kvm_arm_pmu_set_irq() when we try to wire up
> an interrupt to the GIC we failed to create:
> 
> qemu-system-aarch64: PMU: KVM_SET_DEVICE_ATTR: Invalid argument
> qemu-system-aarch64: failed to set irq for PMU
> Aborted
> 
> If the kernel advertises KVM_CAP_DEVICE_CTRL we should trust it if it
> says it can't create a GICv2, rather than assuming it has one.  We
> can then produce a more helpful error message including a hint about
> the most probable reason for the failure.
> 
> If the kernel doesn't advertise KVM_CAP_DEVICE_CTRL then it is truly
> ancient by this point but we might as well still fall back to a
> KVM_CREATE_IRQCHIP GICv2.
> 
> With this patch then the user misconfiguration which previously
> caused an abort now prints:
> qemu-system-aarch64: Initialization of device kvm-arm-gic failed: error creating in-kernel VGIC: No such device
> Perhaps the host CPU does not support GICv2?
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I spent a while wondering if the PMU code was broken before Marc
> put me on the right track about what was going wrong (ie that
> I hadn't put "-machine gic-version=host" on the commandline).
> 
>  hw/intc/arm_gic_kvm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
> index 9deb15e7e69..d7df423a7a3 100644
> --- a/hw/intc/arm_gic_kvm.c
> +++ b/hw/intc/arm_gic_kvm.c
> @@ -551,7 +551,16 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
>                                KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true,
>                                &error_abort);
>          }
> +    } else if (kvm_check_extension(kvm_state, KVM_CAP_DEVICE_CTRL)) {
> +        error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
> +        error_append_hint(errp,
> +                          "Perhaps the host CPU does not support GICv2?\n");
>      } else if (ret != -ENODEV && ret != -ENOTSUP) {
> +        /*
> +         * Very ancient kernel without KVM_CAP_DEVICE_CTRL: assume that
> +         * ENODEV or ENOTSUP mean "can't create GICv2 with KVM_CREATE_DEVICE",
> +         * and that we will get a GICv2 via KVM_CREATE_IRQCHIP.
> +         */
>          error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
>          return;
>      }
> -- 
> 2.20.1
> 
>

This is nice, as some QEMU command line users may now be able to more
easily correct their command lines. So,

Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>

Although, many QEMU command line users still won't know what to do
without an explicit "Try -machine gic-version=host" hint, so that
might be nice to add too.

Also, unless our command line compatibility policy is so strict that
we can't change failing command lines (which this patch does to some
degree as it fails with a new message now), then we might as well
just probe for a working GIC version when using KVM, as old command
lines that implicitly or explicitly selected '2' never worked with
KVM on gicv3-only hosts anyway. We just have to try '2' first in
order to continue providing a gicv2 to a guests on hosts that support
both, but if that fails, then we can try '3', and if that works, then
users don't have to try and correct their command lines at all.

Thanks,
drew
Peter Maydell Feb. 26, 2020, 8:56 a.m. UTC | #3
On Wed, 26 Feb 2020 at 08:52, Andrew Jones <drjones@redhat.com> wrote:
> Although, many QEMU command line users still won't know what to do
> without an explicit "Try -machine gic-version=host" hint, so that
> might be nice to add too.

In the GIC code we don't know if the machine even has a
gic-version property, so we're not in the right place to try to
produce that message.

thanks
-- PMM
Andrew Jones Feb. 26, 2020, 9:17 a.m. UTC | #4
On Wed, Feb 26, 2020 at 08:56:03AM +0000, Peter Maydell wrote:
> On Wed, 26 Feb 2020 at 08:52, Andrew Jones <drjones@redhat.com> wrote:
> > Although, many QEMU command line users still won't know what to do
> > without an explicit "Try -machine gic-version=host" hint, so that
> > might be nice to add too.
> 
> In the GIC code we don't know if the machine even has a
> gic-version property, so we're not in the right place to try to
> produce that message.
>

Ah yes, we use qdev_init_nofail() in virt::create_gic(), so there's
no chance to append another hint at the machine level.

And what about when machine.gic-version is not provided and KVM is
in use? Shouldn't we try version '2', as we do now, but then also
'3', if '2' fails, before erroring out?

Thanks,
drew
Eric Auger Feb. 26, 2020, 11:58 a.m. UTC | #5
Hi Peter,

On 2/26/20 10:17 AM, Andrew Jones wrote:
> On Wed, Feb 26, 2020 at 08:56:03AM +0000, Peter Maydell wrote:
>> On Wed, 26 Feb 2020 at 08:52, Andrew Jones <drjones@redhat.com> wrote:
>>> Although, many QEMU command line users still won't know what to do
>>> without an explicit "Try -machine gic-version=host" hint, so that
>>> might be nice to add too.
>>
>> In the GIC code we don't know if the machine even has a
>> gic-version property, so we're not in the right place to try to
>> produce that message.
>>
> 
> Ah yes, we use qdev_init_nofail() in virt::create_gic(), so there's
> no chance to append another hint at the machine level.
> 
> And what about when machine.gic-version is not provided and KVM is
> in use? Shouldn't we try version '2', as we do now, but then also
> '3', if '2' fails, before erroring out?

In case of KVM accelerated mode we could effectively probe v2 first and
if not supported choose v3, as mentioned by Drew.

Couldn't kvm_arm_vgic_probe() return a bitmap by calling
kvm_create_device on both versions in dryrun mode?

Thanks

Eric

> 
> Thanks,
> drew
> 
>
Eric Auger Feb. 26, 2020, 5:09 p.m. UTC | #6
Hi Peter,

On 2/26/20 12:58 PM, Auger Eric wrote:
> Hi Peter,
> 
> On 2/26/20 10:17 AM, Andrew Jones wrote:
>> On Wed, Feb 26, 2020 at 08:56:03AM +0000, Peter Maydell wrote:
>>> On Wed, 26 Feb 2020 at 08:52, Andrew Jones <drjones@redhat.com> wrote:
>>>> Although, many QEMU command line users still won't know what to do
>>>> without an explicit "Try -machine gic-version=host" hint, so that
>>>> might be nice to add too.
>>>
>>> In the GIC code we don't know if the machine even has a
>>> gic-version property, so we're not in the right place to try to
>>> produce that message.
>>>
>>
>> Ah yes, we use qdev_init_nofail() in virt::create_gic(), so there's
>> no chance to append another hint at the machine level.
>>
>> And what about when machine.gic-version is not provided and KVM is
>> in use? Shouldn't we try version '2', as we do now, but then also
>> '3', if '2' fails, before erroring out?
> 
> In case of KVM accelerated mode we could effectively probe v2 first and
> if not supported choose v3, as mentioned by Drew.
> 
> Couldn't kvm_arm_vgic_probe() return a bitmap by calling
> kvm_create_device on both versions in dryrun mode?

I sent an RFC prototyping what we had in mind - I think - together with
Drew.

[RFC 0/2] hw/arm/virt: kvm: allow gicv3 by default if host does not
support v2

This was discussed earlier in that thread:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg674469.html.

It should not break any compatibility as the only case we are supposed
to change here was aborting before.

Thanks

Eric
> 
> Thanks
> 
> Eric
> 
>>
>> Thanks,
>> drew
>>
>>
diff mbox series

Patch

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 9deb15e7e69..d7df423a7a3 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -551,7 +551,16 @@  static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
                               KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true,
                               &error_abort);
         }
+    } else if (kvm_check_extension(kvm_state, KVM_CAP_DEVICE_CTRL)) {
+        error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
+        error_append_hint(errp,
+                          "Perhaps the host CPU does not support GICv2?\n");
     } else if (ret != -ENODEV && ret != -ENOTSUP) {
+        /*
+         * Very ancient kernel without KVM_CAP_DEVICE_CTRL: assume that
+         * ENODEV or ENOTSUP mean "can't create GICv2 with KVM_CREATE_DEVICE",
+         * and that we will get a GICv2 via KVM_CREATE_IRQCHIP.
+         */
         error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
         return;
     }