diff mbox series

[RFC] vfio-ccw: Permit missing IRQs

Message ID 20210419184906.2847283-1-farman@linux.ibm.com
State New
Headers show
Series [RFC] vfio-ccw: Permit missing IRQs | expand

Commit Message

Eric Farman April 19, 2021, 6:49 p.m. UTC
Commit 690e29b91102 ("vfio-ccw: Refactor ccw irq handler") changed
one of the checks for the IRQ notifier registration from saying
"the host needs to recognize the only IRQ that exists" to saying
"the host needs to recognize ANY IRQ that exists."

And this worked fine, because the subsequent change to support the
CRW IRQ notifier doesn't get into this code when running on an older
kernel, thanks to a guard by a capability region. The later addition
of the REQ(uest) IRQ by commit b2f96f9e4f5f ("vfio-ccw: Connect the
device request notifier") broke this assumption because there is no
matching capability region. Thus, running new QEMU on an older
kernel fails with:

  vfio: unexpected number of irqs 2

Let's simply remove the check (and the less-than-helpful message),
and make the VFIO_DEVICE_GET_IRQ_INFO ioctl request for the IRQ
being processed. If it returns with EINVAL, we can treat it as
an unfortunate mismatch but not a fatal error for the guest.

Fixes: 690e29b91102 ("vfio-ccw: Refactor ccw irq handler")
Fixes: b2f96f9e4f5f ("vfio-ccw: Connect the device request notifier")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 hw/vfio/ccw.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Comments

Cornelia Huck April 21, 2021, 10:01 a.m. UTC | #1
On Mon, 19 Apr 2021 20:49:06 +0200
Eric Farman <farman@linux.ibm.com> wrote:

> Commit 690e29b91102 ("vfio-ccw: Refactor ccw irq handler") changed
> one of the checks for the IRQ notifier registration from saying
> "the host needs to recognize the only IRQ that exists" to saying
> "the host needs to recognize ANY IRQ that exists."
> 
> And this worked fine, because the subsequent change to support the
> CRW IRQ notifier doesn't get into this code when running on an older
> kernel, thanks to a guard by a capability region. The later addition
> of the REQ(uest) IRQ by commit b2f96f9e4f5f ("vfio-ccw: Connect the
> device request notifier") broke this assumption because there is no
> matching capability region. Thus, running new QEMU on an older
> kernel fails with:
> 
>   vfio: unexpected number of irqs 2
> 
> Let's simply remove the check (and the less-than-helpful message),
> and make the VFIO_DEVICE_GET_IRQ_INFO ioctl request for the IRQ
> being processed. If it returns with EINVAL, we can treat it as
> an unfortunate mismatch but not a fatal error for the guest.
> 
> Fixes: 690e29b91102 ("vfio-ccw: Refactor ccw irq handler")
> Fixes: b2f96f9e4f5f ("vfio-ccw: Connect the device request notifier")
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  hw/vfio/ccw.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index b2df708e4b..cfbfc3d1a2 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -411,20 +411,19 @@ static void vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
>          return;
>      }
>  
> -    if (vdev->num_irqs < irq + 1) {
> -        error_setg(errp, "vfio: unexpected number of irqs %u",
> -                   vdev->num_irqs);

Alternative proposal: Change this message to

"vfio: IRQ %u not available (number of irqs %u)"

and still fail this function, while treating a failure of
vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_REQ_IRQ_INDEX, &err); in
vfio_ccw_realize() as a non-fatal error (maybe log a message).

This allows us to skip doing an ioctl call, of which we already know
that it would fail. Still, we can catch cases where a broken kernel e.g.
provides the crw region, but not the matching irq (I believe something
like that should indeed be a fatal error.)

> -        return;
> -    }
> -
>      argsz = sizeof(*irq_info);
>      irq_info = g_malloc0(argsz);
>      irq_info->index = irq;
>      irq_info->argsz = argsz;
>      if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
>                irq_info) < 0 || irq_info->count < 1) {
> -        error_setg_errno(errp, errno, "vfio: Error getting irq info");
> -        goto out_free_info;
> +        if (errno == EINVAL) {
> +            warn_report("Unable to get information about IRQ %u", irq);
> +            goto out_free_info;
> +        } else {
> +            error_setg_errno(errp, errno, "vfio: Error getting irq info");
> +            goto out_free_info;
> +        }
>      }
>  
>      if (event_notifier_init(notifier, 0)) {
Eric Farman April 21, 2021, 12:28 p.m. UTC | #2
On Wed, 2021-04-21 at 12:01 +0200, Cornelia Huck wrote:
> On Mon, 19 Apr 2021 20:49:06 +0200
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > Commit 690e29b91102 ("vfio-ccw: Refactor ccw irq handler") changed
> > one of the checks for the IRQ notifier registration from saying
> > "the host needs to recognize the only IRQ that exists" to saying
> > "the host needs to recognize ANY IRQ that exists."
> > 
> > And this worked fine, because the subsequent change to support the
> > CRW IRQ notifier doesn't get into this code when running on an
> > older
> > kernel, thanks to a guard by a capability region. The later
> > addition
> > of the REQ(uest) IRQ by commit b2f96f9e4f5f ("vfio-ccw: Connect the
> > device request notifier") broke this assumption because there is no
> > matching capability region. Thus, running new QEMU on an older
> > kernel fails with:
> > 
> >   vfio: unexpected number of irqs 2
> > 
> > Let's simply remove the check (and the less-than-helpful message),
> > and make the VFIO_DEVICE_GET_IRQ_INFO ioctl request for the IRQ
> > being processed. If it returns with EINVAL, we can treat it as
> > an unfortunate mismatch but not a fatal error for the guest.
> > 
> > Fixes: 690e29b91102 ("vfio-ccw: Refactor ccw irq handler")
> > Fixes: b2f96f9e4f5f ("vfio-ccw: Connect the device request
> > notifier")
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  hw/vfio/ccw.c | 15 +++++++--------
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > index b2df708e4b..cfbfc3d1a2 100644
> > --- a/hw/vfio/ccw.c
> > +++ b/hw/vfio/ccw.c
> > @@ -411,20 +411,19 @@ static void
> > vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
> >          return;
> >      }
> >  
> > -    if (vdev->num_irqs < irq + 1) {
> > -        error_setg(errp, "vfio: unexpected number of irqs %u",
> > -                   vdev->num_irqs);
> 
> Alternative proposal: Change this message to
> 
> "vfio: IRQ %u not available (number of irqs %u)"

> and still fail this function, while treating a failure of
> vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_REQ_IRQ_INDEX, &err);
> in
> vfio_ccw_realize() as a non-fatal error (maybe log a message).

This all sounds fine to me. I'll send a v2 as such.

> 
> This allows us to skip doing an ioctl call, of which we already know
> that it would fail. 

True, though as this is at the configuration time it's not as critical.

> Still, we can catch cases where a broken kernel e.g.
> provides the crw region, but not the matching irq (I believe
> something
> like that should indeed be a fatal error.)

Well they shouldn't do THAT. :)

> 
> > -        return;
> > -    }
> > -
> >      argsz = sizeof(*irq_info);
> >      irq_info = g_malloc0(argsz);
> >      irq_info->index = irq;
> >      irq_info->argsz = argsz;
> >      if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
> >                irq_info) < 0 || irq_info->count < 1) {
> > -        error_setg_errno(errp, errno, "vfio: Error getting irq
> > info");
> > -        goto out_free_info;
> > +        if (errno == EINVAL) {
> > +            warn_report("Unable to get information about IRQ %u",
> > irq);
> > +            goto out_free_info;
> > +        } else {
> > +            error_setg_errno(errp, errno, "vfio: Error getting irq
> > info");
> > +            goto out_free_info;
> > +        }
> >      }
> >  
> >      if (event_notifier_init(notifier, 0)) {
diff mbox series

Patch

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index b2df708e4b..cfbfc3d1a2 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -411,20 +411,19 @@  static void vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
         return;
     }
 
-    if (vdev->num_irqs < irq + 1) {
-        error_setg(errp, "vfio: unexpected number of irqs %u",
-                   vdev->num_irqs);
-        return;
-    }
-
     argsz = sizeof(*irq_info);
     irq_info = g_malloc0(argsz);
     irq_info->index = irq;
     irq_info->argsz = argsz;
     if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
               irq_info) < 0 || irq_info->count < 1) {
-        error_setg_errno(errp, errno, "vfio: Error getting irq info");
-        goto out_free_info;
+        if (errno == EINVAL) {
+            warn_report("Unable to get information about IRQ %u", irq);
+            goto out_free_info;
+        } else {
+            error_setg_errno(errp, errno, "vfio: Error getting irq info");
+            goto out_free_info;
+        }
     }
 
     if (event_notifier_init(notifier, 0)) {