Message ID | 1403879569-24256-1-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, 2014-06-27 at 16:32 +0200, Paolo Bonzini wrote: > io-error is for block device errors; it should always be preceded > by a BLOCK_IO_ERROR event. Where does this requirement come from? I only see a loose association of IO_ERROR to disk in libvirt and none in QEMU. > I think vfio wants to use > RUN_STATE_INTERNAL_ERROR instead. But that seems to put us into an "unknown" paused state in libvirt. An I/O error still seems like a more appropriate state. I'm not sure this qualifies as a trivial patch, it is small, but subtle. Thanks, Alex > > Cc: Alex Williamson <alex.williamson@redhat.com> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > hw/misc/vfio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > index 7437c2e..9d71d97 100644 > --- a/hw/misc/vfio.c > +++ b/hw/misc/vfio.c > @@ -3978,7 +3978,7 @@ static void vfio_err_notifier_handler(void *opaque) > __func__, vdev->host.domain, vdev->host.bus, > vdev->host.slot, vdev->host.function); > > - vm_stop(RUN_STATE_IO_ERROR); > + vm_stop(RUN_STATE_INTERNAL_ERROR); > } > > /*
----- Messaggio originale ----- > Da: "Alex Williamson" <alex.williamson@redhat.com> > A: "Paolo Bonzini" <pbonzini@redhat.com> > Cc: qemu-devel@nongnu.org, qemu-trivial@nongnu.org > Inviato: Venerdì, 27 giugno 2014 18:34:59 > Oggetto: Re: [PATCH for 2.1] vfio: use correct runstate > > On Fri, 2014-06-27 at 16:32 +0200, Paolo Bonzini wrote: > > io-error is for block device errors; it should always be preceded > > by a BLOCK_IO_ERROR event. > > Where does this requirement come from? I only see a loose association > of IO_ERROR to disk in libvirt and none in QEMU. See the RunState enum in qapi-schema.json: ## # @RunState # # An enumeration of VM run states. # # ... # # @internal-error: An internal error that prevents further guest execution # has occurred # # @io-error: the last IOP has failed and the device is configured to pause # on I/O errors # # @paused: guest has been paused via the 'stop' command The point of io-error is that management can look at block devices, see if any have an error reported, and then resume execution (see documentation of rerror=stop and werror=stop/enospc). This is counter to the intentions you have in vfio. > > I think vfio wants to use > > RUN_STATE_INTERNAL_ERROR instead. > > But that seems to put us into an "unknown" paused state in libvirt. I think paused is incorrect, because (unlike RUN_STATE_IO_ERROR), you cannot resume from RUN_STATE_INTERNAL_ERROR except with a reset. QEMU enforces that, and this matches the error you are reporting: error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. " "Please collect any data possible and then kill the guest", __func__, vdev->host.domain, vdev->host.bus, vdev->host.slot, vdev->host.function); libvirt has a crashed state, I think that's what libvirt should call the internal-error runstate. IIRC on Xen you get to crashed when the processor raises an error on vmentry, for example. Libvirt only knows about crashed/unknown, but one could add crashed/internal-error too. Paolo
On Fri, 2014-06-27 at 13:51 -0400, Paolo Bonzini wrote: > ----- Messaggio originale ----- > > Da: "Alex Williamson" <alex.williamson@redhat.com> > > A: "Paolo Bonzini" <pbonzini@redhat.com> > > Cc: qemu-devel@nongnu.org, qemu-trivial@nongnu.org > > Inviato: Venerdì, 27 giugno 2014 18:34:59 > > Oggetto: Re: [PATCH for 2.1] vfio: use correct runstate > > > > On Fri, 2014-06-27 at 16:32 +0200, Paolo Bonzini wrote: > > > io-error is for block device errors; it should always be preceded > > > by a BLOCK_IO_ERROR event. > > > > Where does this requirement come from? I only see a loose association > > of IO_ERROR to disk in libvirt and none in QEMU. > > See the RunState enum in qapi-schema.json: > > ## > # @RunState > # > # An enumeration of VM run states. > # > # ... > # > # @internal-error: An internal error that prevents further guest execution > # has occurred > # > # @io-error: the last IOP has failed and the device is configured to pause > # on I/O errors > # > # @paused: guest has been paused via the 'stop' command > > The point of io-error is that management can look at block devices, see if > any have an error reported, and then resume execution (see documentation of > rerror=stop and werror=stop/enospc). This is counter to the intentions you > have in vfio. > > > > I think vfio wants to use > > > RUN_STATE_INTERNAL_ERROR instead. > > > > But that seems to put us into an "unknown" paused state in libvirt. > > I think paused is incorrect, because (unlike RUN_STATE_IO_ERROR), you cannot > resume from RUN_STATE_INTERNAL_ERROR except with a reset. QEMU enforces that, > and this matches the error you are reporting: > > error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. " > "Please collect any data possible and then kill the guest", > __func__, vdev->host.domain, vdev->host.bus, > vdev->host.slot, vdev->host.function); > > libvirt has a crashed state, I think that's what libvirt should call the > internal-error runstate. IIRC on Xen you get to crashed when the processor > raises an error on vmentry, for example. > > Libvirt only knows about crashed/unknown, but one could add crashed/internal-error > too. Ok, I'll go with it. I'll include this in the pull request I intend to do on Monday. Thanks, Alex
[adding libvirt] On 06/27/2014 11:51 AM, Paolo Bonzini wrote: > ----- Messaggio originale ----- >> Da: "Alex Williamson" <alex.williamson@redhat.com> >> A: "Paolo Bonzini" <pbonzini@redhat.com> >> Cc: qemu-devel@nongnu.org, qemu-trivial@nongnu.org >> Inviato: Venerdì, 27 giugno 2014 18:34:59 >> Oggetto: Re: [PATCH for 2.1] vfio: use correct runstate >> >> On Fri, 2014-06-27 at 16:32 +0200, Paolo Bonzini wrote: >>> io-error is for block device errors; it should always be preceded >>> by a BLOCK_IO_ERROR event. >> >> Where does this requirement come from? I only see a loose association >> of IO_ERROR to disk in libvirt and none in QEMU. > > See the RunState enum in qapi-schema.json: > > ## > # @RunState > # > # An enumeration of VM run states. > # > # ... > # > # @internal-error: An internal error that prevents further guest execution > # has occurred > # > # @io-error: the last IOP has failed and the device is configured to pause > # on I/O errors > # > # @paused: guest has been paused via the 'stop' command > > The point of io-error is that management can look at block devices, see if > any have an error reported, and then resume execution (see documentation of > rerror=stop and werror=stop/enospc). This is counter to the intentions you > have in vfio. > >>> I think vfio wants to use >>> RUN_STATE_INTERNAL_ERROR instead. >> >> But that seems to put us into an "unknown" paused state in libvirt. > > I think paused is incorrect, because (unlike RUN_STATE_IO_ERROR), you cannot > resume from RUN_STATE_INTERNAL_ERROR except with a reset. QEMU enforces that, > and this matches the error you are reporting: > > error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. " > "Please collect any data possible and then kill the guest", > __func__, vdev->host.domain, vdev->host.bus, > vdev->host.slot, vdev->host.function); > > libvirt has a crashed state, I think that's what libvirt should call the > internal-error runstate. IIRC on Xen you get to crashed when the processor > raises an error on vmentry, for example. > > Libvirt only knows about crashed/unknown, but one could add crashed/internal-error > too. Yes, we probably need to teach libvirt about this state.
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 7437c2e..9d71d97 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -3978,7 +3978,7 @@ static void vfio_err_notifier_handler(void *opaque) __func__, vdev->host.domain, vdev->host.bus, vdev->host.slot, vdev->host.function); - vm_stop(RUN_STATE_IO_ERROR); + vm_stop(RUN_STATE_INTERNAL_ERROR); } /*
io-error is for block device errors; it should always be preceded by a BLOCK_IO_ERROR event. I think vfio wants to use RUN_STATE_INTERNAL_ERROR instead. Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/misc/vfio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)