diff mbox series

[1/2] xen-bus: Fix backend state transition on device reset

Message ID 20190821092020.17952-2-anthony.perard@citrix.com
State New
Series Fix for the xen-bus driver

Commit Message

Anthony PERARD Aug. 21, 2019, 9:20 a.m. UTC
When a frontend wants to reset its state and the backend one, it starts
by setting "Closing", then waits for the backend (QEMU) to do the same.

But when QEMU sets its state to "Closing", it triggers an event
(xenstore watch) that re-executes xen_device_backend_changed() and sets
the backend state to "Closed". QEMU should wait for the frontend to
set "Closed" before doing the same.

Before setting the backend state to "Closed", we are also going to
check if the frontend was responsible for the transition to "Closing".

Fixes: b6af8926fb858c4f1426e5acb2cfc1f0580ec98a
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
Cc: qemu-stable@nongnu.org
---
 hw/xen/xen-bus.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
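
The sequence described in the commit message can be sketched as two standalone predicates. This is only an illustration and not code from hw/xen/xen-bus.c: closes_before_patch() and closes_after_patch() are hypothetical names, and the enum is redefined locally with the values from Xen's public io/xenbus.h so that the snippet compiles on its own.

```c
#include <stdbool.h>

/* Local copy of the xenbus states (values as in Xen's io/xenbus.h). */
enum XenbusState {
    XenbusStateUnknown = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait = 2,
    XenbusStateInitialised = 3,
    XenbusStateConnected = 4,
    XenbusStateClosing = 5,
    XenbusStateClosed = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured = 8,
};

/* Hypothetical stand-in for the check before this patch. */
bool closes_before_patch(enum XenbusState backend, enum XenbusState frontend)
{
    return backend == XenbusStateClosing &&
           frontend != XenbusStateConnected;
}

/* Hypothetical stand-in for the check with this patch applied. */
bool closes_after_patch(enum XenbusState backend, enum XenbusState frontend)
{
    return backend == XenbusStateClosing &&
           frontend != XenbusStateClosing &&
           frontend != XenbusStateConnected;
}
```

During a frontend-initiated reset both ends are at Closing when the watch fires. The first predicate is true for that pair, so the backend would go to Closed too early; the second is false, so the backend keeps waiting until the frontend reaches Closed.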

Comments

Paul Durrant Aug. 21, 2019, 9:36 a.m. UTC | #1
> -----Original Message-----
> From: Anthony PERARD <anthony.perard@citrix.com>
> Sent: 21 August 2019 10:20
> To: qemu-devel@nongnu.org
> Cc: Anthony Perard <anthony.perard@citrix.com>; qemu-stable@nongnu.org; Stefano Stabellini
> <sstabellini@kernel.org>; Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Subject: [PATCH 1/2] xen-bus: Fix backend state transition on device reset
> 
> When a frontend want to reset its state and the backend one, it start
> with setting "Closing", then wait for the backend (QEMU) to do the same.
> 
> But when QEMU is setting "Closing" to its state, it trigger an event
> (xenstore watch) that re-execute xen_device_backend_changed() and set
> the backend state to "Closed". QEMU should wait for the frontend to
> set "Closed" before doing the same.
> 
> Before setting "Closed" to the backend_state, we are also going to
> check if the frontend was responsible for the transition to "Closing".
> 
> Fixes: b6af8926fb858c4f1426e5acb2cfc1f0580ec98a
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
> Cc: qemu-stable@nongnu.org
> ---
>  hw/xen/xen-bus.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
> index e40500242d..982eca4533 100644
> --- a/hw/xen/xen-bus.c
> +++ b/hw/xen/xen-bus.c
> @@ -540,9 +540,11 @@ static void xen_device_backend_changed(void *opaque)
>      /*
>       * If the toolstack (or unplug request callback) has set the backend
>       * state to Closing, but there is no active frontend (i.e. the
> -     * state is not Connected) then set the backend state to Closed.
> +     * state is not Connected or Closing) then set the backend state
> +     * to Closed.
>       */
>      if (xendev->backend_state == XenbusStateClosing &&
> +        xendev->frontend_state != XenbusStateClosing &&
>          xendev->frontend_state != XenbusStateConnected) {
>          xen_device_backend_set_state(xendev, XenbusStateClosed);

Actually, I wonder whether it is better to 'whitelist' here? AFAIK the only valid frontend states where the backend should set itself 'closed' are 'closed' (i.e. the frontend is finished) and 'initialising' (the frontend was never there).

  Paul

>      }
> --
> Anthony PERARD

Anthony PERARD Aug. 22, 2019, 9:50 a.m. UTC | #2
On Wed, Aug 21, 2019 at 10:36:40AM +0100, Paul Durrant wrote:
> > diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
> > index e40500242d..982eca4533 100644
> > --- a/hw/xen/xen-bus.c
> > +++ b/hw/xen/xen-bus.c
> > @@ -540,9 +540,11 @@ static void xen_device_backend_changed(void *opaque)
> >      /*
> >       * If the toolstack (or unplug request callback) has set the backend
> >       * state to Closing, but there is no active frontend (i.e. the
> > -     * state is not Connected) then set the backend state to Closed.
> > +     * state is not Connected or Closing) then set the backend state
> > +     * to Closed.
> >       */
> >      if (xendev->backend_state == XenbusStateClosing &&
> > +        xendev->frontend_state != XenbusStateClosing &&
> >          xendev->frontend_state != XenbusStateConnected) {
> >          xen_device_backend_set_state(xendev, XenbusStateClosed);
> 
> Actually, I wonder whether it is better to 'whitelist' here? AFAIK the only valid frontend states whether the backend should set itself 'closed' are 'closed' (i.e. the frontend is finished) and 'initialising' (the frontend was never there).

Let's see, what are the reasons for backend=Closing?
    - frontend changed to Closing (because it wants to disconnect)
    - toolstack(libxl) or QEMU(unplug request) set the state to Closing,
      but also online to 0.

What should the backend do in both cases:
    - frontend alive: backend should wait
        frontend state might be InitWait, Initialised, Connected,
        Closing.
    - frontend not existing or disconnected: backend can skip waiting
      and go to the next step, Closed.
        frontend might be Initialising, Closed.
        But there are also Unknown, Reconfiguring and Reconfigured which
        are probably errors.

So, the whitelist with Closed and Initialising is a good start, but what
about the Unknown state? (QEMU doesn't have backends where the state
Reconfigur* is possible, so they can be mapped to Unknown for now).

Cheers,

Paul Durrant Aug. 22, 2019, 9:59 a.m. UTC | #3
> -----Original Message-----
> From: Anthony PERARD <anthony.perard@citrix.com>
> Sent: 22 August 2019 10:51
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: qemu-devel@nongnu.org; qemu-stable@nongnu.org; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH 1/2] xen-bus: Fix backend state transition on device reset
> 
> On Wed, Aug 21, 2019 at 10:36:40AM +0100, Paul Durrant wrote:
> > > diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
> > > index e40500242d..982eca4533 100644
> > > --- a/hw/xen/xen-bus.c
> > > +++ b/hw/xen/xen-bus.c
> > > @@ -540,9 +540,11 @@ static void xen_device_backend_changed(void *opaque)
> > >      /*
> > >       * If the toolstack (or unplug request callback) has set the backend
> > >       * state to Closing, but there is no active frontend (i.e. the
> > > -     * state is not Connected) then set the backend state to Closed.
> > > +     * state is not Connected or Closing) then set the backend state
> > > +     * to Closed.
> > >       */
> > >      if (xendev->backend_state == XenbusStateClosing &&
> > > +        xendev->frontend_state != XenbusStateClosing &&
> > >          xendev->frontend_state != XenbusStateConnected) {
> > >          xen_device_backend_set_state(xendev, XenbusStateClosed);
> >
> > Actually, I wonder whether it is better to 'whitelist' here? AFAIK the only valid frontend states
> whether the backend should set itself 'closed' are 'closed' (i.e. the frontend is finished) and
> 'initialising' (the frontend was never there).
> 
> Let's see, what are the reason backend=Closing?
>     - frontend changed to Closing (because it wants to disconnect)
>     - toolstack(libxl) or QEMU(unplug request) set the state to Closing,
>       but also online to 0.
> 
> What should the backend do in both case:
>     - frontend alive: backend should wait
>         frontend state might be InitWait, Initialised, Connected,
>         Closing.
>     - frontend not existing or disconnected: backend can skip waiting
>       and go to the next step, Closed.
>         frontend might be Initialising, Closed.
>         But there are also Unknown, Reconfiguring and Reconfigured which
>         are probably errors.
> 
> So, the whitelist with Closed and Initialising is a good start, but what
> about the Unknown state? (QEMU doesn't have backends were the state
> Reconfigur* is possible, so they can be mapped to Unknown for now).

I guess we should consider Unknown (basically a missing xenstore state key) to mean that either an admin or the frontend has screwed up, or that the frontend is malicious, so I think we just close down the backend straight away. So maybe list InitWait, Initialised, Connected, and Closing as frontend states that are 'good' (i.e. we wait in anticipation of the frontend eventually getting to Closed) and then say all other states result in immediate close of the backend. It is probably worth having a helper function for saying whether a state is good or not.

  Cheers,

    Paul

> 
> Cheers,
> 
> --
> Anthony PERARD

Anthony PERARD Aug. 22, 2019, 3:01 p.m. UTC | #4
On Thu, Aug 22, 2019 at 10:59:38AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Anthony PERARD <anthony.perard@citrix.com>
> > Sent: 22 August 2019 10:51
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: qemu-devel@nongnu.org; qemu-stable@nongnu.org; Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel@lists.xenproject.org
> > Subject: Re: [PATCH 1/2] xen-bus: Fix backend state transition on device reset
> > 
> > On Wed, Aug 21, 2019 at 10:36:40AM +0100, Paul Durrant wrote:
> > > > diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
> > > > index e40500242d..982eca4533 100644
> > > > --- a/hw/xen/xen-bus.c
> > > > +++ b/hw/xen/xen-bus.c
> > > > @@ -540,9 +540,11 @@ static void xen_device_backend_changed(void *opaque)
> > > >      /*
> > > >       * If the toolstack (or unplug request callback) has set the backend
> > > >       * state to Closing, but there is no active frontend (i.e. the
> > > > -     * state is not Connected) then set the backend state to Closed.
> > > > +     * state is not Connected or Closing) then set the backend state
> > > > +     * to Closed.
> > > >       */
> > > >      if (xendev->backend_state == XenbusStateClosing &&
> > > > +        xendev->frontend_state != XenbusStateClosing &&
> > > >          xendev->frontend_state != XenbusStateConnected) {
> > > >          xen_device_backend_set_state(xendev, XenbusStateClosed);
> > >
> > > Actually, I wonder whether it is better to 'whitelist' here? AFAIK the only valid frontend states
> > whether the backend should set itself 'closed' are 'closed' (i.e. the frontend is finished) and
> > 'initialising' (the frontend was never there).
> > 
> > Let's see, what are the reason backend=Closing?
> >     - frontend changed to Closing (because it wants to disconnect)
> >     - toolstack(libxl) or QEMU(unplug request) set the state to Closing,
> >       but also online to 0.
> > 
> > What should the backend do in both case:
> >     - frontend alive: backend should wait
> >         frontend state might be InitWait, Initialised, Connected,
> >         Closing.
> >     - frontend not existing or disconnected: backend can skip waiting
> >       and go to the next step, Closed.
> >         frontend might be Initialising, Closed.
> >         But there are also Unknown, Reconfiguring and Reconfigured which
> >         are probably errors.
> > 
> > So, the whitelist with Closed and Initialising is a good start, but what
> > about the Unknown state? (QEMU doesn't have backends were the state
> > Reconfigur* is possible, so they can be mapped to Unknown for now).
> 
> I guess we should consider Unknown (basically a missing xenstore state key) to mean either an admin, or the frontend has screwed up or is malicious so I think we just close down the backend straight away. So maybe listing InitWait, Initialised, Connected, and Closing as frontend states that are 'good' (i.e. we wait in anticipation of the frontend eventually getting to Closed) and then say all other states result in immediate close of the backend. Probably worth having a helper function for saying whether a state is good or not.

Sounds good, but I'll use "active" instead of "good" to name the helper
as that feels more accurate to me. Also "active" is already used in the
comment. I'll name the new helper xen_device_state_is_active().
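
A minimal sketch of what such a helper could look like, based only on the list of states agreed above. It is an assumption about the shape of the follow-up, not the code from a later revision: the enum is redefined locally (values as in Xen's public io/xenbus.h) so the snippet stands alone, and backend_should_close() is a hypothetical wrapper added only to show how the helper would be used.

```c
#include <stdbool.h>

/* Local copy of the xenbus states (values as in Xen's io/xenbus.h). */
enum XenbusState {
    XenbusStateUnknown = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait = 2,
    XenbusStateInitialised = 3,
    XenbusStateConnected = 4,
    XenbusStateClosing = 5,
    XenbusStateClosed = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured = 8,
};

/*
 * A frontend in one of these states is expected to reach Closed
 * eventually, so the backend should wait for it.
 */
bool xen_device_state_is_active(enum XenbusState state)
{
    switch (state) {
    case XenbusStateInitWait:
    case XenbusStateInitialised:
    case XenbusStateConnected:
    case XenbusStateClosing:
        return true;
    default:
        /* Unknown, Initialising, Closed, Reconfigur*: nothing to wait for. */
        return false;
    }
}

/* Hypothetical wrapper: may a backend in Closing go straight to Closed? */
bool backend_should_close(enum XenbusState backend, enum XenbusState frontend)
{
    return backend == XenbusStateClosing &&
           !xen_device_state_is_active(frontend);
}
```

With this shape, any state that is not explicitly listed (including a missing xenstore state key read as Unknown) results in an immediate close of the backend, which matches the behaviour suggested above.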

Thanks,

Patch

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index e40500242d..982eca4533 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -540,9 +540,11 @@  static void xen_device_backend_changed(void *opaque)
     /*
      * If the toolstack (or unplug request callback) has set the backend
      * state to Closing, but there is no active frontend (i.e. the
-     * state is not Connected) then set the backend state to Closed.
+     * state is not Connected or Closing) then set the backend state
+     * to Closed.
      */
     if (xendev->backend_state == XenbusStateClosing &&
+        xendev->frontend_state != XenbusStateClosing &&
         xendev->frontend_state != XenbusStateConnected) {
         xen_device_backend_set_state(xendev, XenbusStateClosed);
     }