Message ID | 20180228122323.3914-1-jandryuk@gmail.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Series | xen-netfront: Fix hang on device removal |
On 02/28/2018 07:23 AM, Jason Andryuk wrote:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path. In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely. This hang prevents system
> shutdown.
>
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
>
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> Cc: Eduardo Otubo <otubo@redhat.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

On 28/02/18 13:23, Jason Andryuk wrote:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path. In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely. This hang prevents system
> shutdown.
>
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
>
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> Cc: Eduardo Otubo <otubo@redhat.com>

Committed to xen/tip for-linus-4.16a

Juergen

Jason Andryuk:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path. In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely. This hang prevents system
> shutdown.
>
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
>
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")

I think this should go into stable since AFAIK the hanging network
device can only be fixed by rebooting the guest. AFAICS this affects all
4.* branches since 5b5971df3bc2 got backported to them.

Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.

Simon

On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
<simon@invisiblethingslab.com> wrote:
> Jason Andryuk:
>> A toolstack may delete the vif frontend and backend xenstore entries
>> while xen-netfront is in the removal code path. In that case, the
>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>> xennet_remove would hang indefinitely. This hang prevents system
>> shutdown.
>>
>> xennet_remove must be able to handle XenbusStateUnknown, and
>> netback_changed must also wake up the wake_queue for that state as well.
>>
>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>
> I think this should go into stable since AFAIK the hanging network
> device can only be fixed by rebooting the guest. AFAICS this affects all
> 4.* branches since 5b5971df3bc2 got backported to them.
>
> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.

Simon,

Yes, I agree. I actually submitted the request to stable earlier
today, so hopefully it gets added soon.

Have you experienced this hang?

Regards,
Jason

Jason Andryuk:
> On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
> <simon@invisiblethingslab.com> wrote:
>> Jason Andryuk:
>>> A toolstack may delete the vif frontend and backend xenstore entries
>>> while xen-netfront is in the removal code path. In that case, the
>>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>>> xennet_remove would hang indefinitely. This hang prevents system
>>> shutdown.
>>>
>>> xennet_remove must be able to handle XenbusStateUnknown, and
>>> netback_changed must also wake up the wake_queue for that state as well.
>>>
>>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>>
>> I think this should go into stable since AFAIK the hanging network
>> device can only be fixed by rebooting the guest. AFAICS this affects all
>> 4.* branches since 5b5971df3bc2 got backported to them.
>>
>> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.
>
> Simon,
>
> Yes, I agree. I actually submitted the request to stable earlier
> today, so hopefully it gets added soon.

Ok, great. (I checked the stable patch queue, but didn't check the
mailing list archive).

> Have you experienced this hang?

Yes, it's affecting the kernel shipped by Qubes OS (see [1]).

Thanks, Simon.

[1]: https://github.com/QubesOS/qubes-issues/issues/3657

On Thu, Apr 19, 2018 at 4:09 PM, Simon Gaiser
<simon@invisiblethingslab.com> wrote:
> Jason Andryuk:
>> On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
>> <simon@invisiblethingslab.com> wrote:
>>> Jason Andryuk:
>>>> A toolstack may delete the vif frontend and backend xenstore entries
>>>> while xen-netfront is in the removal code path. In that case, the
>>>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>>>> xennet_remove would hang indefinitely. This hang prevents system
>>>> shutdown.
>>>>
>>>> xennet_remove must be able to handle XenbusStateUnknown, and
>>>> netback_changed must also wake up the wake_queue for that state as well.
>>>>
>>>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>>>
>>> I think this should go into stable since AFAIK the hanging network
>>> device can only be fixed by rebooting the guest. AFAICS this affects all
>>> 4.* branches since 5b5971df3bc2 got backported to them.
>>>
>>> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.
>>
>> Simon,
>>
>> Yes, I agree. I actually submitted the request to stable earlier
>> today, so hopefully it gets added soon.
>
> Ok, great. (I checked the stable patch queue, but didn't check the
> mailing list archive).
>
>> Have you experienced this hang?
>
> Yes, it's affecting the kernel shipped by Qubes OS (see [1]).

Ok, interesting. I tracked down this bug with older xenvm tools, and I
didn't know if libxl tools were also affected.

Greg KH added the patch to the stable queue, so it's in the process.

Regards,
Jason

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8328d395e332..3127bc8633ca 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -2005,7 +2005,10 @@ static void netback_changed(struct xenbus_device *dev,
 	case XenbusStateInitialised:
 	case XenbusStateReconfiguring:
 	case XenbusStateReconfigured:
+		break;
+
 	case XenbusStateUnknown:
+		wake_up_all(&module_unload_q);
 		break;
 
 	case XenbusStateInitWait:
@@ -2136,7 +2139,9 @@ static int xennet_remove(struct xenbus_device *dev)
 		xenbus_switch_state(dev, XenbusStateClosing);
 		wait_event(module_unload_q,
 			   xenbus_read_driver_state(dev->otherend) ==
-			   XenbusStateClosing);
+			   XenbusStateClosing ||
+			   xenbus_read_driver_state(dev->otherend) ==
+			   XenbusStateUnknown);
 
 		xenbus_switch_state(dev, XenbusStateClosed);
 		wait_event(module_unload_q,

A toolstack may delete the vif frontend and backend xenstore entries
while xen-netfront is in the removal code path. In that case, the
checks for xenbus_read_driver_state would return XenbusStateUnknown, and
xennet_remove would hang indefinitely. This hang prevents system
shutdown.

xennet_remove must be able to handle XenbusStateUnknown, and
netback_changed must also wake up the wake_queue for that state as well.

Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Cc: Eduardo Otubo <otubo@redhat.com>
---
 drivers/net/xen-netfront.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
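
The interaction between the two hunks can be illustrated with a small userspace model. This is only a sketch, not driver code: `handler_wakes`, `wait_done` and `model_wait` are hypothetical names, the `patched` flag is an illustration device, and the assumption that the pre-patch handler already wakes the waiter for Closing and Closed comes from the driver of that era rather than from the diff above. The enum values follow Xen's public `io/xenbus.h`.

```c
#include <assert.h>
#include <stdbool.h>

/* XenBus states, numbered as in Xen's public io/xenbus.h. */
enum xenbus_state {
	XenbusStateUnknown       = 0,
	XenbusStateInitialising  = 1,
	XenbusStateInitWait      = 2,
	XenbusStateInitialised   = 3,
	XenbusStateConnected     = 4,
	XenbusStateClosing       = 5,
	XenbusStateClosed        = 6,
	XenbusStateReconfiguring = 7,
	XenbusStateReconfigured  = 8,
};

/*
 * Hypothetical model of netback_changed(): does a backend state change
 * wake the waiter in xennet_remove()?
 * ASSUMPTION: Closing/Closed already woke the queue before the patch.
 */
static bool handler_wakes(enum xenbus_state s, bool patched)
{
	switch (s) {
	case XenbusStateClosing:
	case XenbusStateClosed:
		return true;
	case XenbusStateUnknown:
		return patched;	/* first hunk: wake_up_all() for Unknown */
	default:
		return false;
	}
}

/* Hypothetical model of the wait_event() condition in xennet_remove(). */
static bool wait_done(enum xenbus_state s, enum xenbus_state want,
		      bool patched)
{
	if (s == want)
		return true;
	/* second hunk: a vanished backend also ends the wait */
	return patched && s == XenbusStateUnknown;
}

/*
 * Bounded stand-in for wait_event(). trace[0] is the backend state when
 * the wait starts (the condition is always tested once up front); each
 * later entry is a state change that is re-tested only if the handler
 * wakes the waiter. Returns the index that ends the wait, or -1 where
 * the real wait_event() would keep sleeping.
 */
static int model_wait(const enum xenbus_state *trace, int n,
		      enum xenbus_state want, bool patched)
{
	assert(trace && n >= 1);
	for (int i = 0; i < n; i++) {
		bool woken = (i == 0) || handler_wakes(trace[i], patched);

		if (woken && wait_done(trace[i], want, patched))
			return i;
	}
	return -1;
}
```

Feeding the model a trace in which the backend entry disappears (Connected followed by Unknown) never completes without the patch and completes at the Unknown transition with it. A normal Connected to Closing handshake completes either way.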