[0/5,SRU,XENIAL] Fix nbd panic on ubuntu_nbd_smoke_test

Message ID 20181011144123.24543-1-colin.king@canonical.com

Message

Colin Ian King Oct. 11, 2018, 2:41 p.m. UTC
From: Colin Ian King <colin.king@canonical.com>

BugLink: https://bugs.launchpad.net/bugs/1793464

== SRU Justification ==

When running the Ubuntu nbd autotest regression test we trip a hang
and then, a little later, a panic.  Two upstream fixes are required
because this is actually two issues in one: the first avoids shutting
down the sock while IRQs are disabled, and the second fixes a race in
the nbd ioctl.

== Fix ==

Upstream commits:

23272a6754b81ff6503e09c743bb4ceeeab39997
  nbd: Remove signal usage

1f7b5cf1be4351e60cf8ae7aab976503dd73c5f8
  nbd: Timeouts are not user requested disconnects

0e4f0f6f63d3416a9e529d99febfe98545427b81
  nbd: Cleanup reset of nbd and bdev after a disconnect

c261189862c6f65117eb3b1748622a08ef49c262
  nbd: don't shutdown sock with irq's disabled

97240963eb308d8d21a89c0459822f7ea98463b4
  nbd: fix race in ioctl

The first 3 patches are prerequisites required for the latter two fixes
to apply and work correctly.  Most of these backports needed only minor
patch wiggles, because earlier fixes to this driver had already pulled
in some later upstream patches.
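
One way to confirm that all five upstream commits landed in a tree is to
scan the commit messages for their IDs.  The sketch below is only an
illustration: it assumes each backport carries a provenance line of the
form "(backported from commit <sha>)" or "(cherry picked from commit
<sha>)", and the function name missing_upstream_commits is hypothetical.

```python
import re

# Upstream commit IDs listed in the "Fix" section above.
UPSTREAM_COMMITS = [
    "23272a6754b81ff6503e09c743bb4ceeeab39997",
    "1f7b5cf1be4351e60cf8ae7aab976503dd73c5f8",
    "0e4f0f6f63d3416a9e529d99febfe98545427b81",
    "c261189862c6f65117eb3b1748622a08ef49c262",
    "97240963eb308d8d21a89c0459822f7ea98463b4",
]

# Assumption: every backport records where it came from with a line such as
# "(backported from commit <sha>)" or "(cherry picked from commit <sha>)".
PROVENANCE = re.compile(
    r"\((?:backported|cherry picked) from commit ([0-9a-f]{40})\b"
)


def missing_upstream_commits(log_text, wanted=UPSTREAM_COMMITS):
    """Return the wanted upstream IDs that log_text does not reference."""
    found = set(PROVENANCE.findall(log_text))
    return [sha for sha in wanted if sha not in found]
```

The log_text argument could be the output of "git log" over the range of
interest; an empty result means every listed commit is referenced.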
   

== Regression Potential ==

These fixes touch only nbd, so the regression potential is limited to that driver. In addition, we are pulling in upstream fixes that already exist in the Bionic and Cosmic kernels, so they are tried and tested.

== Test Case ==

  1. Deploy a node with 4.4 Xenial
  2. Run the ubuntu_nbd_smoke_test

Without the fix, we get hangs/crashes.  With the fix, the test can be run
multiple times without any issues.
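
The "run it multiple times" step can be scripted.  This is a minimal
sketch rather than part of the test suite: the helper name
run_repeatedly, the run count, and the timeout are assumptions, and the
command to launch ubuntu_nbd_smoke_test has to be supplied by the caller.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs=5, timeout=600):
    """Run cmd up to `runs` times.

    Returns 0 if every run exits cleanly, otherwise the 1-based index of
    the first run that failed or exceeded `timeout` seconds.
    """
    for i in range(1, runs + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            # The child did not finish in time; treat it as a hang.
            return i
        if result.returncode != 0:
            return i
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the test command as arguments to this script.
    sys.exit(run_repeatedly(sys.argv[1:]))
```

Note that a timeout only catches a hung test process.  A kernel hang or
panic takes the whole node down, so in practice this would be driven from
a separate machine.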

----

Josef Bacik (1):
  nbd: don't shutdown sock with irq's disabled

Markus Pargmann (3):
  nbd: Remove signal usage
  nbd: Timeouts are not user requested disconnects
  nbd: Cleanup reset of nbd and bdev after a disconnect

Vegard Nossum (1):
  nbd: fix race in ioctl

 drivers/block/nbd.c | 200 ++++++++++++++++++++++++++--------------------------
 1 file changed, 99 insertions(+), 101 deletions(-)

Comments

Stefan Bader Oct. 12, 2018, 7:24 a.m. UTC | #1
Limited to single driver which was verified to be fixed.

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Kleber Sacilotto de Souza Oct. 15, 2018, 10:03 a.m. UTC | #2
Limited to nbd, tested.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Khalid Elmously Oct. 23, 2018, 6:14 a.m. UTC | #3