[SRU,Bionic,0/1] Fix md raid deadlock during resync

Message ID 20190715225724.21890-1-connor.kuehl@canonical.com

Message

Connor Kuehl July 15, 2019, 10:57 p.m. UTC
[Impact]

* If regular and resync IO happen at the same time during a regular IO
  split, the split bio will wait until resync IO finishes while at the
  same time the resync IO is waiting for regular IO to finish. This
  results in deadlock.

* I believe this only impacts Bionic as Disco+ already contains this
  commit. Xenial doesn't contain the commit that this one fixes.
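
As a rough illustration of the circular wait described above, here is a
minimal model in C. It is a sketch under stated assumptions, not the raid10
code: `struct barrier_model` and every helper name are hypothetical, and the
only idea carried over is that regular IO waits for the barrier to drop while
resync IO waits for `nr_pending` to reach zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the two counters involved. */
struct barrier_model {
    int nr_pending; /* regular IO requests that have passed the barrier */
    int barrier;    /* resync requests that have raised the barrier */
};

/* Regular IO may pass only while no barrier is raised. */
static bool regular_io_can_proceed(const struct barrier_model *m)
{
    return m->barrier == 0;
}

/* Resync IO may start only once all regular IO has drained. */
static bool resync_can_proceed(const struct barrier_model *m)
{
    return m->nr_pending == 0;
}

/*
 * Neither side can make progress. This is only a deadlock when the
 * holder of nr_pending is the same thread that is about to wait on
 * the barrier, which is the case for the split bio below.
 */
static bool is_deadlocked(const struct barrier_model *m)
{
    return !regular_io_can_proceed(m) && !resync_can_proceed(m);
}

/* The scenario from the description, without the fix. */
static bool split_without_fix_deadlocks(void)
{
    struct barrier_model m = { 0, 0 };

    m.nr_pending++; /* regular bio passes the barrier */
    m.barrier++;    /* resync raises the barrier, waits for nr_pending == 0 */

    /*
     * The bio is split and the remainder must pass the barrier again
     * while the submitting thread still holds its nr_pending reference.
     */
    return is_deadlocked(&m);
}
```

In this model `split_without_fix_deadlocks()` returns true: the remainder of
the split bio waits on `barrier`, and the resync waits on the `nr_pending`
reference held by that same submitter.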

[Test Case]

The test kernel containing this commit received positive feedback in the
launchpad bug.

From the launchpad bug comment #10:

"For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd, 
unpack anaconda archives, after minute or two deadlocked"

[Regression Potential]

* This fix has been in mainline since December 2018 and I don't see any
  fixup commits upstream referencing this one. The small number of
  changes in this commit seems reasonable for managing the `nr_pending`
  adjustments that preclude either regular or resync IO.
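
To make the `nr_pending` adjustment concrete, the sketch below models one way
such a fix can work: the submitter drops its `nr_pending` reference before
resubmitting the remainder of a split bio, then takes it again. This is an
assumption about the shape of the change, not a copy of the patch; the
`model_*` helpers and `struct barrier_model` are hypothetical stand-ins and do
not sleep or wake anything the way the real driver does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the two counters involved. */
struct barrier_model {
    int nr_pending; /* regular IO requests that have passed the barrier */
    int barrier;    /* resync requests that have raised the barrier */
};

/* Drop the caller's regular IO reference. */
static void model_allow_barrier(struct barrier_model *m)
{
    m->nr_pending--;
}

/* Take a regular IO reference; false means the caller would have to wait. */
static bool model_try_wait_barrier(struct barrier_model *m)
{
    if (m->barrier > 0)
        return false;
    m->nr_pending++;
    return true;
}

/* Run the pending resync and lower the barrier; false means it must wait. */
static bool model_try_resync(struct barrier_model *m)
{
    if (m->nr_pending != 0)
        return false;
    m->barrier--;
    return true;
}

/*
 * Split scenario with the reference dropped around the resubmission.
 * Returns the final nr_pending, or -1 if either side got stuck.
 */
static int split_with_fix(void)
{
    struct barrier_model m = { 0, 0 };
    bool resync_ok;
    bool regular_ok;

    model_try_wait_barrier(&m); /* regular bio passes: nr_pending == 1 */
    m.barrier++;                /* resync raises the barrier and waits */

    /* The bio is split here. */
    model_allow_barrier(&m);                /* nr_pending == 0 */
    resync_ok = model_try_resync(&m);       /* resync runs, barrier == 0 */
    regular_ok = model_try_wait_barrier(&m); /* nr_pending == 1 again */

    return (resync_ok && regular_ok) ? m.nr_pending : -1;
}
```

The net change to `nr_pending` across the release and re-acquire is zero, so
in this model the accounting for the in-flight bio is unchanged; the only
difference is the window in which the resync can get through.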

Guoqing Jiang (1):
  md: fix raid10 hang issue caused by barrier

 drivers/md/raid10.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Andrea Righi July 16, 2019, 6:41 a.m. UTC | #1
On Mon, Jul 15, 2019 at 03:57:23PM -0700, Connor Kuehl wrote:
> [Impact]
> 
> * If regular and resync IO happen at the same time during a regular IO
>   split, the split bio will wait until resync IO finishes while at the
>   same time the resync IO is waiting for regular IO to finish. This
>   results in deadlock.
> 
> * I believe this only impacts Bionic as Disco+ already contains this
>   commit. Xenial doesn't contain the commit that this one fixes.
> 
> [Test Case]
> 
> The test kernel containing this commit received positive feedback in the
> launchpad bug.
> 
> From the launchpad bug comment #10:
> 
> "For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd, 
> unpack anaconda archives, after minute or two deadlocked"
> 
> [Regression Potential]
> 
> * This fix has been in mainline since December 2018 and I don't see any
>   fixup commits upstream referencing this one. The small number of
>   changes in this commit seem reasonable for managing the `nr_pending`
>   adjustments which preclude either regular or resync IO.
> 
> Guoqing Jiang (1):
>   md: fix raid10 hang issue caused by barrier
> 
>  drivers/md/raid10.c | 4 ++++
>  1 file changed, 4 insertions(+)

Clean cherry pick with positive test results. Looks good to me.

Acked-by: Andrea Righi <andrea.righi@canonical.com>

Po-Hsu Lin July 17, 2019, 2:30 a.m. UTC | #2
+1 with clean cherry-pick and positive test result.
Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>

Kleber Sacilotto de Souza July 17, 2019, 9:58 a.m. UTC | #3
On 7/16/19 12:57 AM, Connor Kuehl wrote:
> [Impact]
> 
> * If regular and resync IO happen at the same time during a regular IO
>   split, the split bio will wait until resync IO finishes while at the
>   same time the resync IO is waiting for regular IO to finish. This
>   results in deadlock.
> 
> * I believe this only impacts Bionic as Disco+ already contains this
>   commit. Xenial doesn't contain the commit that this one fixes.
> 
> [Test Case]
> 
> The test kernel containing this commit received positive feedback in the
> launchpad bug.
> 
> From the launchpad bug comment #10:
> 
> "For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd, 
> unpack anaconda archives, after minute or two deadlocked"
> 
> [Regression Potential]
> 
> * This fix has been in mainline since December 2018 and I don't see any
>   fixup commits upstream referencing this one. The small number of
>   changes in this commit seem reasonable for managing the `nr_pending`
>   adjustments which preclude either regular or resync IO.
> 
> Guoqing Jiang (1):
>   md: fix raid10 hang issue caused by barrier
> 
>  drivers/md/raid10.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 

This patch was already applied to bionic/master-next branch.

Thanks,
Kleber