[C,0/1] Fix write()/fsync() deadlock in write_cache_pages()

Message ID 20190415163210.17463-1-mfo@canonical.com

Message

Mauricio Faria de Oliveira April 15, 2019, 4:32 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1824827

[Impact] 

 * Tasks of a multi-threaded workload doing write() and fsync()
   might deadlock in write_cache_pages(), preventing progress.

 * The fix addresses a corner case in the range_cyclic
   implementation in write_cache_pages() that allows the
   deadlock.

 * Patch:
   - commit 64081362e8ff4587b4554087f3cfc73d3e0a4cd7
     ("mm/page-writeback.c: fix range_cyclic writeback vs
     writepages deadlock"), present in v4.20-rc1~92^2~19.

[Test Case]

 * This issue was originally hit by the 'perforce' (p4d) tool
   on an XFS filesystem, but it is rare and difficult to hit.

 * We've written a userspace program plus a kprobes-based
   kernel module to reproduce this problem and verify the test
   kernel/patch.

 * The kprobes are strictly tied to particular kernel versions
   because of the assembly instruction offsets.  We'll provide
   updated versions for -updates and -proposed for verification.

 * Steps 
   (see output examples in comments):

   - Userspace part:
   $ gcc -o test test.c -pthread

   - Kernel part:
   $ touch Makefile 
   $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o clean
   $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o modules 

   - Shorter hung task timeout and higher console logging level
     to notice the deadlocked tasks sooner, and watch progress:
   $ echo 10 | sudo tee /proc/sys/kernel/hung_task_timeout_secs
   $ echo 9 | sudo tee /proc/sys/kernel/printk 

   - Load module / Run userspace part (logging to kernel log) in XFS:
   $ sudo insmod kprobe-test.ko
   $ cd /path/to/xfs-mountpoint && sudo sh -c 'stdbuf -oL /path/to/test >/dev/kmsg'
   $ sudo rmmod kprobe-test

   With the original kernel you may need to press ctrl-z, as
   'test' doesn't finish.

   - Check kernel log or watch the system console:
   $ dmesg

   - Check threads in D state:
   $ ps -eLo pid,tid,state,comm | grep D | grep -e test -e kworker
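
   The 'grep D' filter above matches a "D" anywhere in the line,
   including in thread names. A minimal sketch of a stricter
   filter, assuming the same 'ps -eLo pid,tid,state,comm' column
   layout; the function name d_state_threads is hypothetical and
   not part of the reproducer:

```shell
# Hypothetical helper (not part of the reproducer).
# Reads 'ps -eLo pid,tid,state,comm' output on stdin and prints only
# the threads whose state column (3rd field) is exactly "D"
# (uninterruptible sleep) and whose command name (4th field)
# contains "test" or "kworker".
d_state_threads() {
    awk '$3 == "D" && ($4 ~ /test/ || $4 ~ /kworker/)'
}

# Usage:
#   ps -eLo pid,tid,state,comm | d_state_threads
```

   Matching on the state field alone avoids false positives from
   command names that happen to contain a capital "D".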


[Regression Potential] 

 * The patch is small but changes core writeback infrastructure,
   so it may _affect_ (though not necessarily _break_) some
   behavior that our regression testing has not validated.
   The regression testing performed is described below.

 * This has been verified with 'xfstests' (not only for XFS fs,
   despite its original name), used by major Linux filesystems
   for regression testing during development. It's been tested
   on systems with 24 and 4 CPUs (to exercise differences in
   scalability, parallelism, and workload) and on both XFS and
   ext4 (reporter's environment + Ubuntu's default).
   No regressions were observed (the set of failed tests is
   the same in each system and tests failed in the same way).
   
 * This has also been verified with 'iozone' write-intensive
   tests, to exercise the writeback mechanism; no errors were
   observed.

 * The reporter has been running the test kernel with the patch
   for weeks and has not observed any other issues/regressions.

[Other Info]
 
 * This is only required in Cosmic (for the Bionic HWE kernel),
   and is already applied in Disco.

Dave Chinner (1):
  mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock

 mm/page-writeback.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

Comments

Khalid Elmously April 23, 2019, 5:14 a.m. UTC | #1
On 2019-04-15 13:32:09 , Mauricio Faria de Oliveira wrote: