Message ID | 20190708005038.13184-1-mfo@canonical.com |
---|---|
Headers | show |
Series | LP#1829563 bcache: risk of data loss on I/O errors in backing or caching devices | expand |
On Sun, Jul 07, 2019 at 09:50:27PM -0300, Mauricio Faria de Oliveira wrote: > BugLink: https://bugs.launchpad.net/bugs/1829563 > > Note: the patchset is relatively large because the support > for error detection/handling is mostly non-existent in 4.15. > All patches are in Cosmic/4.18 except PATCH 11 (other series). > > [Impact] > > * The bcache code in Bionic lacks several fixes to handle > I/O errors in both backing devices and caching devices. > > * Partial or permanent errors in backing or caching devices, > specially in writeback mode, can lead to data loss and/or > the application is not notified about failed I/O requests. > > * The bcache device might remain available for I/O requests > even if backing device is offline, so writes are undefined. > > [Test Case] > > * Detailed test cases/steps for the behavior of many patches > with code logic changes are provided in bug comments. > > * The patchset has been tested for regressions on each cache > mode (writethrough, writeback, writearound, none) with the > xfstests test suite (on ext4) and fio (sequential & random > read-write). > > [Regression Potential] > > * The patchset is relatively large and touches several areas > in bcache code, however, synthetic testing of the patches > has been performed, and extensive regression/stress tests > were run (as mentioned in Test Case section). > > * Many patches in the patchset are 'Fixes' patches to other > patches, and no further 'Fixes' currently exist upstream. > > [Other Info] > > * Canonical Field Eng. deploys bcache+writeback extensively > (e.g., BootStack, UA cloud, except rare all-flash cases). > > [Original Bug Description] > > This is a request for a backport of the following upstream patch from 4.18: > > "bcache: stop bcache device when backing device is offline" > https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee > > Field engineering uses bcache quite extensively and it would be good to have this in the GA/bionic kernel. > > > Coly Li (9): > bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags > bcache: add stop_when_cache_set_failed option to backing device > bcache: add backing_request_endio() for bi_end_io > bcache: add io_disable to struct cached_dev > bcache: store disk name in struct cache and struct cached_dev > bcache: count backing device I/O error for writeback I/O > bcache: add wait_for_kthread_stop() in bch_allocator_thread() > bcache: set dc->io_disable to true in conditional_stop_bcache_device() > bcache: stop bcache device when backing device is offline > > Tang Junhui (2): > bcache: fix inaccurate io state for detached bcache devices > bcache: fix ioctl in flash device > > drivers/md/bcache/alloc.c | 8 +- > drivers/md/bcache/bcache.h | 54 +++++++++ > drivers/md/bcache/btree.c | 11 +- > drivers/md/bcache/debug.c | 3 +- > drivers/md/bcache/io.c | 20 +++- > drivers/md/bcache/journal.c | 4 +- > drivers/md/bcache/request.c | 186 +++++++++++++++++++++++++----- > drivers/md/bcache/super.c | 207 +++++++++++++++++++++++++++++----- > drivers/md/bcache/sysfs.c | 50 +++++++- > drivers/md/bcache/util.h | 6 - > drivers/md/bcache/writeback.c | 37 ++++-- > 11 files changed, 497 insertions(+), 89 deletions(-) I had to backport the same commits for a separate bug (https://bugs.launchpad.net/bugs/1796292) and I produced the same patches with positive test results, therefore: Acked-by: Andrea Righi <andrea.righi@canonical.com>
Hi kernel team, On Sun, Jul 7, 2019 at 9:51 PM Mauricio Faria de Oliveira <mfo@canonical.com> wrote: > > BugLink: https://bugs.launchpad.net/bugs/1829563 Looking for an additional reviewer for this series as it's desired on next cycle (which last day for commits is tomorrow/Jul 17) if at all possible, please. Thank you,
On 7/7/19 5:50 PM, Mauricio Faria de Oliveira wrote: > BugLink: https://bugs.launchpad.net/bugs/1829563 > > Note: the patchset is relatively large because the support > for error detection/handling is mostly non-existent in 4.15. > All patches are in Cosmic/4.18 except PATCH 11 (other series). > > [Impact] > > * The bcache code in Bionic lacks several fixes to handle > I/O errors in both backing devices and caching devices. > > * Partial or permanent errors in backing or caching devices, > specially in writeback mode, can lead to data loss and/or > the application is not notified about failed I/O requests. > > * The bcache device might remain available for I/O requests > even if backing device is offline, so writes are undefined. > > [Test Case] > > * Detailed test cases/steps for the behavior of many patches > with code logic changes are provided in bug comments. > > * The patchset has been tested for regressions on each cache > mode (writethrough, writeback, writearound, none) with the > xfstests test suite (on ext4) and fio (sequential & random > read-write). > > [Regression Potential] > > * The patchset is relatively large and touches several areas > in bcache code, however, synthetic testing of the patches > has been performed, and extensive regression/stress tests > were run (as mentioned in Test Case section). > > * Many patches in the patchset are 'Fixes' patches to other > patches, and no further 'Fixes' currently exist upstream. > > [Other Info] > > * Canonical Field Eng. deploys bcache+writeback extensively > (e.g., BootStack, UA cloud, except rare all-flash cases). > > [Original Bug Description] > > This is a request for a backport of the following upstream patch from 4.18: > > "bcache: stop bcache device when backing device is offline" > https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee > > Field engineering uses bcache quite extensively and it would be good to have this in the GA/bionic kernel. > > > Coly Li (9): > bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags > bcache: add stop_when_cache_set_failed option to backing device > bcache: add backing_request_endio() for bi_end_io > bcache: add io_disable to struct cached_dev > bcache: store disk name in struct cache and struct cached_dev > bcache: count backing device I/O error for writeback I/O > bcache: add wait_for_kthread_stop() in bch_allocator_thread() > bcache: set dc->io_disable to true in conditional_stop_bcache_device() > bcache: stop bcache device when backing device is offline > > Tang Junhui (2): > bcache: fix inaccurate io state for detached bcache devices > bcache: fix ioctl in flash device > > drivers/md/bcache/alloc.c | 8 +- > drivers/md/bcache/bcache.h | 54 +++++++++ > drivers/md/bcache/btree.c | 11 +- > drivers/md/bcache/debug.c | 3 +- > drivers/md/bcache/io.c | 20 +++- > drivers/md/bcache/journal.c | 4 +- > drivers/md/bcache/request.c | 186 +++++++++++++++++++++++++----- > drivers/md/bcache/super.c | 207 +++++++++++++++++++++++++++++----- > drivers/md/bcache/sysfs.c | 50 +++++++- > drivers/md/bcache/util.h | 6 - > drivers/md/bcache/writeback.c | 37 ++++-- > 11 files changed, 497 insertions(+), 89 deletions(-) > + very thoroughly tested Acked-by: Connor Kuehl <connor.kuehl@canonical.com>
On Tue, Jul 16, 2019 at 2:51 PM Connor Kuehl <connor.kuehl@canonical.com> wrote: [snip] > + very thoroughly tested > > Acked-by: Connor Kuehl <connor.kuehl@canonical.com> Connor, thanks for reviewing!
On Fri, Jul 12, 2019 at 3:25 AM Andrea Righi <andrea.righi@canonical.com> wrote: [snip] > I had to backport the same commits for a separate bug > (https://bugs.launchpad.net/bugs/1796292) and I produced the same > patches with positive test results, therefore: > > Acked-by: Andrea Righi <andrea.righi@canonical.com> Thanks for the review, Andrea!
On 7/8/19 2:50 AM, Mauricio Faria de Oliveira wrote: > BugLink: https://bugs.launchpad.net/bugs/1829563 > > Note: the patchset is relatively large because the support > for error detection/handling is mostly non-existent in 4.15. > All patches are in Cosmic/4.18 except PATCH 11 (other series). > > [Impact] > > * The bcache code in Bionic lacks several fixes to handle > I/O errors in both backing devices and caching devices. > > * Partial or permanent errors in backing or caching devices, > specially in writeback mode, can lead to data loss and/or > the application is not notified about failed I/O requests. > > * The bcache device might remain available for I/O requests > even if backing device is offline, so writes are undefined. > > [Test Case] > > * Detailed test cases/steps for the behavior of many patches > with code logic changes are provided in bug comments. > > * The patchset has been tested for regressions on each cache > mode (writethrough, writeback, writearound, none) with the > xfstests test suite (on ext4) and fio (sequential & random > read-write). > > [Regression Potential] > > * The patchset is relatively large and touches several areas > in bcache code, however, synthetic testing of the patches > has been performed, and extensive regression/stress tests > were run (as mentioned in Test Case section). > > * Many patches in the patchset are 'Fixes' patches to other > patches, and no further 'Fixes' currently exist upstream. > > [Other Info] > > * Canonical Field Eng. deploys bcache+writeback extensively > (e.g., BootStack, UA cloud, except rare all-flash cases). > > [Original Bug Description] > > This is a request for a backport of the following upstream patch from 4.18: > > "bcache: stop bcache device when backing device is offline" > https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee > > Field engineering uses bcache quite extensively and it would be good to have this in the GA/bionic kernel. Applied to bionic/master-next branch, with some context adjustments to - bcache: add io_disable to struct cached_dev Thanks, Kleber > > > Coly Li (9): > bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags > bcache: add stop_when_cache_set_failed option to backing device > bcache: add backing_request_endio() for bi_end_io > bcache: add io_disable to struct cached_dev > bcache: store disk name in struct cache and struct cached_dev > bcache: count backing device I/O error for writeback I/O > bcache: add wait_for_kthread_stop() in bch_allocator_thread() > bcache: set dc->io_disable to true in conditional_stop_bcache_device() > bcache: stop bcache device when backing device is offline > > Tang Junhui (2): > bcache: fix inaccurate io state for detached bcache devices > bcache: fix ioctl in flash device > > drivers/md/bcache/alloc.c | 8 +- > drivers/md/bcache/bcache.h | 54 +++++++++ > drivers/md/bcache/btree.c | 11 +- > drivers/md/bcache/debug.c | 3 +- > drivers/md/bcache/io.c | 20 +++- > drivers/md/bcache/journal.c | 4 +- > drivers/md/bcache/request.c | 186 +++++++++++++++++++++++++----- > drivers/md/bcache/super.c | 207 +++++++++++++++++++++++++++++----- > drivers/md/bcache/sysfs.c | 50 +++++++- > drivers/md/bcache/util.h | 6 - > drivers/md/bcache/writeback.c | 37 ++++-- > 11 files changed, 497 insertions(+), 89 deletions(-) >