From patchwork Mon Jul 8 00:50:32 2019
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1128795
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [B][PATCH 05/11] bcache: add io_disable to struct cached_dev
Date: Sun, 7 Jul 2019 21:50:32 -0300
Message-Id: <20190708005038.13184-6-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190708005038.13184-1-mfo@canonical.com>
References: <20190708005038.13184-1-mfo@canonical.com>

From: Coly Li

BugLink: https://bugs.launchpad.net/bugs/1829563

If a bcache device is configured to writeback mode, current code does not
handle write I/O errors on backing devices properly.

In writeback mode, a write request is first written to the cache device and
later flushed to the backing device. If I/O fails when writing from the
cache device to the backing device, the bcache code just ignores the error
and the upper layer code is NOT notified that the backing device is broken.

This patch handles backing device failure the same way cache device
failure is handled:
- Add an error counter 'io_errors' and an error limit 'error_limit' to
  struct cached_dev. Also add 'io_disable' to struct cached_dev to disable
  I/O on the problematic backing device.
- When an I/O error happens on the backing device, increase the io_errors
  counter. If io_errors reaches error_limit, set cached_dev->io_disable to
  true and stop the bcache device.

As a result, if the backing device is broken or disconnected and its I/O
errors reach the error limit, the backing device is disabled and the
associated bcache device is removed from the system.

Changelog:
v2: remove "bcache: " prefix in pr_err(), and use correct name string to
    print out bcache device gendisk name.
v1: this patch is newly added in the v2 patch set.

Signed-off-by: Coly Li
Reviewed-by: Hannes Reinecke
Reviewed-by: Michael Lyle
Cc: Michael Lyle
Cc: Junhui Tang
Signed-off-by: Jens Axboe
(backported from commit c7b7bd07404c52d8b9c6fd2fe794052ac367a818)
[mfo: backport: bcache.h, hunk 3: refresh context line with the signature
 of bch_bbio_count_io_errors() due to lack of non-required upstream commit
 5138ac6748e3 ("bcache: fix misleading error message in
 bch_count_io_errors()")]
Signed-off-by: Mauricio Faria de Oliveira
---
 drivers/md/bcache/bcache.h  |  6 ++++++
 drivers/md/bcache/io.c      | 14 ++++++++++++++
 drivers/md/bcache/request.c | 14 ++++++++++++--
 drivers/md/bcache/super.c   | 21 +++++++++++++++++++++
 drivers/md/bcache/sysfs.c   | 15 ++++++++++++++-
 5 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index a195abfd3eae..567016ee7455 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -358,6 +358,7 @@ struct cached_dev {
 	unsigned		sequential_cutoff;
 	unsigned		readahead;
 
+	unsigned		io_disable:1;
 	unsigned		verify:1;
 	unsigned		bypass_torture_test:1;
 
@@ -379,6 +380,9 @@ struct cached_dev {
 	unsigned		writeback_rate_minimum;
 
 	enum stop_on_failure	stop_when_cache_set_failed;
+#define DEFAULT_CACHED_DEV_ERROR_LIMIT	64
+	atomic_t		io_errors;
+	unsigned		error_limit;
 };
 
 enum alloc_reserve {
@@ -894,6 +898,7 @@ static inline void wait_for_kthread_stop(void)
 
 /* Forward declarations */
 
+void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio);
 void bch_count_io_errors(struct cache *, blk_status_t, const char *);
 void bch_bbio_count_io_errors(struct cache_set *, struct bio *,
 			      blk_status_t, const char *);
@@ -921,6 +926,7 @@ int bch_bucket_alloc_set(struct cache_set *, unsigned,
 			 struct bkey *, int, bool);
 bool bch_alloc_sectors(struct cache_set *, struct bkey *, unsigned,
 		       unsigned, unsigned, bool);
+bool bch_cached_dev_error(struct cached_dev *dc);
 
 __printf(2, 3)
 bool bch_cache_set_error(struct cache_set *, const char *, ...);
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index c456095d2bbe..091c90b5d4d7 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -50,6 +50,20 @@ void bch_submit_bbio(struct bio *bio, struct cache_set *c,
 }
 
 /* IO errors */
+void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio)
+{
+	char buf[BDEVNAME_SIZE];
+	unsigned errors;
+
+	WARN_ONCE(!dc, "NULL pointer of struct cached_dev");
+
+	errors = atomic_add_return(1, &dc->io_errors);
+	if (errors < dc->error_limit)
+		pr_err("%s: IO error on backing device, unrecoverable",
+			bio_devname(bio, buf));
+	else
+		bch_cached_dev_error(dc);
+}
 
 void bch_count_io_errors(struct cache *ca, blk_status_t error, const char *m)
 {
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 80761d03fc00..3b13b3a714d9 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -637,6 +637,8 @@ static void backing_request_endio(struct bio *bio)
 
 	if (bio->bi_status) {
 		struct search *s = container_of(cl, struct search, cl);
+		struct cached_dev *dc = container_of(s->d,
+						     struct cached_dev, disk);
 		/*
 		 * If a bio has REQ_PREFLUSH for writeback mode, it is
 		 * speically assembled in cached_dev_write() for a non-zero
@@ -657,6 +659,7 @@ static void backing_request_endio(struct bio *bio)
 		}
 		s->recoverable = false;
 		/* should count I/O error for backing device here */
+		bch_count_backing_io_errors(dc, bio);
 	}
 
 	bio_put(bio);
@@ -1065,8 +1068,14 @@ static void detached_dev_end_io(struct bio *bio)
 			    bio_data_dir(bio),
 			    &ddip->d->disk->part0, ddip->start_time);
 
-	kfree(ddip);
+	if (bio->bi_status) {
+		struct cached_dev *dc = container_of(ddip->d,
+						     struct cached_dev, disk);
+		/* should count I/O error for backing device here */
+		bch_count_backing_io_errors(dc, bio);
+	}
 
+	kfree(ddip);
 	bio->bi_end_io(bio);
 }
 
@@ -1105,7 +1114,8 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
 	struct cached_dev *dc = container_of(d, struct cached_dev, disk);
 	int rw = bio_data_dir(bio);
 
-	if (unlikely(d->c && test_bit(CACHE_SET_IO_DISABLE, &d->c->flags))) {
+	if (unlikely((d->c && test_bit(CACHE_SET_IO_DISABLE, &d->c->flags)) ||
+		     dc->io_disable)) {
 		bio->bi_status = BLK_STS_IOERR;
 		bio_endio(bio);
 		return BLK_QC_T_NONE;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa0d98f21b14..158096ffe810 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1210,6 +1210,9 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 		max(dc->disk.disk->queue->backing_dev_info->ra_pages,
 		    q->backing_dev_info->ra_pages);
 
+	atomic_set(&dc->io_errors, 0);
+	dc->io_disable = false;
+	dc->error_limit = DEFAULT_CACHED_DEV_ERROR_LIMIT;
 	/* default to auto */
 	dc->stop_when_cache_set_failed = BCH_CACHED_DEV_STOP_AUTO;
 
@@ -1364,6 +1367,24 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
 	return flash_dev_run(c, u);
 }
 
+bool bch_cached_dev_error(struct cached_dev *dc)
+{
+	char name[BDEVNAME_SIZE];
+
+	if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags))
+		return false;
+
+	dc->io_disable = true;
+	/* make others know io_disable is true earlier */
+	smp_mb();
+
+	pr_err("stop %s: too many IO errors on backing device %s\n",
+		dc->disk.disk->disk_name, bdevname(dc->bdev, name));
+
+	bcache_device_stop(&dc->disk);
+	return true;
+}
+
 /* Cache set */
 
 __printf(2, 3)
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 48198b47df14..746f7eccc48e 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -138,7 +138,9 @@ SHOW(__bch_cached_dev)
 	var_print(writeback_delay);
 	var_print(writeback_percent);
 	sysfs_hprint(writeback_rate,	dc->writeback_rate.rate << 9);
-
+	sysfs_hprint(io_errors,		atomic_read(&dc->io_errors));
+	sysfs_printf(io_error_limit,	"%i", dc->error_limit);
+	sysfs_printf(io_disable,	"%i", dc->io_disable);
 	var_print(writeback_rate_update_seconds);
 	var_print(writeback_rate_i_term_inverse);
 	var_print(writeback_rate_p_term_inverse);
@@ -229,6 +231,14 @@ STORE(__cached_dev)
 	d_strtoul(writeback_rate_i_term_inverse);
 	d_strtoul_nonzero(writeback_rate_p_term_inverse);
 
+	sysfs_strtoul_clamp(io_error_limit, dc->error_limit, 0, INT_MAX);
+
+	if (attr == &sysfs_io_disable) {
+		int v = strtoul_or_return(buf);
+
+		dc->io_disable = v ? 1 : 0;
+	}
+
 	d_strtoi_h(sequential_cutoff);
 	d_strtoi_h(readahead);
 
@@ -349,6 +359,9 @@ static struct attribute *bch_cached_dev_files[] = {
 	&sysfs_writeback_rate_i_term_inverse,
 	&sysfs_writeback_rate_p_term_inverse,
 	&sysfs_writeback_rate_debug,
+	&sysfs_errors,
+	&sysfs_io_error_limit,
+	&sysfs_io_disable,
 	&sysfs_dirty_data,
 	&sysfs_stripe_size,
 	&sysfs_partial_stripes_expensive,
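
For reviewers who want to reason about the threshold behaviour outside the
kernel, the counting logic of bch_count_backing_io_errors() and
bch_cached_dev_error() can be modelled roughly as below. This is a minimal
userspace sketch, not kernel code: it assumes C11 atomics in place of
atomic_t and smp_mb(), and the names struct backing_dev_model, model_init(),
model_count_io_error() and model_submit_allowed() are hypothetical and do
not exist in bcache.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the three fields this patch adds
 * to struct cached_dev (io_errors, error_limit, io_disable).
 */
struct backing_dev_model {
	atomic_uint io_errors;     /* errors seen so far */
	unsigned int error_limit;  /* threshold; the patch defaults it to 64 */
	atomic_bool io_disable;    /* set once the threshold is reached */
};

/* Loosely mirrors the initialisation added to cached_dev_init(). */
static void model_init(struct backing_dev_model *m, unsigned int error_limit)
{
	assert(m != NULL);
	atomic_init(&m->io_errors, 0);
	atomic_init(&m->io_disable, false);
	m->error_limit = error_limit;
}

/*
 * Count one backing-device I/O error. Returns true when the count has
 * reached the limit and I/O has been disabled (the point at which the
 * kernel code would call bcache_device_stop()).
 */
static bool model_count_io_error(struct backing_dev_model *m)
{
	unsigned int errors;

	assert(m != NULL);
	/* atomic_fetch_add() returns the old value, so add 1 for the new one */
	errors = atomic_fetch_add(&m->io_errors, 1) + 1;
	if (errors < m->error_limit)
		return false;

	/* sequentially consistent store, standing in for the flag + smp_mb() */
	atomic_store(&m->io_disable, true);
	return true;
}

/* Models the added check at the top of cached_dev_make_request(). */
static bool model_submit_allowed(struct backing_dev_model *m)
{
	assert(m != NULL);
	return !atomic_load(&m->io_disable);
}
```

With error_limit set to 3, the first two calls to model_count_io_error()
return false and the third returns true; after that model_submit_allowed()
returns false, which corresponds to cached_dev_make_request() completing
new bios with BLK_STS_IOERR once dc->io_disable is set.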