From patchwork Wed Jul 20 03:12:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Ruffell X-Patchwork-Id: 1658321 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=JdHpDWmi; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Lngjr63KKz9s2R for ; Wed, 20 Jul 2022 13:12:32 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1oE08Q-000863-1c; Wed, 20 Jul 2022 03:12:26 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1oE08K-00084F-ML for kernel-team@lists.ubuntu.com; Wed, 20 Jul 2022 03:12:20 +0000 Received: from mail-pf1-f197.google.com (mail-pf1-f197.google.com [209.85.210.197]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 757F43F0C8 for ; Wed, 20 Jul 2022 03:12:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=canonical.com; s=20210705; t=1658286740; bh=/+VxWX74pvDdsCkz5PBL23bSjZBhte/xupwXQd8nXdM=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JdHpDWmiQ6vlNCxsPWTTkE6N4aE+PQ4s1DuQ5wJLGCqoLa86pjeJYvoV8H0mv6EkK tMKhT6NE4xHjaGa2PaP2raah9JxO8VbQ7jWwHSJ5ijhSwrV7zMVCqHCm9KMrA+3EMW ZFTrYlf059rxVjoog4RWN0ChxnH45w4rRrx7JilTJznSMVYGUUtTyzjHfyngrnY/bI I+ryd0Jti2zIngUo8xbBnIihANSNUzQ23oGrNFsDXw/Us7f5U41k5PbGsS/afc4aRm 9OYtEMrCjFNqnv4kH8rStDtP2b0kLwNhmBScmBR0ftyiwNqHwhnu4hU2q767GxJIyz sTwPRBNjXSzbA== Received: by mail-pf1-f197.google.com with SMTP id d18-20020aa78692000000b0052abaa9a6bbso5215921pfo.2 for ; Tue, 19 Jul 2022 20:12:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/+VxWX74pvDdsCkz5PBL23bSjZBhte/xupwXQd8nXdM=; b=M0pSIKQzm5WWwPltRLsgug1svox2/qYS+kF4C2fteTpll1PqS6RLyOLlP25H7O1u21 cdSViA/Eqc8FwZCdjhtNWIp1Lso01dZtVKGsCIFPtOszy58GqdYsHNv/FdcDdZXX4ltf koN853PgbCD55gefVrFPUcK8MmGyZ/aLXop+wb3z8GaHCH4H+31TMvBSqm6oPESigKnw zn1//pXeUiP2liK40Z36W93lAhhGO01s6AkQwNyP0a+IFs2WRTYL0ZYclQoodnLKKeZm 2UWwKzu1GvxP0oF78opxoeNriEsP6TiyRguVdEv6i9ylsz5hZSxICmG/oMeRBz/85vqR p66Q== X-Gm-Message-State: AJIora9BiMwpfR5++KBc588JrzCRHN4ziEJYwNkyvwRv9xY6tQQ0rzZG BBse0jAosF/ZEVfmid98WhSls5qJKLF4j8kOONSusKPr1dbIhMROpL/kAfh6GQ4SnhHyH1YRSXb H2TIRWunL6wZdt85W6rYX26W2rq1VSuCfdzfn8Ib6Xg== X-Received: by 2002:a05:6a00:a93:b0:528:77d6:f660 with SMTP id b19-20020a056a000a9300b0052877d6f660mr36836401pfl.50.1658286739122; Tue, 19 Jul 2022 20:12:19 -0700 (PDT) X-Google-Smtp-Source: AGRyM1umIUSMBkJ8+NuQrDLGMk4xfA0tA02CHg+pIZeVpLs7eYRt6VKE1JdH9Kcvd8CS7Ck2P9wmkA== X-Received: by 2002:a05:6a00:a93:b0:528:77d6:f660 with SMTP id b19-20020a056a000a9300b0052877d6f660mr36836382pfl.50.1658286738790; Tue, 19 Jul 2022 20:12:18 -0700 (PDT) Received: from desktop.. (125-239-70-54-fibre.sparkbb.co.nz. 
[125.239.70.54]) by smtp.gmail.com with ESMTPSA id i2-20020a17090ac40200b001efbc3ad105sm331812pjt.54.2022.07.19.20.12.17 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Jul 2022 20:12:18 -0700 (PDT) From: Matthew Ruffell To: kernel-team@lists.ubuntu.com Subject: [SRU][F][PATCH V2 1/6] blk-mq: blk-mq: provide forced completion method Date: Wed, 20 Jul 2022 15:12:05 +1200 Message-Id: <20220720031210.17801-2-matthew.ruffell@canonical.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com> References: <20220720031210.17801-1-matthew.ruffell@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Keith Busch BugLink: https://bugs.launchpad.net/bugs/1896350 Drivers may need to bypass error injection for error recovery. Rename __blk_mq_complete_request() to blk_mq_force_complete_rq() and export that function so drivers may skip potential fake timeouts after they've reclaimed lost requests. Signed-off-by: Keith Busch Reviewed-by: Daniel Wagner Signed-off-by: Jens Axboe (cherry picked from commit 7b11eab041dacfeaaa6d27d9183b247a995bc16d) Signed-off-by: Matthew Ruffell --- block/blk-mq.c | 15 +++++++++++++-- include/linux/blk-mq.h | 1 + 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 84798d09ca46..82e93cd9f60d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -579,7 +579,17 @@ static void __blk_mq_complete_request_remote(void *data) q->mq_ops->complete(rq); } -static void __blk_mq_complete_request(struct request *rq) +/** + * blk_mq_force_complete_rq() - Force complete the request, bypassing any error + * injection that could drop the completion. 
+ * @rq: Request to be force completed + * + * Drivers should use blk_mq_complete_request() to complete requests in their + * normal IO path. For timeout error recovery, drivers may call this forced + * completion routine after they've reclaimed timed out requests to bypass + * potentially subsequent fake timeouts. + */ +void blk_mq_force_complete_rq(struct request *rq) { struct blk_mq_ctx *ctx = rq->mq_ctx; struct request_queue *q = rq->q; @@ -625,6 +635,7 @@ static void __blk_mq_complete_request(struct request *rq) } put_cpu(); } +EXPORT_SYMBOL_GPL(blk_mq_force_complete_rq); static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx) __releases(hctx->srcu) @@ -658,7 +669,7 @@ bool blk_mq_complete_request(struct request *rq) { if (unlikely(blk_should_fake_timeout(rq->q))) return false; - __blk_mq_complete_request(rq); + blk_mq_force_complete_rq(rq); return true; } EXPORT_SYMBOL(blk_mq_complete_request); diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 0bf056de5cc3..92b48a8e4af3 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -312,6 +312,7 @@ void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list); void blk_mq_kick_requeue_list(struct request_queue *q); void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs); bool blk_mq_complete_request(struct request *rq); +void blk_mq_force_complete_rq(struct request *rq); bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list, struct bio *bio, unsigned int nr_segs); bool blk_mq_queue_stopped(struct request_queue *q); From patchwork Wed Jul 20 03:12:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Ruffell X-Patchwork-Id: 1658322 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; 
unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=Mw1OGvmh; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Lngjv53ZKz9s2R for ; Wed, 20 Jul 2022 13:12:35 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1oE08S-00087m-8f; Wed, 20 Jul 2022 03:12:28 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1oE08N-00084q-1i for kernel-team@lists.ubuntu.com; Wed, 20 Jul 2022 03:12:23 +0000 Received: from mail-pj1-f69.google.com (mail-pj1-f69.google.com [209.85.216.69]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 9E96E3F0C8 for ; Wed, 20 Jul 2022 03:12:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1658286742; bh=oMW91NIc6z8z6YZipqwJz3MMIB99iOy7LHudVAe4h9o=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Mw1OGvmh7rFLqND+alnrbluvieyVEGiHWUbqCfcOz+wGhZO+YQ96Q2aK7Ukk3QSNY LUN4jg+xjhJ5LZX3NstwqzF18ImUiohYX0J049zkjYeEDID9oVNrExGcV+1n3p4xAL QF0lgkRSe0xGx2YsPTAHUOWDGc4CIW5lFPbhZoAisjQJFDcb/RWM3NUqX+q7dqY4+a 
7SIcnAxyT0p7ABUhC2hBVJgBlW4W2q+w093856s156kBWa6/mhEoFxa75KItGqJElh RrvqLfD+mDWYrbA1mQX28U8TyW3e3KqbbqyA+3hdwSLNaDpeC/czDA9JZ87mfQn67L TTXeODzNhm/7Q== Received: by mail-pj1-f69.google.com with SMTP id c14-20020a17090abf0e00b001f2096d876bso1118400pjs.4 for ; Tue, 19 Jul 2022 20:12:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oMW91NIc6z8z6YZipqwJz3MMIB99iOy7LHudVAe4h9o=; b=i/S5E2e/UxB0XDKngZPpxgCS6m8h1NKFHcRS23K9OOeJeAWgKn7BQeVKkINd3LjlwU g9gXm0DYetIsDd92L//8DPJb8FDABIApIrHl0yWZ92v2Lz+KAoZvo6/MNcgcy7it+iTG TlNRiQaKTeAh1r6GeG4KfPWf8mR2zv9x6KzMf5MV0zWaIsPa4DevLJ7IhqE55Tj9lYI6 173V2ZW7BaxU3qt3fKkhit1mAApRe9uktXV4ay7+YsVLCmysB/5ZicY4xu+s06xl2Oeb BGL8uce9Ckzr/kDZilxCb5ggp7GwxNa1H/Tul1sq8wqrMKHhZ467Iud3hwljD1hHts1l uYug== X-Gm-Message-State: AJIora8eHqucaTRm0ctrXEqmsYNuAfQb8exOWFPCsiL1w1B6qkgAl3VN tKQC0K+Tc37vVPNXsXogGxiPFB5QEeFJJYeZ2pE0kUoQuCaiIRArtb7lsOl/yNZTByOK2COprhV 8RE3OEPvf2T/74Z7TwX5FeX9gRK3M9GfArIzUyp3PEA== X-Received: by 2002:a63:fd14:0:b0:41a:20e8:c1e2 with SMTP id d20-20020a63fd14000000b0041a20e8c1e2mr12323492pgh.286.1658286741020; Tue, 19 Jul 2022 20:12:21 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tkEe0d7ZivteepDs+vowreZsxhtXxzYU2egd3Qw5wqyMsUIxd5yBJJifeyjnGGqBU7/tOuSQ== X-Received: by 2002:a63:fd14:0:b0:41a:20e8:c1e2 with SMTP id d20-20020a63fd14000000b0041a20e8c1e2mr12323476pgh.286.1658286740657; Tue, 19 Jul 2022 20:12:20 -0700 (PDT) Received: from desktop.. (125-239-70-54-fibre.sparkbb.co.nz. 
[125.239.70.54]) by smtp.gmail.com with ESMTPSA id i2-20020a17090ac40200b001efbc3ad105sm331812pjt.54.2022.07.19.20.12.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Jul 2022 20:12:20 -0700 (PDT) From: Matthew Ruffell To: kernel-team@lists.ubuntu.com Subject: [SRU][F][PATCH V2 2/6] blk-mq: move failure injection out of blk_mq_complete_request Date: Wed, 20 Jul 2022 15:12:06 +1200 Message-Id: <20220720031210.17801-3-matthew.ruffell@canonical.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com> References: <20220720031210.17801-1-matthew.ruffell@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Christoph Hellwig BugLink: https://bugs.launchpad.net/bugs/1896350 Move the call to blk_should_fake_timeout out of blk_mq_complete_request and into the drivers, skipping call sites that are obvious error handlers, and remove the now superfluous blk_mq_force_complete_rq helper. This ensures we don't keep injecting errors into completions that just terminate the Linux request after the hardware has been reset or the command has been aborted. 
Reviewed-by: Daniel Wagner Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe (backported from commit 15f73f5b3e5958f2d169fe13c420eeeeae07bbf2) [mruffell: extensive context adjustments, no major changes] Signed-off-by: Matthew Ruffell --- block/blk-mq.c | 34 +++++++------------------------ block/blk-timeout.c | 6 ++---- block/blk.h | 8 -------- block/bsg-lib.c | 5 ++++- drivers/block/loop.c | 6 ++++-- drivers/block/mtip32xx/mtip32xx.c | 3 ++- drivers/block/nbd.c | 5 ++++- drivers/block/null_blk_main.c | 3 ++- drivers/block/skd_main.c | 9 +++++--- drivers/block/virtio_blk.c | 3 ++- drivers/block/xen-blkfront.c | 3 ++- drivers/md/dm-rq.c | 3 ++- drivers/mmc/core/block.c | 6 +++--- drivers/nvme/host/nvme.h | 3 ++- drivers/s390/block/dasd.c | 2 +- drivers/s390/block/scm_blk.c | 3 ++- drivers/scsi/scsi_lib.c | 12 +++-------- include/linux/blk-mq.h | 12 +++++++++-- 18 files changed, 58 insertions(+), 68 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 82e93cd9f60d..844ca4a61247 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -580,16 +580,13 @@ static void __blk_mq_complete_request_remote(void *data) } /** - * blk_mq_force_complete_rq() - Force complete the request, bypassing any error - * injection that could drop the completion. - * @rq: Request to be force completed + * blk_mq_complete_request - end I/O on a request + * @rq: the request being processed * - * Drivers should use blk_mq_complete_request() to complete requests in their - * normal IO path. For timeout error recovery, drivers may call this forced - * completion routine after they've reclaimed timed out requests to bypass - * potentially subsequent fake timeouts. - */ -void blk_mq_force_complete_rq(struct request *rq) + * Description: + * Complete a request by scheduling the ->complete_rq operation. 
+ **/ +void blk_mq_complete_request(struct request *rq) { struct blk_mq_ctx *ctx = rq->mq_ctx; struct request_queue *q = rq->q; @@ -635,7 +632,7 @@ void blk_mq_force_complete_rq(struct request *rq) } put_cpu(); } -EXPORT_SYMBOL_GPL(blk_mq_force_complete_rq); +EXPORT_SYMBOL(blk_mq_complete_request); static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx) __releases(hctx->srcu) @@ -657,23 +654,6 @@ static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx) *srcu_idx = srcu_read_lock(hctx->srcu); } -/** - * blk_mq_complete_request - end I/O on a request - * @rq: the request being processed - * - * Description: - * Ends all I/O on a request. It does not handle partial completions. - * The actual completion happens out-of-order, through a IPI handler. - **/ -bool blk_mq_complete_request(struct request *rq) -{ - if (unlikely(blk_should_fake_timeout(rq->q))) - return false; - blk_mq_force_complete_rq(rq); - return true; -} -EXPORT_SYMBOL(blk_mq_complete_request); - int blk_mq_request_started(struct request *rq) { return blk_mq_rq_state(rq) != MQ_RQ_IDLE; diff --git a/block/blk-timeout.c b/block/blk-timeout.c index 8aa68fae96ad..3a1ac6434758 100644 --- a/block/blk-timeout.c +++ b/block/blk-timeout.c @@ -20,13 +20,11 @@ static int __init setup_fail_io_timeout(char *str) } __setup("fail_io_timeout=", setup_fail_io_timeout); -int blk_should_fake_timeout(struct request_queue *q) +bool __blk_should_fake_timeout(struct request_queue *q) { - if (!test_bit(QUEUE_FLAG_FAIL_IO, &q->queue_flags)) - return 0; - return should_fail(&fail_io_timeout, 1); } +EXPORT_SYMBOL_GPL(__blk_should_fake_timeout); static int __init fail_io_timeout_debugfs(void) { diff --git a/block/blk.h b/block/blk.h index ee3d5664d962..9c39d4efa4b9 100644 --- a/block/blk.h +++ b/block/blk.h @@ -214,17 +214,9 @@ static inline void elevator_exit(struct request_queue *q, struct hd_struct *__disk_get_part(struct gendisk *disk, int partno); -#ifdef CONFIG_FAIL_IO_TIMEOUT -int 
blk_should_fake_timeout(struct request_queue *); ssize_t part_timeout_show(struct device *, struct device_attribute *, char *); ssize_t part_timeout_store(struct device *, struct device_attribute *, const char *, size_t); -#else -static inline int blk_should_fake_timeout(struct request_queue *q) -{ - return 0; -} -#endif void __blk_queue_split(struct request_queue *q, struct bio **bio, unsigned int *nr_segs); diff --git a/block/bsg-lib.c b/block/bsg-lib.c index 6cbb7926534c..fb7b347f8010 100644 --- a/block/bsg-lib.c +++ b/block/bsg-lib.c @@ -181,9 +181,12 @@ EXPORT_SYMBOL_GPL(bsg_job_get); void bsg_job_done(struct bsg_job *job, int result, unsigned int reply_payload_rcv_len) { + struct request *rq = blk_mq_rq_from_pdu(job); + job->result = result; job->reply_payload_rcv_len = reply_payload_rcv_len; - blk_mq_complete_request(blk_mq_rq_from_pdu(job)); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } EXPORT_SYMBOL_GPL(bsg_job_done); diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 67014f84fa71..5ab7985fff8a 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -497,7 +497,8 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd) return; kfree(cmd->bvec); cmd->bvec = NULL; - blk_mq_complete_request(rq); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2) @@ -2027,7 +2028,8 @@ static void loop_handle_cmd(struct loop_cmd *cmd) /* complete non-aio request */ if (!cmd->use_aio || ret) { cmd->ret = ret ? 
-EIO : 0; - blk_mq_complete_request(rq); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } } diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 964f78cfffa0..2a7be513c5da 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -492,7 +492,8 @@ static void mtip_complete_command(struct mtip_cmd *cmd, blk_status_t status) struct request *req = blk_mq_rq_from_pdu(cmd); cmd->status = status; - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); } /* diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 9361539258af..15a8b499b738 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -773,6 +773,7 @@ static void recv_work(struct work_struct *work) struct nbd_device *nbd = args->nbd; struct nbd_config *config = nbd->config; struct nbd_cmd *cmd; + struct request *rq; while (1) { cmd = nbd_read_stat(nbd, args->index); @@ -785,7 +786,9 @@ static void recv_work(struct work_struct *work) break; } - blk_mq_complete_request(blk_mq_rq_from_pdu(cmd)); + rq = blk_mq_rq_from_pdu(cmd); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } nbd_config_put(nbd); atomic_dec(&config->recv_threads); diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c index 13eae973eaea..6938e00e08ea 100644 --- a/drivers/block/null_blk_main.c +++ b/drivers/block/null_blk_main.c @@ -1190,7 +1190,8 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd) case NULL_IRQ_SOFTIRQ: switch (cmd->nq->dev->queue_mode) { case NULL_Q_MQ: - blk_mq_complete_request(cmd->rq); + if (likely(!blk_should_fake_timeout(cmd->rq->q))) + blk_mq_complete_request(cmd->rq); break; case NULL_Q_BIO: /* diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c index 51569c199a6c..3a476dc1d14f 100644 --- a/drivers/block/skd_main.c +++ b/drivers/block/skd_main.c @@ -1417,7 +1417,8 @@ static void 
skd_resolve_req_exception(struct skd_device *skdev, case SKD_CHECK_STATUS_REPORT_GOOD: case SKD_CHECK_STATUS_REPORT_SMART_ALERT: skreq->status = BLK_STS_OK; - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); break; case SKD_CHECK_STATUS_BUSY_IMMINENT: @@ -1440,7 +1441,8 @@ static void skd_resolve_req_exception(struct skd_device *skdev, case SKD_CHECK_STATUS_REPORT_ERROR: default: skreq->status = BLK_STS_IOERR; - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); break; } } @@ -1560,7 +1562,8 @@ static int skd_isr_completion_posted(struct skd_device *skdev, */ if (likely(cmp_status == SAM_STAT_GOOD)) { skreq->status = BLK_STS_OK; - blk_mq_complete_request(rq); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } else { skd_resolve_req_exception(skdev, skreq, rq); } diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 2a5cd502feae..15bf7dceddf6 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -272,7 +272,8 @@ static void virtblk_done(struct virtqueue *vq) while ((vbr = virtqueue_get_buf(vblk->vqs[qid].vq, &len)) != NULL) { struct request *req = blk_mq_rq_from_pdu(vbr); - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); req_done = true; } if (unlikely(virtqueue_is_broken(vq))) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 3731066f2c1c..a1d1cd25a10f 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -1699,7 +1699,8 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) BUG(); } - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); } rinfo->ring.rsp_cons = i; diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 6bc61927d320..fb2add32d96e 100644 --- a/drivers/md/dm-rq.c +++ 
b/drivers/md/dm-rq.c @@ -285,7 +285,8 @@ static void dm_complete_request(struct request *rq, blk_status_t error) struct dm_rq_target_io *tio = tio_from_request(rq); tio->error = error; - blk_mq_complete_request(rq); + if (likely(!blk_should_fake_timeout(rq->q))) + blk_mq_complete_request(rq); } /* diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index 362ad361d586..4f885a050301 100644 --- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -1512,7 +1512,7 @@ static void mmc_blk_cqe_req_done(struct mmc_request *mrq) */ if (mq->in_recovery) mmc_blk_cqe_complete_rq(mq, req); - else + else if (likely(!blk_should_fake_timeout(req->q))) blk_mq_complete_request(req); } @@ -1946,7 +1946,7 @@ void mmc_blk_mq_complete(struct request *req) if (mq->use_cqe) mmc_blk_cqe_complete_rq(mq, req); - else + else if (likely(!blk_should_fake_timeout(req->q))) mmc_blk_mq_complete_rq(mq, req); } @@ -1998,7 +1998,7 @@ static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req) */ if (mq->in_recovery) mmc_blk_mq_complete_rq(mq, req); - else + else if (likely(!blk_should_fake_timeout(req->q))) blk_mq_complete_request(req); mmc_blk_mq_dec_in_flight(mq, req); diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 2df90d4355b9..14bfdc5d8782 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -453,7 +453,8 @@ static inline void nvme_end_request(struct request *req, __le16 status, rq->result = result; /* inject error when permitted by fault injection framework */ nvme_should_fail(req); - blk_mq_complete_request(req); + if (likely(!blk_should_fake_timeout(req->q))) + blk_mq_complete_request(req); } static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl) diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c index e0570cd0e520..f4edfe383e9d 100644 --- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -2814,7 +2814,7 @@ static void __dasd_cleanup_cqr(struct dasd_ccw_req *cqr) if (proc_bytes) { 
blk_update_request(req, BLK_STS_OK, proc_bytes); blk_mq_requeue_request(req, true); - } else { + } else if (likely(!blk_should_fake_timeout(req->q))) { blk_mq_complete_request(req); } } diff --git a/drivers/s390/block/scm_blk.c b/drivers/s390/block/scm_blk.c index e01889394c84..a4f6f2e62b1d 100644 --- a/drivers/s390/block/scm_blk.c +++ b/drivers/s390/block/scm_blk.c @@ -256,7 +256,8 @@ static void scm_request_finish(struct scm_request *scmrq) for (i = 0; i < nr_requests_per_io && scmrq->request[i]; i++) { error = blk_mq_rq_to_pdu(scmrq->request[i]); *error = scmrq->error; - blk_mq_complete_request(scmrq->request[i]); + if (likely(!blk_should_fake_timeout(scmrq->request[i]->q))) + blk_mq_complete_request(scmrq->request[i]); } atomic_dec(&bdev->queued_reqs); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 8e6d7ba95df1..6ebb3cf52578 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1618,18 +1618,12 @@ static blk_status_t scsi_mq_prep_fn(struct request *req) static void scsi_mq_done(struct scsi_cmnd *cmd) { + if (unlikely(blk_should_fake_timeout(cmd->request->q))) + return; if (unlikely(test_and_set_bit(SCMD_STATE_COMPLETE, &cmd->state))) return; trace_scsi_dispatch_cmd_done(cmd); - - /* - * If the block layer didn't complete the request due to a timeout - * injection, scsi must clear its internal completed state so that the - * timeout handler will see it needs to escalate its own error - * recovery. 
- */ - if (unlikely(!blk_mq_complete_request(cmd->request))) - clear_bit(SCMD_STATE_COMPLETE, &cmd->state); + blk_mq_complete_request(cmd->request); } static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx) diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 92b48a8e4af3..0732dec55650 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -311,8 +311,7 @@ void __blk_mq_end_request(struct request *rq, blk_status_t error); void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list); void blk_mq_kick_requeue_list(struct request_queue *q); void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs); -bool blk_mq_complete_request(struct request *rq); -void blk_mq_force_complete_rq(struct request *rq); +void blk_mq_complete_request(struct request *rq); bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list, struct bio *bio, unsigned int nr_segs); bool blk_mq_queue_stopped(struct request_queue *q); @@ -344,6 +343,15 @@ void blk_mq_quiesce_queue_nowait(struct request_queue *q); unsigned int blk_mq_rq_cpu(struct request *rq); +bool __blk_should_fake_timeout(struct request_queue *q); +static inline bool blk_should_fake_timeout(struct request_queue *q) +{ + if (IS_ENABLED(CONFIG_FAIL_IO_TIMEOUT) && + test_bit(QUEUE_FLAG_FAIL_IO, &q->queue_flags)) + return __blk_should_fake_timeout(q); + return false; +} + /* * Driver command data is immediately after the request. So subtract request * size to get back to the original request, add request size to get the PDU. 
From patchwork Wed Jul 20 03:12:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Ruffell X-Patchwork-Id: 1658323 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=tMZI16nS; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Lngjy22CNz9s2R for ; Wed, 20 Jul 2022 13:12:38 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1oE08U-0008Ae-Gg; Wed, 20 Jul 2022 03:12:30 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1oE08O-00085K-FI for kernel-team@lists.ubuntu.com; Wed, 20 Jul 2022 03:12:24 +0000 Received: from mail-pl1-f198.google.com (mail-pl1-f198.google.com [209.85.214.198]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 3226D3F0C8 for ; Wed, 20 Jul 2022 03:12:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=canonical.com; s=20210705; t=1658286744; bh=n6aBxKJZy3eWCQlVuW7wS5BNXagElBfPu4J3FZgNt1Y=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=tMZI16nSbhQL2Q9QvfIxiPmNHw6dRkbCEMqfVY+K/pZzsP74RIivJw93SOmo2jb+d n/lKBxZy1/A8y0rZyCPOeE4kkZyfhbID76dJSHBtjlRmdqg74Ex3NAyKaEz4CLrN8D udov7fqSKUDK+FUJEyOYUgPCU1mGT4KYsD3H15eXWZenTkzSh2nJlH4tt17q/7O3IQ NmGzLANM6gPQgVf46JZ5I9bmGH2YKv5Y6MDAj4nA/QGSqIi7/KqjEayBMk28DOCzkU ueK8FwwVPUnRPJb84cRoNAnYUXHQcrepkyPWCeVl23JKMsqk7aHYQmEX8Aj1lj0Lok 6GteGr+X6Kiww== Received: by mail-pl1-f198.google.com with SMTP id a17-20020a170902ecd100b0016c012c4cf3so9743032plh.15 for ; Tue, 19 Jul 2022 20:12:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=n6aBxKJZy3eWCQlVuW7wS5BNXagElBfPu4J3FZgNt1Y=; b=ofYQuV0iESXEc3IiJJeNy3l2EF57wIni+mvJ0aem9mBdX/N6Xv4vsp4/4a2jG7TUil CKcorcUFMhJooUF3sH4p/EnYZxnf5fEHugBrFRLSPbhcYw4NcsG7aCu/XQ4naG2kILPr PeccNdgW5j0D0L2z1zEJnEJGUVkS+HNXXiMbeWnX2KMGKiqvi+L3Zx2sIQxs+Bfsuwfy xfzJdlh26finYvMru8xgMb0Wme4CTeLLSQ6762S7WJVVYeYqRm3okpZURuAPJwA3pbLG 8yajI1Spmf2dzaqRMHLA1q3891L+9MXpZelS+wFj1tJdztHk1PPgkxph1KX+XYo2DVtt Y3qw== X-Gm-Message-State: AJIora+EtTPxmSKcxDax3TSFJKJ+7WvNbS2qB0rsZzg710/4fjtIUdFS tES91uKTfhkuOFVuhB86IsR52RiN5gAAemCqWxTkjHjeGnN8uhtPY2TNY/ckAW9mv3O8tzR17zz QmjmIjJTOQKMWTduwYQtOSRA5/OAHK9CkhHPDyDk28A== X-Received: by 2002:a17:90b:1808:b0:1ef:b5cd:ad8b with SMTP id lw8-20020a17090b180800b001efb5cdad8bmr2957113pjb.18.1658286742826; Tue, 19 Jul 2022 20:12:22 -0700 (PDT) X-Google-Smtp-Source: AGRyM1strzBrzMl5daVZ0M0MNzXLZ0jOzh/tRi1i4HcU+JOKQUELaATeNQfWz+d/a3SkMx4my4m7og== X-Received: by 2002:a17:90b:1808:b0:1ef:b5cd:ad8b with SMTP id lw8-20020a17090b180800b001efb5cdad8bmr2957079pjb.18.1658286742478; Tue, 19 Jul 2022 20:12:22 -0700 (PDT) Received: from desktop.. (125-239-70-54-fibre.sparkbb.co.nz. 
[125.239.70.54]) by smtp.gmail.com with ESMTPSA id i2-20020a17090ac40200b001efbc3ad105sm331812pjt.54.2022.07.19.20.12.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Jul 2022 20:12:22 -0700 (PDT) From: Matthew Ruffell To: kernel-team@lists.ubuntu.com Subject: [SRU][F][PATCH V2 3/6] nbd: don't handle response without a corresponding request message Date: Wed, 20 Jul 2022 15:12:07 +1200 Message-Id: <20220720031210.17801-4-matthew.ruffell@canonical.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com> References: <20220720031210.17801-1-matthew.ruffell@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Yu Kuai BugLink: https://bugs.launchpad.net/bugs/1896350 While handling a response message from the server, nbd_read_stat() will try to get the request by tag, and then complete the request. However, this is problematic if nbd hasn't sent a corresponding request message:

t1                         t2
submit_bio
 nbd_queue_rq
  blk_mq_start_request
                           recv_work
                            nbd_read_stat
                             blk_mq_tag_to_rq
                            blk_mq_complete_request
  nbd_send_cmd

Thus add a new cmd flag 'NBD_CMD_INFLIGHT', which is set once nbd_send_cmd() succeeds and checked in nbd_read_stat(). Note that this patch does not fix the case where blk_mq_tag_to_rq() might return a freed request; that will be fixed in following patches. 

Signed-off-by: Yu Kuai
Reviewed-by: Ming Lei
Reviewed-by: Josef Bacik
Link: https://lore.kernel.org/r/20210916093350.1410403-2-yukuai3@huawei.com
Signed-off-by: Jens Axboe
(cherry picked from 4e6eef5dc25b528e08ac5b5f64f6ca9d9987241d)
Signed-off-by: Matthew Ruffell
---
 drivers/block/nbd.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 15a8b499b738..b960e29b0b57 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -122,6 +122,12 @@ struct nbd_device {
 };

 #define NBD_CMD_REQUEUED	1
+/*
+ * This flag will be set if nbd_queue_rq() succeed, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT	2

 struct nbd_cmd {
 	struct nbd_device *nbd;
@@ -388,6 +394,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;

+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -704,6 +711,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	cmd = blk_mq_rq_to_pdu(req);

 	mutex_lock(&cmd->lock);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+			tag, cmd->status, cmd->flags);
+		ret = -ENOENT;
+		goto out;
+	}
 	if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
 		dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
 			req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -805,6 +818,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;

 	mutex_lock(&cmd->lock);
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);

@@ -941,7 +955,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	 * returns EAGAIN can be retried on a different socket.
 	 */
 	ret = nbd_send_cmd(nbd, cmd, index);
-	if (ret == -EAGAIN) {
+	/*
+	 * Access to this flag is protected by cmd->lock, thus it's safe to set
+	 * the flag after nbd_send_cmd() succeed to send request to server.
+	 */
+	if (!ret)
+		__set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	else if (ret == -EAGAIN) {
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Request send failed, requeueing\n");
 		nbd_mark_nsock_dead(nbd, nsock, 1);

From patchwork Wed Jul 20 03:12:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Ruffell
X-Patchwork-Id: 1658324
From: Matthew Ruffell
To: kernel-team@lists.ubuntu.com
Subject: [SRU][F][PATCH V2 4/6] nbd: make sure request completion won't concurrent
Date: Wed, 20 Jul 2022 15:12:08 +1200
Message-Id: <20220720031210.17801-5-matthew.ruffell@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com>
References: <20220720031210.17801-1-matthew.ruffell@canonical.com>

From: Yu Kuai

BugLink: https://bugs.launchpad.net/bugs/1896350

Commit cddce0116058 ("nbd: Aovid double completion of a request") tried
to fix the case where nbd_clear_que() and recv_work() can complete a
request concurrently.
However, the problem still exists:

t1                    t2                     t3

nbd_disconnect_and_put
 flush_workqueue
                      recv_work
                       blk_mq_complete_request
                        blk_mq_complete_request_remote -> this is true
                         WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
                          blk_mq_raise_softirq
                                             blk_done_softirq
                                              blk_complete_reqs
                                               nbd_complete_rq
                                                blk_mq_end_request
                                                 blk_mq_free_request
                                                  WRITE_ONCE(rq->state, MQ_RQ_IDLE)
  nbd_clear_que
   blk_mq_tagset_busy_iter
    nbd_clear_req
                                                   __blk_mq_free_request
                                                    blk_mq_put_tag
     blk_mq_complete_request -> complete again

There are three places where a request can be completed in nbd:
recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they all
hold cmd->lock before completing the request, it's easy to avoid the
problem by setting and checking a cmd flag.

Signed-off-by: Yu Kuai
Reviewed-by: Ming Lei
Reviewed-by: Josef Bacik
Link: https://lore.kernel.org/r/20210916093350.1410403-3-yukuai3@huawei.com
Signed-off-by: Jens Axboe
(cherry picked from 07175cb1baf4c51051b1fbd391097e349f9a02a9)
Signed-off-by: Matthew Ruffell
---
 drivers/block/nbd.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b960e29b0b57..01d030e9f301 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -394,7 +394,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;

-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return BLK_EH_DONE;
+	}
+
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
 		mutex_unlock(&cmd->lock);
@@ -818,7 +822,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
 		return true;

 	mutex_lock(&cmd->lock);
-	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+		mutex_unlock(&cmd->lock);
+		return true;
+	}
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);


From patchwork Wed Jul 20 03:12:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Ruffell
X-Patchwork-Id: 1658325
From: Matthew Ruffell
To: kernel-team@lists.ubuntu.com
Subject: [SRU][F][PATCH V2 5/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
Date: Wed, 20 Jul 2022 15:12:09 +1200
Message-Id: <20220720031210.17801-6-matthew.ruffell@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com>
References: <20220720031210.17801-1-matthew.ruffell@canonical.com>

From: Yu Kuai

BugLink: https://bugs.launchpad.net/bugs/1896350

Otherwise I/O will hang, because a request will only be completed if
the cmd has the flag 'NBD_CMD_INFLIGHT'.
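
For illustration only (this is not part of the patch): the distinction
between peeking at the flag and claiming it can be sketched in userspace
C. The names toy_cmd, toy_lookup() and toy_complete() are hypothetical; a
pthread mutex stands in for cmd->lock, a bool stands in for the
NBD_CMD_INFLIGHT bit, and a counter stands in for the actual completion,
which the driver performs with blk_mq_complete_request() after dropping
the lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nbd_cmd; not the kernel structure. */
struct toy_cmd {
	pthread_mutex_t lock;	/* plays the role of cmd->lock */
	bool inflight;		/* plays the role of NBD_CMD_INFLIGHT */
	int completions;	/* counts how often the request was completed */
};

/* Lookup step: only peek at the flag, in the spirit of test_bit(). */
static bool toy_lookup(struct toy_cmd *cmd)
{
	bool found;

	pthread_mutex_lock(&cmd->lock);
	found = cmd->inflight;
	pthread_mutex_unlock(&cmd->lock);
	return found;
}

/* Completion step: claim the flag, in the spirit of __test_and_clear_bit(). */
static bool toy_complete(struct toy_cmd *cmd)
{
	bool claimed;

	pthread_mutex_lock(&cmd->lock);
	claimed = cmd->inflight;
	cmd->inflight = false;
	if (claimed)
		cmd->completions++;	/* only the claimant completes */
	pthread_mutex_unlock(&cmd->lock);
	return claimed;
}
```

If toy_lookup() cleared the flag itself, the later toy_complete() would
find nothing to claim and the request would never be completed, which is
the hang described above; because only toy_complete() clears the flag, a
second caller sees it already cleared and does not complete again.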

Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
Signed-off-by: Yu Kuai
Link: https://lore.kernel.org/r/20220521073749.3146892-4-yukuai3@huawei.com
Signed-off-by: Jens Axboe
(backported from 2895f1831e911ca87d4efdf43e35eb72a0c7e66e)
[mruffell: context adjustment removing percpu_ref_put in recv_work()]
Signed-off-by: Matthew Ruffell
---
 drivers/block/nbd.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 01d030e9f301..6b165101f84b 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -394,13 +394,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	if (!mutex_trylock(&cmd->lock))
 		return BLK_EH_RESET_TIMER;

-	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
 		mutex_unlock(&cmd->lock);
 		return BLK_EH_DONE;
 	}

 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
+		__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 		mutex_unlock(&cmd->lock);
 		goto done;
 	}
@@ -456,6 +457,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
 	set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
 	cmd->status = BLK_STS_IOERR;
+	__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
 	mutex_unlock(&cmd->lock);
 	sock_shutdown(nbd);
 	nbd_config_put(nbd);
@@ -715,7 +717,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 	cmd = blk_mq_rq_to_pdu(req);

 	mutex_lock(&cmd->lock);
-	if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+	if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
 		dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
 			tag, cmd->status, cmd->flags);
 		ret = -ENOENT;
@@ -804,8 +806,16 @@ static void recv_work(struct work_struct *work)
 		}

 		rq = blk_mq_rq_from_pdu(cmd);
-		if (likely(!blk_should_fake_timeout(rq->q)))
-			blk_mq_complete_request(rq);
+		if (likely(!blk_should_fake_timeout(rq->q))) {
+			bool complete;
+
+			mutex_lock(&cmd->lock);
+			complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
+							&cmd->flags);
+			mutex_unlock(&cmd->lock);
+			if (complete)
+				blk_mq_complete_request(rq);
+		}
 	}
 	nbd_config_put(nbd);
 	atomic_dec(&config->recv_threads);

From patchwork Wed Jul 20 03:12:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Ruffell
X-Patchwork-Id: 1658326
From: Matthew Ruffell
To: kernel-team@lists.ubuntu.com
Subject: [SRU][F][PATCH V2 6/6] nbd: fix io hung while disconnecting device
Date: Wed, 20 Jul 2022 15:12:10 +1200
Message-Id: <20220720031210.17801-7-matthew.ruffell@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220720031210.17801-1-matthew.ruffell@canonical.com>
References: <20220720031210.17801-1-matthew.ruffell@canonical.com>

From: Yu Kuai

BugLink: https://bugs.launchpad.net/bugs/1896350

In our tests, "qemu-nbd" triggers an I/O hang:

INFO: task qemu-nbd:11445 blocked for more than 368 seconds.
 Not tainted 5.18.0-rc3-next-20220422-00003-g2176915513ca #884
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:qemu-nbd state:D stack: 0 pid:11445 ppid: 1 flags:0x00000000
Call Trace:
 __schedule+0x480/0x1050
 ? _raw_spin_lock_irqsave+0x3e/0xb0
 schedule+0x9c/0x1b0
 blk_mq_freeze_queue_wait+0x9d/0xf0
 ? ipi_rseq+0x70/0x70
 blk_mq_freeze_queue+0x2b/0x40
 nbd_add_socket+0x6b/0x270 [nbd]
 nbd_ioctl+0x383/0x510 [nbd]
 blkdev_ioctl+0x18e/0x3e0
 __x64_sys_ioctl+0xac/0x120
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fd8ff706577
RSP: 002b:00007fd8fcdfebf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000040000000 RCX: 00007fd8ff706577
RDX: 000000000000000d RSI: 000000000000ab00 RDI: 000000000000000f
RBP: 000000000000000f R08: 000000000000fbe8 R09: 000055fe497c62b0
R10: 00000002aff20000 R11: 0000000000000246 R12: 000000000000006d
R13: 0000000000000000 R14: 00007ffe82dc5e70 R15: 00007fd8fcdff9c0

"qemu-nbd -d" will call ioctl 'NBD_DISCONNECT' first; however, the
following message was found:

block nbd0: Send disconnect failed -32

which indicates that something is wrong with the server.

Then, "qemu-nbd -d" will call ioctl 'NBD_CLEAR_SOCK'; however, the ioctl
can't clear requests after commit 2516ab1543fd ("nbd: only clear the
queue on device teardown"). In the meantime, a request can't complete
through timeout because nbd_xmit_timeout() will always return
'BLK_EH_RESET_TIMER', which means such a request will never be completed
in this situation.

Now that the flag 'NBD_CMD_INFLIGHT' can make sure requests won't
complete multiple times, switch back to calling nbd_clear_sock() in
nbd_clear_sock_ioctl(), so that inflight requests can be cleared.

Signed-off-by: Yu Kuai
Reviewed-by: Josef Bacik
Link: https://lore.kernel.org/r/20220521073749.3146892-5-yukuai3@huawei.com
Signed-off-by: Jens Axboe
(cherry picked from commit 09dadb5985023e27d4740ebd17e6fea4640110e5)
Signed-off-by: Matthew Ruffell
---
 drivers/block/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 6b165101f84b..aff24ecb2898 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1380,7 +1380,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
 				 struct block_device *bdev)
 {
-	sock_shutdown(nbd);
+	nbd_clear_sock(nbd);
 	__invalidate_device(bdev, true);
 	nbd_bdev_reset(bdev);
 	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,