From patchwork Wed Nov 27 20:18:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcelo Henrique Cerri
X-Patchwork-Id: 1201799
From: Marcelo Henrique Cerri
To: kernel-team@lists.ubuntu.com
Subject: [xenial:linux-azure][PATCH 15/15] blk-mq: punt failed direct issue to dispatch list
Date: Wed, 27 Nov 2019 17:18:20 -0300
Message-Id: <20191127201820.32174-16-marcelo.cerri@canonical.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20191127201820.32174-1-marcelo.cerri@canonical.com>
References: <20191127201820.32174-1-marcelo.cerri@canonical.com>

From: Jens Axboe

BugLink: https://bugs.launchpad.net/bugs/1848739

After the direct dispatch corruption fix, we permanently disallow direct
dispatch of non read/write
requests. This works fine off the normal IO
path, as they will be retried like any other failed direct dispatch
request. But for the blk_insert_cloned_request() that only DM uses to
bypass the bottom level scheduler, we always first attempt direct
dispatch. For some types of requests, that's now a permanent failure,
and no amount of retrying will make that succeed. This results in a
livelock.

Instead of making special cases for what we can direct issue, and now
having to deal with DM solving the livelock while still retaining a BUSY
condition feedback loop, always just add a request that has been through
->queue_rq() to the hardware queue dispatch list. These are safe to use
as no merging can take place there. Additionally, if requests do have
prepped data from drivers, we aren't dependent on them not sharing space
in the request structure to safely add them to the IO scheduler lists.

This basically reverts ffe81d45322c and is based on a patch from Ming,
but with the list insert case covered as well.

Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue")
Cc: stable@vger.kernel.org
Suggested-by: Ming Lei
Reported-by: Bart Van Assche
Tested-by: Ming Lei
Acked-by: Mike Snitzer
Signed-off-by: Jens Axboe
(cherry picked from commit c616cbee97aed4bc6178f148a7240206dcdb85a6)
Signed-off-by: Marcelo Henrique Cerri
---
 block/blk-mq.c | 33 +++++----------------------------
 1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6a1b7e3af232..8ef75eba264d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1773,15 +1773,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 		break;
 	case BLK_STS_RESOURCE:
 	case BLK_STS_DEV_RESOURCE:
-		/*
-		 * If direct dispatch fails, we cannot allow any merging on
-		 * this IO. Drivers (like SCSI) may have set up permanent state
-		 * for this request, like SG tables and mappings, and if we
-		 * merge to it later on then we'll still only do IO to the
-		 * original part.
-		 */
-		rq->cmd_flags |= REQ_NOMERGE;
-
 		blk_mq_update_dispatch_busy(hctx, true);
 		__blk_mq_requeue_request(rq);
 		break;
@@ -1794,18 +1785,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-/*
- * Don't allow direct dispatch of anything but regular reads/writes,
- * as some of the other commands can potentially share request space
- * with data we need for the IO scheduler. If we attempt a direct dispatch
- * on those and fail, we can't safely add it to the scheduler afterwards
- * without potentially overwriting data that the driver has already written.
- */
-static bool blk_rq_can_direct_dispatch(struct request *rq)
-{
-	return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
-}
-
 static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 						struct request *rq,
 						blk_qc_t *cookie,
@@ -1827,7 +1806,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
+	if (q->elevator && !bypass_insert)
 		goto insert;
 
 	if (!blk_mq_get_dispatch_budget(hctx))
@@ -1843,7 +1822,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (bypass_insert)
 		return BLK_STS_RESOURCE;
 
-	blk_mq_sched_insert_request(rq, false, run_queue, false);
+	blk_mq_request_bypass_insert(rq, run_queue);
 	return BLK_STS_OK;
 }
 
@@ -1859,7 +1838,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
 	if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-		blk_mq_sched_insert_request(rq, false, true, false);
+		blk_mq_request_bypass_insert(rq, true);
 	else if (ret != BLK_STS_OK)
 		blk_mq_end_request(rq, ret);
 
@@ -1889,15 +1868,13 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 		struct request *rq = list_first_entry(list, struct request,
 				queuelist);
 
-		if (!blk_rq_can_direct_dispatch(rq))
-			break;
-
 		list_del_init(&rq->queuelist);
 		ret = blk_mq_request_issue_directly(rq);
 		if (ret != BLK_STS_OK) {
 			if (ret == BLK_STS_RESOURCE ||
 					ret == BLK_STS_DEV_RESOURCE) {
-				list_add(&rq->queuelist, list);
+				blk_mq_request_bypass_insert(rq,
+							list_empty(list));
 				break;
 			}
 			blk_mq_end_request(rq, ret);
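
Illustration only, not part of the patch: the livelock described in the
commit message can be modelled in userspace C. This is a rough sketch
under stated assumptions. Every toy_* name is hypothetical, the driver is
reduced to a counter of free slots, and list ordering on the dispatch
list is not modelled. It shows that a retrying caller never gets a flush
through the pre-patch read/write check, while the post-patch path either
issues the request or parks it on the dispatch list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration; these are not the kernel's types. */
enum toy_op { TOY_READ, TOY_WRITE, TOY_FLUSH };
enum toy_status { TOY_OK, TOY_RESOURCE };

struct toy_request {
	enum toy_op op;
	struct toy_request *next;	/* link on the dispatch list */
};

struct toy_hctx {
	int free_slots;			/* remaining driver capacity */
	struct toy_request *dispatch;	/* hardware queue dispatch list */
};

typedef enum toy_status (*toy_issue_fn)(struct toy_hctx *,
					struct toy_request *);

/* Stand-in for ->queue_rq(): succeeds while the driver has capacity. */
static enum toy_status toy_queue_rq(struct toy_hctx *h, struct toy_request *rq)
{
	(void)rq;
	if (h->free_slots <= 0)
		return TOY_RESOURCE;
	h->free_slots--;
	return TOY_OK;
}

/* Pre-patch bypass path: anything but a read/write is refused up front. */
static enum toy_status toy_issue_old(struct toy_hctx *h, struct toy_request *rq)
{
	if (rq->op != TOY_READ && rq->op != TOY_WRITE)
		return TOY_RESOURCE;	/* permanent: retrying cannot help */
	return toy_queue_rq(h, rq);
}

/* Post-patch bypass path: every request is offered to the driver. */
static enum toy_status toy_issue_new(struct toy_hctx *h, struct toy_request *rq)
{
	return toy_queue_rq(h, rq);
}

/* Post-patch non-bypass path: a busy driver punts to the dispatch list. */
static void toy_issue_or_punt(struct toy_hctx *h, struct toy_request *rq)
{
	if (toy_queue_rq(h, rq) == TOY_RESOURCE) {
		rq->next = h->dispatch;
		h->dispatch = rq;
	}
}

/* Retry on busy like a stacked caller would; -1 means no progress. */
static int toy_retry(toy_issue_fn issue, struct toy_hctx *h,
		     struct toy_request *rq, int max_tries)
{
	assert(max_tries > 0);
	for (int i = 1; i <= max_tries; i++)
		if (issue(h, rq) == TOY_OK)
			return i;
	return -1;
}

/* Attempts needed to issue a flush; use_old selects the pre-patch check. */
int toy_flush_attempts(bool use_old)
{
	struct toy_hctx h = { .free_slots = 1, .dispatch = NULL };
	struct toy_request flush = { .op = TOY_FLUSH, .next = NULL };

	return toy_retry(use_old ? toy_issue_old : toy_issue_new,
			 &h, &flush, 1000);
}

/* Number of requests left on the dispatch list when the driver is busy. */
int toy_punted_when_busy(void)
{
	struct toy_hctx h = { .free_slots = 0, .dispatch = NULL };
	struct toy_request flush = { .op = TOY_FLUSH, .next = NULL };
	int n = 0;

	toy_issue_or_punt(&h, &flush);
	for (struct toy_request *rq = h.dispatch; rq; rq = rq->next)
		n++;
	return n;
}
```

In this model toy_flush_attempts(true) exhausts its retry budget because
the refusal does not depend on driver state, which is the livelock in
miniature. toy_flush_attempts(false) succeeds on the first attempt, and
toy_punted_when_busy() leaves the request on the dispatch list instead of
handing it back to an IO scheduler.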