From patchwork Tue May 15 13:03:11 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
X-Patchwork-Id: 914799
Return-Path: <kernel-team-bounces@lists.ubuntu.com>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
	spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com
	(client-ip=91.189.94.19; helo=huckleberry.canonical.com;
	envelope-from=kernel-team-bounces@lists.ubuntu.com;
	receiver=<UNKNOWN>)
Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none)
	header.from=linux.vnet.ibm.com
Received: from huckleberry.canonical.com (huckleberry.canonical.com
	[91.189.94.19])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 40mLM746BLz9s4Y;
	Thu, 17 May 2018 03:01:55 +1000 (AEST)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
	by huckleberry.canonical.com with esmtp (Exim 4.86_2)
	(envelope-from <kernel-team-bounces@lists.ubuntu.com>)
	id 1fIzoB-0006Qn-Ez; Wed, 16 May 2018 17:01:47 +0000
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1])
	by huckleberry.canonical.com with esmtps
	(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2)
	(envelope-from <joserz@linux.vnet.ibm.com>) id 1fIZcK-0004HT-In
	for kernel-team@lists.ubuntu.com; Tue, 15 May 2018 13:03:48 +0000
Received: from pps.filterd (m0098409.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id
	w4FCoPvm106848
	for <kernel-team@lists.ubuntu.com>; Tue, 15 May 2018 09:03:47 -0400
Received: from e18.ny.us.ibm.com (e18.ny.us.ibm.com [129.33.205.208])
	by mx0a-001b2d01.pphosted.com with ESMTP id 2hyw8gg1rr-1
	(version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT)
	for <kernel-team@lists.ubuntu.com>; Tue, 15 May 2018 09:03:46 -0400
Received: from localhost
	by e18.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use
	Only! Violators will be prosecuted
	for <kernel-team@lists.ubuntu.com> from <joserz@linux.vnet.ibm.com>;
	Tue, 15 May 2018 09:03:45 -0400
Received: from b01cxnp22035.gho.pok.ibm.com (9.57.198.25)
	by e18.ny.us.ibm.com (146.89.104.205) with IBM ESMTP SMTP Gateway:
	Authorized Use Only! Violators will be prosecuted;
	Tue, 15 May 2018 09:03:43 -0400
Received: from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com
	[9.57.199.109])
	by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP
	id w4FD3gYl56623168; Tue, 15 May 2018 13:03:42 GMT
Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id 6AA37112034;
	Tue, 15 May 2018 09:03:45 -0400 (EDT)
Received: from pacoca.br.ibm.com (unknown [9.7.47.248])
	by b01ledav004.gho.pok.ibm.com (Postfix) with ESMTP id 07FB0112057;
	Tue, 15 May 2018 09:03:44 -0400 (EDT)
From: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
To: kernel-team@lists.ubuntu.com
Subject: [Bionic][PATCH v3 08/21] blk-mq: make sure hctx->next_cpu is set
	correctly
Date: Tue, 15 May 2018 10:03:11 -0300
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180515130324.23815-1-joserz@linux.vnet.ibm.com>
References: <20180515130324.23815-1-joserz@linux.vnet.ibm.com>
X-TM-AS-GCONF: 00
x-cbid: 18051513-0044-0000-0000-000004149EB4
X-IBM-SpamModules-Scores: 
X-IBM-SpamModules-Versions: BY=3.00009029; HX=3.00000241; KW=3.00000007;
	PH=3.00000004; SC=3.00000260; SDB=6.01032654; UDB=6.00527946;
	IPR=6.00811785;
	MB=3.00021126; MTD=3.00000008; XFM=3.00000015; UTC=2018-05-15 13:03:44
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18051513-0045-0000-0000-00000846B2F7
Message-Id: <20180515130324.23815-9-joserz@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, ,
	definitions=2018-05-15_03:, , signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	priorityscore=1501
	malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0
	clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0
	classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1709140000
	definitions=main-1805150132
X-Mailman-Approved-At: Wed, 16 May 2018 17:01:18 +0000
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions <kernel-team.lists.ubuntu.com>
List-Unsubscribe: <https://lists.ubuntu.com/mailman/options/kernel-team>,
	<mailto:kernel-team-request@lists.ubuntu.com?subject=unsubscribe>
List-Archive: <https://lists.ubuntu.com/archives/kernel-team>
List-Post: <mailto:kernel-team@lists.ubuntu.com>
List-Help: <mailto:kernel-team-request@lists.ubuntu.com?subject=help>
List-Subscribe: <https://lists.ubuntu.com/mailman/listinfo/kernel-team>,
	<mailto:kernel-team-request@lists.ubuntu.com?subject=subscribe>
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team" <kernel-team-bounces@lists.ubuntu.com>

From: Ming Lei <ming.lei@redhat.com>

BugLink: http://bugs.launchpad.net/bugs/1759723

When hctx->next_cpu is selected from the online CPUs in hctx->cpumask,
there is a race in which hctx->next_cpu may be set to a value
>= nr_cpu_ids, which eventually breaks the workqueue.

The race can be triggered in the following two situations:

1) When a CPU is becoming DEAD, blk_mq_hctx_notify_dead() is called
to dispatch requests from the DEAD CPU's context, but by that time the
DEAD CPU has already been cleared from 'cpu_online_mask', so all CPUs
in hctx->cpumask may be offline, which causes hctx->next_cpu to be set
to a bad value.

2) blk_mq_delay_run_hw_queue() is called from CPU B and finds that the
queue should be run on another CPU A; CPU A may then go offline at the
same time, leaving all CPUs in hctx->cpumask offline.

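This patch deals with this issue by re-selecting the next CPU and making
sure hctx->next_cpu is always set to a valid CPU; if no CPU in
hctx->cpumask is online, the work is scheduled unbound instead.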

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: "jianchao.wang" <jianchao.w.wang@oracle.com>
Tested-by: "jianchao.wang" <jianchao.w.wang@oracle.com>
Fixes: 20e4d81393 ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 7bed45954b95601230ebf387d3e4e20e4a3cc025)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
---
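Note (not part of the commit): for anyone who wants to follow the
selection logic outside the kernel, below is a minimal user-space
sketch of the patched blk_mq_hctx_next_cpu(). It is an illustration
only, not the kernel code. All sim_* and SIM_* names are hypothetical
stand-ins: CPU masks are plain unsigned bitmasks, SIM_NR_CPUS plays
the role of nr_cpu_ids, SIM_CPU_UNBOUND is an arbitrary sentinel in
place of WORK_CPU_UNBOUND, and SIM_WORK_BATCH is an assumed value for
BLK_MQ_CPU_WORK_BATCH. The nr_hw_queues == 1 shortcut is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS     8     /* stand-in for nr_cpu_ids */
#define SIM_CPU_UNBOUND (-1)  /* arbitrary stand-in for WORK_CPU_UNBOUND */
#define SIM_WORK_BATCH  8     /* assumed stand-in for BLK_MQ_CPU_WORK_BATCH */

struct sim_hctx {
	unsigned int cpumask;   /* CPUs mapped to this hardware queue */
	int next_cpu;
	int next_cpu_batch;
};

static unsigned int sim_online_mask;   /* stand-in for cpu_online_mask */

/* First CPU > n set in both masks, or SIM_NR_CPUS if there is none. */
static int sim_next_and(int n, unsigned int a, unsigned int b)
{
	for (int cpu = n + 1; cpu < SIM_NR_CPUS; cpu++)
		if ((a & b) & (1u << cpu))
			return cpu;
	return SIM_NR_CPUS;
}

static bool sim_cpu_online(int cpu)
{
	return (sim_online_mask & (1u << cpu)) != 0;
}

/* Mirrors the control flow of the patched function. */
static int sim_hctx_next_cpu(struct sim_hctx *hctx)
{
	bool tried = false;

	if (--hctx->next_cpu_batch <= 0) {
		int next_cpu;
select_cpu:
		next_cpu = sim_next_and(hctx->next_cpu, hctx->cpumask,
					sim_online_mask);
		if (next_cpu >= SIM_NR_CPUS)
			next_cpu = sim_next_and(-1, hctx->cpumask,
						sim_online_mask);

		/* No online CPU: still store a valid (offline) CPU number. */
		if (next_cpu >= SIM_NR_CPUS)
			hctx->next_cpu = sim_next_and(-1, hctx->cpumask, ~0u);
		else
			hctx->next_cpu = next_cpu;
		hctx->next_cpu_batch = SIM_WORK_BATCH;
	}

	if (!sim_cpu_online(hctx->next_cpu)) {
		if (!tried) {
			tried = true;
			goto select_cpu;
		}
		/* Force a re-selection on the next call. */
		hctx->next_cpu_batch = 1;
		return SIM_CPU_UNBOUND;
	}
	return hctx->next_cpu;
}

/* Walks through the "all CPUs of the hctx go offline" case. */
static int sim_demo(void)
{
	struct sim_hctx h = { .cpumask = 0x0Cu,   /* CPUs 2 and 3 */
			      .next_cpu = 2, .next_cpu_batch = 1 };

	sim_online_mask = 0xFFu;                  /* everything online */
	assert(sim_hctx_next_cpu(&h) == 3);

	sim_online_mask = 0xF3u;                  /* CPUs 2 and 3 offline */
	assert(sim_hctx_next_cpu(&h) == SIM_CPU_UNBOUND);
	assert(h.next_cpu < SIM_NR_CPUS);         /* never out of range */

	sim_online_mask = 0xFBu;                  /* CPU 3 back online */
	assert(sim_hctx_next_cpu(&h) == 3);
	return 0;
}
```

Calling sim_demo() from a small main() exercises the three steps:
normal round-robin, the unbound fallback while every CPU of the hctx
is offline, and re-selection once a CPU comes back. The point of the
sketch is that next_cpu never holds a value >= SIM_NR_CPUS, which is
the property the patch restores for hctx->next_cpu.
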
 block/blk-mq.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 50ef7dc4c41e..59f4127bdaed 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1257,21 +1257,47 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
  */
 static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 {
+	bool tried = false;
+
 	if (hctx->queue->nr_hw_queues == 1)
 		return WORK_CPU_UNBOUND;
 
 	if (--hctx->next_cpu_batch <= 0) {
 		int next_cpu;
-
+select_cpu:
 		next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask,
 				cpu_online_mask);
 		if (next_cpu >= nr_cpu_ids)
 			next_cpu = cpumask_first_and(hctx->cpumask,cpu_online_mask);
 
-		hctx->next_cpu = next_cpu;
+		/*
+		 * No online CPU is found, so have to make sure hctx->next_cpu
+		 * is set correctly for not breaking workqueue.
+		 */
+		if (next_cpu >= nr_cpu_ids)
+			hctx->next_cpu = cpumask_first(hctx->cpumask);
+		else
+			hctx->next_cpu = next_cpu;
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
 	}
 
+	/*
+	 * Do unbound schedule if we can't find a online CPU for this hctx,
+	 * and it should only happen in the path of handling CPU DEAD.
+	 */
+	if (!cpu_online(hctx->next_cpu)) {
+		if (!tried) {
+			tried = true;
+			goto select_cpu;
+		}
+
+		/*
+		 * Make sure to re-select CPU next time once after CPUs
+		 * in hctx->cpumask become online again.
+		 */
+		hctx->next_cpu_batch = 1;
+		return WORK_CPU_UNBOUND;
+	}
 	return hctx->next_cpu;
 }