From patchwork Fri Jun 1 15:23:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Salisbury X-Patchwork-Id: 924055 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40y7R13Vntz9s2k; Sat, 2 Jun 2018 01:24:13 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fOluN-0007TF-Br; Fri, 01 Jun 2018 15:24:03 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1fOluI-0007RZ-QN for kernel-team@lists.ubuntu.com; Fri, 01 Jun 2018 15:23:58 +0000 Received: from 1.general.jsalisbury.us.vpn ([10.172.67.212] helo=salisbury) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1fOluI-0003Im-B7 for kernel-team@lists.ubuntu.com; Fri, 01 Jun 2018 15:23:58 +0000 Received: by salisbury (Postfix, from userid 1000) id 3B3E37E2931; Fri, 1 Jun 2018 11:23:57 -0400 (EDT) From: Joseph Salisbury To: kernel-team@lists.ubuntu.com Subject: [SRU][Bionic][PATCH 4/4] scsi: cxlflash: Handle spurious interrupts Date: Fri, 1 Jun 2018 11:23:57 -0400 Message-Id: X-Mailer: git-send-email 2.17.0 In-Reply-To: References: In-Reply-To: References: X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Uma Krishnan BugLink: http://bugs.launchpad.net/bugs/1768431 The following Oops can occur when there is heavy I/O traffic and the host is reset by a tool such as sg_reset. [c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500 [cxlflash] (unreliable) [c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash] [c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310 [c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90 [c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0 [c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230 [c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70 [c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0 [c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24 [c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130 [c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120 When a context is reset, the pending commands are flushed and the AFU is notified. Before the AFU handles this request there could be command completion interrupts queued to PHB which are yet to be delivered to the context. In this scenario, a context could receive an interrupt for a command that has been flushed, leading to a possible crash when the memory for the flushed command is accessed. To resolve this problem, a boolean will indicate if the hardware queue is ready to process interrupts or not. This can be evaluated in the interrupt handler before proessing an interrupt. Signed-off-by: Uma Krishnan Acked-by: Matthew R. Ochs Signed-off-by: Martin K. Petersen (cherry picked from linux-next commit d2d354a606d5309fbfe81d5fca01122159e38c6e) Signed-off-by: Joseph Salisbury --- drivers/scsi/cxlflash/common.h | 1 + drivers/scsi/cxlflash/main.c | 11 +++++++++++ 2 files changed, 12 insertions(+) diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h index b69fd32ecd0d..3556b1ded999 100644 --- a/drivers/scsi/cxlflash/common.h +++ b/drivers/scsi/cxlflash/common.h @@ -224,6 +224,7 @@ struct hwq { u64 *hrrq_end; u64 *hrrq_curr; bool toggle; + bool hrrq_online; s64 room; diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index c9203282d943..a24d7e6e51c1 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -801,6 +801,10 @@ static void term_mc(struct cxlflash_cfg *cfg, u32 index) WARN_ON(cfg->ops->release_context(hwq->ctx_cookie)); hwq->ctx_cookie = NULL; + spin_lock_irqsave(&hwq->hrrq_slock, lock_flags); + hwq->hrrq_online = false; + spin_unlock_irqrestore(&hwq->hrrq_slock, lock_flags); + spin_lock_irqsave(&hwq->hsq_slock, lock_flags); flush_pending_cmds(hwq); spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags); @@ -1475,6 +1479,12 @@ static irqreturn_t cxlflash_rrq_irq(int irq, void *data) spin_lock_irqsave(&hwq->hrrq_slock, hrrq_flags); + /* Silently drop spurious interrupts when queue is not online */ + if (!hwq->hrrq_online) { + spin_unlock_irqrestore(&hwq->hrrq_slock, hrrq_flags); + return IRQ_HANDLED; + } + if (afu_is_irqpoll_enabled(afu)) { irq_poll_sched(&hwq->irqpoll); spin_unlock_irqrestore(&hwq->hrrq_slock, hrrq_flags); @@ -1781,6 +1791,7 @@ static int init_global(struct cxlflash_cfg *cfg) writeq_be((u64) hwq->hrrq_start, &hmap->rrq_start); writeq_be((u64) hwq->hrrq_end, &hmap->rrq_end); + hwq->hrrq_online = true; if (afu_is_sq_cmd_mode(afu)) { writeq_be((u64)hwq->hsq_start, &hmap->sq_start);