From patchwork Fri Jun 1 15:23:56 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joseph Salisbury
X-Patchwork-Id: 924057
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom)
 smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19;
 helo=huckleberry.canonical.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=fail (p=none dis=none) header.from=canonical.com
Received: from huckleberry.canonical.com (huckleberry.canonical.com
 [91.189.94.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 40y7R14CKKz9s4Y;
 Sat, 2 Jun 2018 01:24:13 +1000 (AEST)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
 by huckleberry.canonical.com with esmtp (Exim 4.86_2)
 (envelope-from ) id 1fOluL-0007SM-1x; Fri, 01 Jun 2018 15:24:01 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by huckleberry.canonical.com with esmtps
 (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2)
 (envelope-from ) id 1fOluI-0007RY-Ob
 for kernel-team@lists.ubuntu.com; Fri, 01 Jun 2018 15:23:58 +0000
Received: from 1.general.jsalisbury.us.vpn ([10.172.67.212] helo=salisbury)
 by youngberry.canonical.com with esmtpsa
 (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.76)
 (envelope-from ) id 1fOluI-0003Il-Al
 for kernel-team@lists.ubuntu.com; Fri, 01 Jun 2018 15:23:58 +0000
Received: by salisbury (Postfix, from userid 1000)
 id 395127E292E; Fri, 1 Jun 2018 11:23:57 -0400 (EDT)
From: Joseph Salisbury
To: kernel-team@lists.ubuntu.com
Subject: [SRU][Bionic][PATCH 3/4] scsi: cxlflash: Remove commmands from
 pending list on timeout
Date: Fri, 1 Jun 2018 11:23:56 -0400
X-Mailer: git-send-email 2.17.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Uma Krishnan

BugLink: http://bugs.launchpad.net/bugs/1768431

The following Oops can occur if an internal command sent to the AFU does
not complete within the timeout:

[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230 [cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150 [cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

When an internal command times out, the command buffer is freed while it
is still in the pending commands list of the context. This corrupts the
list and when the context is cleaned up, a crash is encountered.

To resolve this issue, when an AFU command or TMF command times out, the
command should be deleted from the hardware queue pending command list
before freeing the buffer.

Signed-off-by: Uma Krishnan
Acked-by: Matthew R. Ochs
Signed-off-by: Martin K. Petersen
(cherry picked from linux-next commit 9a597cd4c0cebd61657f7449cb8bcb681f464500)
Signed-off-by: Joseph Salisbury
---
 drivers/scsi/cxlflash/main.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index dfe76485aab6..c9203282d943 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -473,6 +473,7 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	struct afu_cmd *cmd = NULL;
 	struct device *dev = &cfg->dev->dev;
 	struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ);
+	bool needs_deletion = false;
 	char *buf = NULL;
 	ulong lock_flags;
 	int rc = 0;
@@ -527,6 +528,7 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	if (!to) {
 		dev_err(dev, "%s: TMF timed out\n", __func__);
 		rc = -ETIMEDOUT;
+		needs_deletion = true;
 	} else if (cmd->cmd_aborted) {
 		dev_err(dev, "%s: TMF aborted\n", __func__);
 		rc = -EAGAIN;
@@ -537,6 +539,12 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	}
 	cfg->tmf_active = false;
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
+
+	if (needs_deletion) {
+		spin_lock_irqsave(&hwq->hsq_slock, lock_flags);
+		list_del(&cmd->list);
+		spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags);
+	}
 out:
 	kfree(buf);
 	return rc;
@@ -2284,6 +2292,7 @@ static int send_afu_cmd(struct afu *afu, struct sisl_ioarcb *rcb)
 	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd = NULL;
 	struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ);
+	ulong lock_flags;
 	char *buf = NULL;
 	int rc = 0;
 	int nretry = 0;
@@ -2329,6 +2338,11 @@ static int send_afu_cmd(struct afu *afu, struct sisl_ioarcb *rcb)
 	case -ETIMEDOUT:
 		rc = afu->context_reset(hwq);
 		if (rc) {
+			/* Delete the command from pending_cmds list */
+			spin_lock_irqsave(&hwq->hsq_slock, lock_flags);
+			list_del(&cmd->list);
+			spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags);
+
 			cxlflash_schedule_async_reset(cfg);
 			break;
 		}
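
Illustrative note (not part of the patch): the ordering the commit message describes, unlinking a timed-out command from the pending list before its buffer is freed, can be sketched in userspace C roughly as follows. Everything here is a hypothetical stand-in: struct demo_hwq, struct demo_cmd, the demo_* helpers, and the small list implementation are not driver or kernel code, and a plain flag replaces hsq_slock only so the sketch can assert that list changes happen while the lock is "held". It is single-threaded and meant only to show the order of operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
}

/* Hypothetical stand-in for a hardware queue with a pending list. */
struct demo_hwq {
	bool locked;              /* flag standing in for hsq_slock */
	struct list_node pending; /* head of the pending command list */
};

/* Hypothetical stand-in for a command buffer. */
struct demo_cmd {
	struct list_node list;
	int id;
};

static void demo_lock(struct demo_hwq *q)   { assert(!q->locked); q->locked = true; }
static void demo_unlock(struct demo_hwq *q) { assert(q->locked); q->locked = false; }

/* Allocate a command and queue it on the pending list. */
static struct demo_cmd *demo_submit(struct demo_hwq *q, int id)
{
	struct demo_cmd *cmd = malloc(sizeof(*cmd));

	if (!cmd)
		abort();
	cmd->id = id;
	demo_lock(q);
	list_add_tail_node(&cmd->list, &q->pending);
	demo_unlock(q);
	return cmd;
}

/* Timeout path: unlink under the queue lock first, free only afterwards. */
static void demo_timeout(struct demo_hwq *q, struct demo_cmd *cmd)
{
	demo_lock(q);
	list_del_node(&cmd->list);
	demo_unlock(q);
	free(cmd);
}

/* Walk the pending list, as a cleanup path would, and count entries. */
static size_t demo_pending_count(struct demo_hwq *q)
{
	size_t n = 0;

	demo_lock(q);
	for (struct list_node *p = q->pending.next; p != &q->pending; p = p->next)
		n++;
	demo_unlock(q);
	return n;
}

/* Queue three commands, time out the middle one, report what remains. */
static size_t demo_run(void)
{
	struct demo_hwq q = { .locked = false };
	struct demo_cmd *a, *b, *c;
	size_t remaining;

	list_init(&q.pending);
	a = demo_submit(&q, 1);
	b = demo_submit(&q, 2);
	c = demo_submit(&q, 3);

	demo_timeout(&q, b);
	remaining = demo_pending_count(&q);

	demo_timeout(&q, a);
	demo_timeout(&q, c);
	return remaining;
}
```

Calling demo_run() walks the list after the middle command has timed out. Because the node was unlinked before free(), the walk only touches live entries. Freeing the buffer without the unlink would leave its neighbours pointing at freed memory, which is the kind of corruption the commit message attributes to the cleanup-time crash.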