diff mbox

[1/3,Xenial,SRU] cxlflash: Fix to drain operations from previous reset

Message ID 1466684325-12658-1-git-send-email-tim.gardner@canonical.com
State New
Headers show

Commit Message

Tim Gardner June 23, 2016, 12:18 p.m. UTC
From: "Manoj N. Kumar" <manoj@linux.vnet.ibm.com>

BugLink: http://bugs.launchpad.net/bugs/1592114

While running 'sg_reset -H' in a loop with a user-space application active,
hit the following exception:

cpu 0x2: Vector: 300 (Data Access)
    pc: : afu_attach+0x50/0x240 [cxlflash]
    lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
    pid   = 20365, comm = run_block_fvt

Linux version 4.5.0-491-26f710d+

cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
cxlflash_ioctl+0x5a8/0x6f0 [cxlflash]
scsi_ioctl+0x3b0/0x4c0
sd_ioctl+0x110/0x190
blkdev_ioctl+0x28c/0xc20
block_ioctl+0xa4/0xd0
do_vfs_ioctl+0xd8/0x8c0
SyS_ioctl+0xd4/0xf0
system_call+0x38/0xb4

The problem here is that the problem space area is unmapped while the
application issues the DK_CXLFLASH_RECOVER_AFU ioctl.

This is the order I observe:

proc1				proc2
1) sg_reset
				2) ioctl(DK_CXLFLASH_RECOVER_AFU)
3) sg_reset again
   causing a PSA unmap
				4) continues RECOVER_AFU processing

The resolution to this problem is to have the reset handler drain all
outstanding user space initiated ioctls before proceeding.  It is safe
to drain after the state has been changed to STATE_RESET. Also since
drain_ioctls() was static, it had to be moved up a bit to be before
cxlflash_eh_host_reset_handler().

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 894ef44ea6d14153136fc5e5fba2c15a71be404d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 drivers/scsi/cxlflash/main.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

Comments

Kamal Mostafa June 23, 2016, 5:28 p.m. UTC | #1

Tim Gardner June 24, 2016, 4:34 p.m. UTC | #2
The good folks at IBM pointed out that I missed a patch.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1592114/comments/5

Added 9526f36026f778e82b5175249443854c03b2e660 (cxlflash: Fix regression issue with re-ordering patch)

rtg
Stefan Bader June 28, 2016, 9:16 a.m. UTC | #3
On 24.06.2016 18:34, Tim Gardner wrote:
> The good folks at IBM pointed out that I missed a patch.
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1592114/comments/5

A lot of change but since those good folks will be the first benefiting or
suffering from it and its limited to a driver which we already carry a load of
stuff. And its all cherry-picks.

-Stefan

> 
> Added 9526f36026f778e82b5175249443854c03b2e660 (cxlflash: Fix regression issue with re-ordering patch)
> 
> rtg
> 
>
Brad Figg June 28, 2016, 2:23 p.m. UTC | #4
On Fri, Jun 24, 2016 at 10:34:54AM -0600, Tim Gardner wrote:
> The good folks at IBM pointed out that I missed a patch.
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1592114/comments/5
> 
> Added 9526f36026f778e82b5175249443854c03b2e660 (cxlflash: Fix regression issue with re-ordering patch)
> 
> rtg
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

Would have been nice to see some indication of positive testing with an
Ubuntu kernel and these patches.
Kamal Mostafa June 28, 2016, 3:32 p.m. UTC | #5

diff mbox

Patch

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 2766b10..087bc92 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1863,6 +1863,19 @@  static int afu_reset(struct cxlflash_cfg *cfg)
 }
 
 /**
+ * drain_ioctls() - wait until all currently executing ioctls have completed
+ * @cfg:	Internal structure associated with the host.
+ *
+ * Obtain write access to read/write semaphore that wraps ioctl
+ * handling to 'drain' ioctls currently executing.
+ */
+static void drain_ioctls(struct cxlflash_cfg *cfg)
+{
+	down_write(&cfg->ioctl_rwsem);
+	up_write(&cfg->ioctl_rwsem);
+}
+
+/**
  * cxlflash_eh_device_reset_handler() - reset a single LUN
  * @scp:	SCSI command to send.
  *
@@ -1933,6 +1946,7 @@  static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 	switch (cfg->state) {
 	case STATE_NORMAL:
 		cfg->state = STATE_RESET;
+		drain_ioctls(cfg);
 		cxlflash_mark_contexts_error(cfg);
 		rcr = afu_reset(cfg);
 		if (rcr) {
@@ -2451,19 +2465,6 @@  out_remove:
 }
 
 /**
- * drain_ioctls() - wait until all currently executing ioctls have completed
- * @cfg:	Internal structure associated with the host.
- *
- * Obtain write access to read/write semaphore that wraps ioctl
- * handling to 'drain' ioctls currently executing.
- */
-static void drain_ioctls(struct cxlflash_cfg *cfg)
-{
-	down_write(&cfg->ioctl_rwsem);
-	up_write(&cfg->ioctl_rwsem);
-}
-
-/**
  * cxlflash_pci_error_detected() - called when a PCI error is detected
  * @pdev:	PCI device struct.
  * @state:	PCI channel state.