Patchwork [v8,09/13] libsas: enforce eh strategy handlers only in eh context

login
register
mail settings
Submitter Dan Williams
Date Feb. 10, 2012, 8:45 a.m.
Message ID <20120210084525.25701.51128.stgit@dwillia2-linux.jf.intel.com>
Download mbox | patch
Permalink /patch/140531/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Dan Williams - Feb. 10, 2012, 8:45 a.m.
The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions).  However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
 1 files changed, 11 insertions(+), 4 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - Feb. 29, 2012, 10:05 p.m.
On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> The strategy handlers may be called in places that are problematic for
> libsas (i.e. sata resets outside of domain revalidation filtering /
> libata link recovery), or problematic for userspace (non-blocking ioctl
> to sleeping reset functions).  However, these routines are also called
> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
> as long as we are running in the host's error handler.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
>  1 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
> index f0b9b7b..1cabedc 100644
> --- a/drivers/scsi/libsas/sas_scsi_host.c
> +++ b/drivers/scsi/libsas/sas_scsi_host.c
> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
>  /* Attempt to send a LUN reset message to a device */
>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
>  {
> -	struct domain_device *dev = cmd_to_domain_dev(cmd);
> -	struct sas_internal *i =
> -		to_sas_internal(dev->port->ha->core.shost->transportt);
> -	struct scsi_lun lun;
>  	int res;
> +	struct scsi_lun lun;
> +	struct Scsi_Host *host = cmd->device->host;
> +	struct domain_device *dev = cmd_to_domain_dev(cmd);
> +	struct sas_internal *i = to_sas_internal(host->transportt);
> +
> +	if (current != host->ehandler)
> +		return FAILED;

Doing this will ensure that SG_SCSI_RESET now fails.

I don't mind checking for O_NONBLOCK in the sg handler and failing if it
is, but disallowing everything looks a trifle drastic.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams - March 1, 2012, 12:28 a.m.
On Wed, Feb 29, 2012 at 2:05 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>> The strategy handlers may be called in places that are problematic for
>> libsas (i.e. sata resets outside of domain revalidation filtering /
>> libata link recovery), or problematic for userspace (non-blocking ioctl
>> to sleeping reset functions).  However, these routines are also called
>> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
>> as long as we are running in the host's error handler.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
>>  1 files changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
>> index f0b9b7b..1cabedc 100644
>> --- a/drivers/scsi/libsas/sas_scsi_host.c
>> +++ b/drivers/scsi/libsas/sas_scsi_host.c
>> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
>>  /* Attempt to send a LUN reset message to a device */
>>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
>>  {
>> -     struct domain_device *dev = cmd_to_domain_dev(cmd);
>> -     struct sas_internal *i =
>> -             to_sas_internal(dev->port->ha->core.shost->transportt);
>> -     struct scsi_lun lun;
>>       int res;
>> +     struct scsi_lun lun;
>> +     struct Scsi_Host *host = cmd->device->host;
>> +     struct domain_device *dev = cmd_to_domain_dev(cmd);
>> +     struct sas_internal *i = to_sas_internal(host->transportt);
>> +
>> +     if (current != host->ehandler)
>> +             return FAILED;
>
> Doing this will ensure that SG_SCSI_RESET now fails.
>
> I don't mind checking for O_NONBLOCK in the sg handler and failing if it
> is, but disallowing everything looks a trifle drastic.
>

The thought process here was following the lead of libata which does
not specify eh_reset handlers.  We can't permit "unmanaged" (outside
of eh) resets to hit ata devices otherwise we run the risk of a reset
turning into a link bounce / hotplug.

...and I can't take the same route as I did for the scsi_transport_sas
initiated reset since these handlers are called from both ioctl and eh
context, or can I?

Hmm, what about something like:

  if (current != host->ehandler) {
    schedule_reset_to_run_in_eh_context():
    wait_for_eh();
  } else
    do_reset();

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - March 1, 2012, 2:29 p.m.
On Wed, 2012-02-29 at 16:28 -0800, Dan Williams wrote:
> On Wed, Feb 29, 2012 at 2:05 PM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> >> The strategy handlers may be called in places that are problematic for
> >> libsas (i.e. sata resets outside of domain revalidation filtering /
> >> libata link recovery), or problematic for userspace (non-blocking ioctl
> >> to sleeping reset functions).  However, these routines are also called
> >> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
> >> as long as we are running in the host's error handler.
> >>
> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >> ---
> >>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
> >>  1 files changed, 11 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
> >> index f0b9b7b..1cabedc 100644
> >> --- a/drivers/scsi/libsas/sas_scsi_host.c
> >> +++ b/drivers/scsi/libsas/sas_scsi_host.c
> >> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
> >>  /* Attempt to send a LUN reset message to a device */
> >>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
> >>  {
> >> -     struct domain_device *dev = cmd_to_domain_dev(cmd);
> >> -     struct sas_internal *i =
> >> -             to_sas_internal(dev->port->ha->core.shost->transportt);
> >> -     struct scsi_lun lun;
> >>       int res;
> >> +     struct scsi_lun lun;
> >> +     struct Scsi_Host *host = cmd->device->host;
> >> +     struct domain_device *dev = cmd_to_domain_dev(cmd);
> >> +     struct sas_internal *i = to_sas_internal(host->transportt);
> >> +
> >> +     if (current != host->ehandler)
> >> +             return FAILED;
> >
> > Doing this will ensure that SG_SCSI_RESET now fails.
> >
> > I don't mind checking for O_NONBLOCK in the sg handler and failing if it
> > is, but disallowing everything looks a trifle drastic.
> >
> 
> The thought process here was following the lead of libata which does
> not specify eh_reset handlers.  We can't permit "unmanaged" (outside
> of eh) resets to hit ata devices otherwise we run the risk of a reset
> turning into a link bounce / hotplug.
> 
> ...and I can't take the same route as I did for the scsi_transport_sas
> initiated reset since these handlers are called from both ioctl and eh
> context, or can I?
> 
> Hmm, what about something like:
> 
>   if (current != host->ehandler) {
>     schedule_reset_to_run_in_eh_context():
>     wait_for_eh();
>   } else
>     do_reset();

I think that would work for me ... as long as the wait doesn't cause
entangled deadlocks (I can't think of any at the moment, but I'll think
a bit more deeply about it).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams - March 6, 2012, 7:17 p.m.
On Thu, Mar 1, 2012 at 6:29 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>> Hmm, what about something like:
>>
>>   if (current != host->ehandler) {
>>     schedule_reset_to_run_in_eh_context():
>>     wait_for_eh();
>>   } else
>>     do_reset();
>
> I think that would work for me ... as long as the wait doesn't cause
> entangled deadlocks (I can't think of any at the moment, but I'll think
> a bit more deeply about it).

So there is a deadlock due to:

commit d7a1bb0a04ca835bffc0a91e64ab827dfba7d8f5
Author: James Smart <James.Smart@Emulex.Com>
Date:   Wed Mar 8 14:50:12 2006 -0500

    [SCSI] Block I/O while SG reset operation in progress - the midlayer patch

    The scsi midlayer portion of the patch

    Signed-off-by: James Smart <James.Smart@emulex.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

...this adds shost->tmf_in_progress to scsi_host_in_recovery(), so I
can't "wait for eh" because the exit condition for that wait is
"!scsi_host_in_recovery()".  But since sg_reset is opened O_NONBLOCK
by default and that is likely the only non-eh reason for calling the
eh_{bus|device}_reset_handler routines I'll just make this routine
asynchronously queue a reset and return.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index f0b9b7b..1cabedc 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -463,11 +463,14 @@  EXPORT_SYMBOL_GPL(sas_get_local_phy);
 /* Attempt to send a LUN reset message to a device */
 int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
 {
-	struct domain_device *dev = cmd_to_domain_dev(cmd);
-	struct sas_internal *i =
-		to_sas_internal(dev->port->ha->core.shost->transportt);
-	struct scsi_lun lun;
 	int res;
+	struct scsi_lun lun;
+	struct Scsi_Host *host = cmd->device->host;
+	struct domain_device *dev = cmd_to_domain_dev(cmd);
+	struct sas_internal *i = to_sas_internal(host->transportt);
+
+	if (current != host->ehandler)
+		return FAILED;
 
 	int_to_scsilun(cmd->device->lun, &lun);
 
@@ -486,8 +489,12 @@  int sas_eh_bus_reset_handler(struct scsi_cmnd *cmd)
 {
 	struct domain_device *dev = cmd_to_domain_dev(cmd);
 	struct sas_phy *phy = sas_get_local_phy(dev);
+	struct Scsi_Host *host = cmd->device->host;
 	int res;
 
+	if (current != host->ehandler)
+		return FAILED;
+
 	res = sas_phy_reset(phy, 1);
 	if (res)
 		SAS_DPRINTK("Bus reset of %s failed 0x%x\n",