Patchwork sata_mv port lockup on hotplug (kernel 2.6.38.2)

login
register
mail settings
Submitter Tejun Heo
Date May 25, 2011, 9:41 a.m.
Message ID <20110525094127.GF10146@htj.dyndns.org>
Download mbox | patch
Permalink /patch/97314/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Tejun Heo - May 25, 2011, 9:41 a.m.
Hello, sorry about the long delay.

On Tue, May 17, 2011 at 04:30:20PM +0100, Bruce Stenning wrote:
> __ata_port_freeze: ata4 port frozen
> ata4: hard resetting link
> sata_link_hardreset: ENTER
> ata4: COMRESET failed (errno=-32)
> sata_link_hardreset: EXIT, rc=-32
> ata4: reset failed (errno=-32), retrying in 33 secs
> __ata_port_freeze: ata4 port frozen
> ata4: hard resetting link
> sata_link_hardreset: ENTER
> ata4: COMRESET failed (errno=-32)
> sata_link_hardreset: EXIT, rc=-32
> ata4: reset failed, giving up
> ata_eh_recover: EXIT, rc=-32
> ata4.00: disabled
> ata4: EH complete
> ata_scsi_error: EXIT
> 
> The IRQ for that port is masked off afterwards.

This is a different issue.  libata EH plugs the port if reset fails
repeatedly.  This behavior was implemented to avoid causing continuous
resets on a port in case it has flaky PHY state reporting; however, it
seems to cause more trouble than fixing issues - ie. plugging in a
broken device may end up plugging the port even after the offending
device is removed until manual rescan or reboot.  I've been pondering
about changing the behavior like the following.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - May 25, 2011, 1:36 p.m.
> This is a different issue.  libata EH plugs the port if reset fails
> repeatedly.  This behavior was implemented to avoid causing continuous
> resets on a port in case it has flaky PHY state reporting; however, it
> seems to cause more trouble than fixing issues - ie. plugging in a
> broken device may end up plugging the port even after the offending
> device is removed until manual rescan or reboot.  I've been pondering
> about changing the behavior like the following.

Thanks for the explanation, Tejun, and for sending those changes upstream.
I shall incorporate these patches into my kernel and keep you informed as
to how they behave.

Mark, did you get a chance to clean up the sata_mv spinlock addition in
mv_set_main_irq_mask?

Many Thanks,

Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - May 25, 2011, 10:27 p.m.
On 11-05-25 09:36 AM, Bruce Stenning wrote:
>> This is a different issue.  libata EH plugs the port if reset fails
>> repeatedly.  This behavior was implemented to avoid causing continuous
>> resets on a port in case it has flaky PHY state reporting; however, it
>> seems to cause more trouble than fixing issues - ie. plugging in a
>> broken device may end up plugging the port even after the offending
>> device is removed until manual rescan or reboot.  I've been pondering
>> about changing the behavior like the following.
> 
> Thanks for the explanation, Tejun, and for sending those changes upstream.
> I shall incorporate these patches into my kernel and keep you informed as
> to how they behave.
> 
> Mark, did you get a chance to clean up the sata_mv spinlock addition in
> mv_set_main_irq_mask?

Not yet.  I've been waiting to hear back from you as to
whether they visibly fix anything or not.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - May 30, 2011, 1:07 p.m.
PiA+IE1hcmssIGRpZCB5b3UgZ2V0IGEgY2hhbmNlIHRvIGNsZWFuIHVwIHRoZSBzYXRhX212IHNw
aW5sb2NrIGFkZGl0aW9uIGluDQo+ID4gbXZfc2V0X21haW5faXJxX21hc2s/DQo+DQo+IE5vdCB5
ZXQuICBJJ3ZlIGJlZW4gd2FpdGluZyB0byBoZWFyIGJhY2sgZnJvbSB5b3UgYXMgdG8NCj4gd2hl
dGhlciB0aGV5IHZpc2libHkgZml4IGFueXRoaW5nIG9yIG5vdC4NCg0KSGF2aW5nIG5vdyB0ZXN0
ZWQgd2l0aCBUZWp1bidzIHR3byBwYXRjaGVzIGFuZCB3aXRob3V0IHRoZSBzcGlubG9jayBpbg0K
bXZfc2V0X21haW5faXJxX21hc2sgZm9yIGEgcmVhc29uYWJsZSB0aW1lLCB0aGUgaG90cGx1ZyBz
ZWVtcyB0byBiZSB2ZXJ5DQpyZWxpYWJsZS4gIFNvIEkgdGhpbmsgaXQncyBwcm9iYWJseSBiZXN0
IHRvIGxlYXZlIHRoZSBzcGlubG9jayBvdXQgYW5kDQpjb250aW51ZSB0byBvYnNlcnZlIHRoZSBi
ZWhhdmlvdXIgd2l0aCBmdXJ0aGVyIHRlc3RpbmcuDQoNClRoYW5rcywNCkJydWNlLg0KDQoNCkxh
dGVzdCBOZXdzIGF0OiBodHRwOi8vd3d3LmluZGlnb3Zpc2lvbi5jb20vaW5kZXgucGhwL2VuL25l
d3MuaHRtbA0K
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - May 31, 2011, 2:04 a.m.
On 11-05-30 09:07 AM, Bruce Stenning wrote:
>>> Mark, did you get a chance to clean up the sata_mv spinlock addition in
>>> mv_set_main_irq_mask?
>>
>> Not yet.  I've been waiting to hear back from you as to
>> whether they visibly fix anything or not.
> 
> Having now tested with Tejun's two patches and without the spinlock in
> mv_set_main_irq_mask for a reasonable time, the hotplug seems to be very
> reliable.  So I think it's probably best to leave the spinlock out and
> continue to observe the behaviour with further testing.

Peachy.  I'll have another look at some point -- I think I thought
the spinlock was still needed for ATAPI under some conditions.
Gotta look closer to make sure.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - June 10, 2011, 12:28 p.m.
On 11-05-25 05:41 AM, Tejun Heo wrote:
> Hello, sorry about the long delay.
> 
> On Tue, May 17, 2011 at 04:30:20PM +0100, Bruce Stenning wrote:
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed (errno=-32), retrying in 33 secs
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed, giving up
>> ata_eh_recover: EXIT, rc=-32
>> ata4.00: disabled
>> ata4: EH complete
>> ata_scsi_error: EXIT
>>
>> The IRQ for that port is masked off afterwards.
> 
> This is a different issue.  libata EH plugs the port if reset fails
> repeatedly.  This behavior was implemented to avoid causing continuous
> resets on a port in case it has flaky PHY state reporting; however, it
> seems to cause more trouble than fixing issues - ie. plugging in a
> broken device may end up plugging the port even after the offending
> device is removed until manual rescan or reboot.  I've been pondering
> about changing the behavior like the following.
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index dfb6e9d..05797fe 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2885,8 +2885,17 @@ int ata_eh_reset(struct ata_link *link, int classify,
>  	    sata_scr_read(link, SCR_STATUS, &sstatus))
>  		rc = -ERESTART;
>  
> -	if (rc == -ERESTART || try >= max_tries)
> +	if (rc == -ERESTART || try >= max_tries) {
> +		/*
> +		 * Thaw host port even if reset failed, so that the port
> +		 * can be retried on the next phy event.  This risks
> +		 * repeated EH runs but seems to be a better tradeoff than
> +		 * shutting down a port after a botched hotplug attempt.
> +		 */
> +		if (ata_is_host_link(link))
> +			ata_eh_thaw_port(ap);
>  		goto out;
> +	}
>  
>  	now = jiffies;
>  	if (time_before(now, deadline)) {


Tejun, did this ever go upstream and to -stable ??
I'm asking because I see the same issue with other SATA controllers,
in particular with sata_sil boards.  Hot plug generally works _once_
per port, and then stops working.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - June 10, 2011, 2:01 p.m.
Hello,

(cc'ing Jeff)

On Fri, Jun 10, 2011 at 08:28:51AM -0400, Mark Lord wrote:
> Tejun, did this ever go upstream and to -stable ??
> I'm asking because I see the same issue with other SATA controllers,
> in particular with sata_sil boards.  Hot plug generally works _once_
> per port, and then stops working.

The patch is still pending,

 http://article.gmane.org/gmane.linux.ide/49613

I don't think it would be wise to include the patch for -stable tho.
It's not a regression fix and has some possibility of introducing
regressions, so....

Thanks.
Mark Lord - June 10, 2011, 5:32 p.m.
On 11-06-10 10:01 AM, Tejun Heo wrote:
> Hello,
> 
> (cc'ing Jeff)
> 
> On Fri, Jun 10, 2011 at 08:28:51AM -0400, Mark Lord wrote:
>> Tejun, did this ever go upstream and to -stable ??
>> I'm asking because I see the same issue with other SATA controllers,
>> in particular with sata_sil boards.  Hot plug generally works _once_
>> per port, and then stops working.
> 
> The patch is still pending,
> 
>  http://article.gmane.org/gmane.linux.ide/49613
> 
> I don't think it would be wise to include the patch for -stable tho.
> It's not a regression fix and has some possibility of introducing
> regressions, so....

Not a regression fix?  Excuse me?
SATA hotplug always used to work fine,
and is more or less essential to reboot-free operations.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Garzik - June 10, 2011, 5:39 p.m.
On 06/10/2011 01:32 PM, Mark Lord wrote:
> On 11-06-10 10:01 AM, Tejun Heo wrote:
>> Hello,
>>
>> (cc'ing Jeff)
>>
>> On Fri, Jun 10, 2011 at 08:28:51AM -0400, Mark Lord wrote:
>>> Tejun, did this ever go upstream and to -stable ??
>>> I'm asking because I see the same issue with other SATA controllers,
>>> in particular with sata_sil boards.  Hot plug generally works _once_
>>> per port, and then stops working.
>>
>> The patch is still pending,
>>
>>   http://article.gmane.org/gmane.linux.ide/49613
>>
>> I don't think it would be wise to include the patch for -stable tho.
>> It's not a regression fix and has some possibility of introducing
>> regressions, so....
>
> Not a regression fix?  Excuse me?
> SATA hotplug always used to work fine,
> and is more or less essential to reboot-free operations.

I can move it from #upstream (linux-next) to #upstream-fixes, if we feel 
it is sufficient urgent.  It seemed like it needed more testing, but 
maybe the best way to do that is #upstream-fixes?

Was mainly a -rc question, to me.

	Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - June 10, 2011, 8:49 p.m.
On 11-06-10 01:39 PM, Jeff Garzik wrote:
> On 06/10/2011 01:32 PM, Mark Lord wrote:
>> On 11-06-10 10:01 AM, Tejun Heo wrote:
>>> Hello,
>>>
>>> (cc'ing Jeff)
>>>
>>> On Fri, Jun 10, 2011 at 08:28:51AM -0400, Mark Lord wrote:
>>>> Tejun, did this ever go upstream and to -stable ??
>>>> I'm asking because I see the same issue with other SATA controllers,
>>>> in particular with sata_sil boards.  Hot plug generally works _once_
>>>> per port, and then stops working.
>>>
>>> The patch is still pending,
>>>
>>>   http://article.gmane.org/gmane.linux.ide/49613
>>>
>>> I don't think it would be wise to include the patch for -stable tho.
>>> It's not a regression fix and has some possibility of introducing
>>> regressions, so....
>>
>> Not a regression fix?  Excuse me?
>> SATA hotplug always used to work fine,
>> and is more or less essential to reboot-free operations.
> 
> I can move it from #upstream (linux-next) to #upstream-fixes, if we feel it is
> sufficient urgent.  It seemed like it needed more testing, but maybe the best
> way to do that is #upstream-fixes?
> 
> Was mainly a -rc question, to me.

Well, it's not getting any testing sitting outside of the kernel right now.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Garzik - June 10, 2011, 9:20 p.m.
On 06/10/2011 04:49 PM, Mark Lord wrote:
> Well, it's not getting any testing sitting outside of the kernel right now.

It gets whatever testing linux-next gets...

	Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - June 11, 2011, 12:19 p.m.
On Fri, Jun 10, 2011 at 01:32:22PM -0400, Mark Lord wrote:
> > I don't think it would be wise to include the patch for -stable tho.
> > It's not a regression fix and has some possibility of introducing
> > regressions, so....
> 
> Not a regression fix?  Excuse me?
> SATA hotplug always used to work fine,
> and is more or less essential to reboot-free operations.

There are two separate patches.  One is a fix which got pushed to
Linus by Jeff early yesterday.  The other changes behavior which has
been like that since the beginning of EH.  The former is marked
properly w/ -stable.  The latter shouldn't be in -stable.

Thanks.
Mark Lord - June 12, 2011, 5:10 p.m.
On 11-06-11 08:19 AM, Tejun Heo wrote:
> On Fri, Jun 10, 2011 at 01:32:22PM -0400, Mark Lord wrote:
>>> I don't think it would be wise to include the patch for -stable tho.
>>> It's not a regression fix and has some possibility of introducing
>>> regressions, so....
>>
>> Not a regression fix?  Excuse me?
>> SATA hotplug always used to work fine,
>> and is more or less essential to reboot-free operations.
> 
> There are two separate patches.  One is a fix which got pushed to
> Linus by Jeff early yesterday.  The other changes behavior which has
> been like that since the beginning of EH.  The former is marked
> properly w/ -stable.  The latter shouldn't be in -stable.

Peachy.  Got links for each of those?
I may try them with the sata_sil board that was having issues earlier.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - June 13, 2011, 10:48 a.m.
Hello,

On Sun, Jun 12, 2011 at 01:10:35PM -0400, Mark Lord wrote:
> On 11-06-11 08:19 AM, Tejun Heo wrote:
> > On Fri, Jun 10, 2011 at 01:32:22PM -0400, Mark Lord wrote:
> >>> I don't think it would be wise to include the patch for -stable tho.
> >>> It's not a regression fix and has some possibility of introducing
> >>> regressions, so....
> >>
> >> Not a regression fix?  Excuse me?
> >> SATA hotplug always used to work fine,
> >> and is more or less essential to reboot-free operations.
> > 
> > There are two separate patches.  One is a fix which got pushed to
> > Linus by Jeff early yesterday.  The other changes behavior which has
> > been like that since the beginning of EH.  The former is marked
> > properly w/ -stable.  The latter shouldn't be in -stable.
> 
> Peachy.  Got links for each of those?
> I may try them with the sata_sil board that was having issues earlier.

The fix is

 http://article.gmane.org/gmane.linux.ide/49615

and the behavior change is

 http://article.gmane.org/gmane.linux.ide/49613
Bruce Stenning - Aug. 29, 2011, 3:49 p.m.
Hi guys,

While the hotplug is now far more reliable, I have recently been able to coax
it into locking up SATA ports again while the device is heavily loaded.  This
looks very similar to the previous lockups, with the SATA port left frozen.
I have pasted some tracing below; at the end of the trace ata4 is no longer
responsive to hotplug events.  I have attempted to add further tracing to find
out where the freeze is occurring, but so far this has just prevented me from
triggering the condition.  I am currently trying to locate the problem by
inspecting the sources, but you may have some intuition that could shortcut
my efforts.

I should mention that we are now working on kernel 2.6.39.4 (and are choosing
not to go with 3.x to limit changes as we progress towards final testing.)

Kind Regards,

Bruce.


[...]
ata4.00: configured for UDMA/33
ata_eh_recover: EXIT, rc=0
ata4: EH complete
ata_scsi_error: EXIT
ata_scsiop_noop: ENTER
ata_scsiop_read_cap: ENTER
ata_scsiop_mode_sense: ENTER
ata_scsiop_mode_sense: ENTER
ata_scsiop_mode_sense: ENTER
__ata_port_freeze: ata4 port frozen
ata_scsi_timed_out: ENTER
ata_scsi_timed_out: EXIT, ret=0
ata_scsi_error: ENTER
ata_sff_flush_pio_task: ENTER
ata_eh_link_autopsy: ENTER
ata_eh_link_autopsy: EXIT
ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x190002 action 0xe frozen
ata4.00: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
ata4: SError: { RecovComm PHYRdyChg 10B8B Dispar }
ata4.00: failed command: FLUSH CACHE EXT
ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         res 50/00:00:30:5a:27/00:08:00:00:00/a0 Emask 0x10 (ATA bus error)
ata4.00: status: { DRDY }
ata_eh_recover: ENTER
__ata_port_freeze: ata4 port frozen
ata4: hard resetting link
sata_link_hardreset: ENTER
sata_link_hardreset: EXIT, rc=0
ata_sff_softreset: ENTER
ata_sff_softreset: about to softreset, devmask=1
ata_bus_softreset: ata4: bus reset via SRST
ata_dev_classify: found ATA device by sig
ata_sff_softreset: EXIT, classes[0]=1 [1]=0
ata_eh_thaw_port: ata4 port thawed
ata_std_postreset: ENTER
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata_std_postreset: EXIT
ata_eh_revalidate_and_attach: ENTER
ata_sff_flush_pio_task: ENTER
ata_sff_flush_pio_task: ENTER
ata_dump_id: 49==0x2f00  53==0x0007  63==0x0007  64==0x0003  75==0x001f
ata_dump_id: 80==0x01f0  81==0x0029  82==0x346b  83==0x7d61  84==0x4133
ata_dump_id: 88==0x047f  93==0x0000
ata_dev_set_xfermode: set features - xfer mode
ata_sff_flush_pio_task: ENTER
ata_dev_set_xfermode: EXIT, err_mask=0
ata_sff_flush_pio_task: ENTER
ata_sff_flush_pio_task: ENTER
ata_dump_id: 49==0x2f00  53==0x0007  63==0x0007  64==0x0003  75==0x001f
ata_dump_id: 80==0x01f0  81==0x0029  82==0x346b  83==0x7d61  84==0x4133
ata_dump_id: 88==0x047f  93==0x0000
ata_dev_set_mode: xfer_shift=12, xfer_mode=0x42
ata4.00: configured for UDMA/33
ata4.00: retrying FLUSH 0xea Emask 0x10
ata_sff_flush_pio_task: ENTER
ata_eh_recover: EXIT, rc=0
ata4: EH complete
ata_scsi_error: EXIT
__ata_port_freeze: ata4 port frozen
ata_port_schedule_eh: port EH scheduled
ata_scsiop_noop: ENTER
ata_scsiop_read_cap: ENTER
ata_scsiop_mode_sense: ENTER
ata_scsiop_mode_sense: ENTER
ata_scsiop_mode_sense: ENTER

Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - Sept. 2, 2011, 1:13 a.m.
Hello,

On Mon, Aug 29, 2011 at 04:49:29PM +0100, Bruce Stenning wrote:
> I should mention that we are now working on kernel 2.6.39.4 (and are choosing
> not to go with 3.x to limit changes as we progress towards final testing.)

I don't think it would make any difference.  The core EH hasn't
changed for some time now.

> [...]
> ata4: EH complete
> ata_scsi_error: EXIT
> __ata_port_freeze: ata4 port frozen
> ata_port_schedule_eh: port EH scheduled
> ata_scsiop_noop: ENTER
> ata_scsiop_read_cap: ENTER
> ata_scsiop_mode_sense: ENTER
> ata_scsiop_mode_sense: ENTER
> ata_scsiop_mode_sense: ENTER

That's weird.  Another EH is scheduled after one completes but EH
doesn't kick in.  Hmmm... can't see how that can happen.  How easily
can you reproduce the problem?

Thanks.
Bruce Stenning - Sept. 2, 2011, 4:22 p.m.
> That's weird.  Another EH is scheduled after one completes but EH
> doesn't kick in.  Hmmm... can't see how that can happen.  How easily
> can you reproduce the problem?
>
> Thanks.

Hi Tejun,

I hope you are well :-)

I have been trying to determine if the problem is that it is failing to
schedule the EH, or whether it is correctly scheduling it but for some
reason it is getting blocked or lost.

Unfortunately it has so far been quite difficult to reproduce when specifically
attempting to.  In normal use cases I reproduced it twice by unplugging a drive
from a RAID array with redundancy intact.  This was out of around a dozen
cycles of waiting until redundancy was restored while the unit was under load,
popping the disk, reinserting, and triggering a RAID rebuild.

I have only twice managed to trigger a lockup deliberately.  In both cases the
tracing showed a scheduled EH which was subsequently not enacted.

How long could it take for the EH to be enacted?  In the lockups that I
have reproduced it did not seem to have recovered minutes later, but perhaps
if I had waited longer...?  I have noticed that error recovery sometimes backs
off for 8s and even 33s, but it always warns when that sort of delay is coming
up.

I shall continue to try to track down why the scheduled EH does not happen.

Kind Regards,

Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - Sept. 6, 2011, 3:45 a.m.
Hello,

On Fri, Sep 02, 2011 at 05:22:38PM +0100, Bruce Stenning wrote:
> Unfortunately it has so far been quite difficult to reproduce when specifically
> attempting to.  In normal use cases I reproduced it twice by unplugging a drive
> from a RAID array with redundancy intact.  This was out of around a dozen
> cycles of waiting until redundancy was restored while the unit was under load,
> popping the disk, reinserting, and triggering a RAID rebuild.

Hmm... that's unfortunate.

> I have only twice managed to trigger a lockup deliberately.  In both cases the
> tracing showed a scheduled EH which was subsequently not enacted.
> 
> How long could it take for the EH to be enacted?  In the lockups that I
> have reproduced it did not seem to have recovered minutes later, but perhaps
> if I had waited longer...?  I have noticed that error recovery sometimes backs
> off for 8s and even 33s, but it always warns when that sort of delay is coming
> up.

It should happen pretty quickly.  In such cases, fastdrain is
activated and all pending commands are aborted if they complete in 3
secs and then EH should kick in.  The backoff is from reset path only
to give breathing time for devices which take long time to spin up and
doesn't apply in this case.

> I shall continue to try to track down why the scheduled EH does not happen.

Can you please add some debug printk's to scsi_schedule_eh() and see
whether scsi_eh_wakeup() is invoked from there?  It seems likely that
the problem is caused by race conditions around
SHOST_[CANCEL_]RECOVERY flags.

Thanks.
Bruce Stenning - Sept. 6, 2011, 12:19 p.m.
> Can you please add some debug printk's to scsi_schedule_eh() and see
> whether scsi_eh_wakeup() is invoked from there?  It seems likely that
> the problem is caused by race conditions around
> SHOST_[CANCEL_]RECOVERY flags.

I did manage to reproduce the lockup again yesterday with a slightly
different mix of tracing, including adding tracing to scsi_eh_wakeup()
and scsi_schedule_eh().  It looks like the EH is being scheduled, but
the EH thread goes immediately back to sleep and doesn't wake up:

ata4: EH complete
Waking error handler thread
scsi_eh_wakeup: succeeded
scsi_schedule_eh: succeeded
scsi_restart_operations: waking up host to restart
Error handler scsi_eh_3 sleeping

Is it attempting to wake the scsi_eh_3 thread while scsi_error_handler
is still processing an EH, which then calls scsi_restart_operations and
puts the scsi_eh_3 thread back to sleep again?

Some while after the lockup, there was some tracing relating to SCSI
operations timing out, but the port was still unresponsive.  The unit
is not entirely stable in this state, and our application software was
no longer able to strobe softdog, so the unit rebooted.  Enough was
running for the serial console to be responsive before the reboot,
however.

Thanks,

Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - Sept. 6, 2011, 12:26 p.m.
I wrote:
> ata4: EH complete
> Waking error handler thread
> scsi_eh_wakeup: succeeded
> scsi_schedule_eh: succeeded
> scsi_restart_operations: waking up host to restart
> Error handler scsi_eh_3 sleeping

Sorry, I hit 'send' before thinking fully.  A short while before the above,
the following was logged:

Waking error handler thread
scsi_eh_wakeup: succeeded
Error handler scsi_eh_3 waking up

This suggests that scsi_error_handler was indeed running when the subsequent
wake-up was attempted.

Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - Sept. 6, 2011, 2:40 p.m.
I wrote:
> Is it attempting to wake the scsi_eh_3 thread while scsi_error_handler
> is still processing an EH, which then calls scsi_restart_operations and
> puts the scsi_eh_3 thread back to sleep again?

It does look to me as if there is a potential race between the scsi_eh thread
and the wake-up mechanism.

host_eh_scheduled is read here in scsi_error_handler:

        set_current_state(TASK_INTERRUPTIBLE);
        while (!kthread_should_stop()) {
                if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||

For scheduling an EH, scsi_schedule_eh takes the shost->host_lock, increments
shost->host_eh_scheduled, and then wakes the EH thread.  If this happens
between the scsi_eh thread reading host_eh_scheduled and sending itself to
sleep, then nothing will wake up the thread and host_eh_scheduled will not
be inspected again.

Is there some mechanism that I've missed to prevent this from happening?


Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - Sept. 6, 2011, 3:22 p.m.
> Is there some mechanism that I've missed to prevent this from happening?

Okay, no, I guess the thread will naturally be rescheduled at some point,
at which point the shost->host_eh_scheduled will be re-inspected.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index dfb6e9d..05797fe 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2885,8 +2885,17 @@  int ata_eh_reset(struct ata_link *link, int classify,
 	    sata_scr_read(link, SCR_STATUS, &sstatus))
 		rc = -ERESTART;
 
-	if (rc == -ERESTART || try >= max_tries)
+	if (rc == -ERESTART || try >= max_tries) {
+		/*
+		 * Thaw host port even if reset failed, so that the port
+		 * can be retried on the next phy event.  This risks
+		 * repeated EH runs but seems to be a better tradeoff than
+		 * shutting down a port after a botched hotplug attempt.
+		 */
+		if (ata_is_host_link(link))
+			ata_eh_thaw_port(ap);
 		goto out;
+	}
 
 	now = jiffies;
 	if (time_before(now, deadline)) {