Patchwork sata_mv port lockup on hotplug (kernel 2.6.38.2)

login
register
mail settings
Submitter Tejun Heo
Date April 26, 2011, 1:50 p.m.
Message ID <20110426135027.GI878@htj.dyndns.org>
Download mbox | patch
Permalink /patch/92929/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Tejun Heo - April 26, 2011, 1:50 p.m.
On Tue, Apr 26, 2011 at 02:13:22PM +0100, Bruce Stenning wrote:
> Tejun wrote:
> > ata4 is reporting link down and the device is detached and port
> > disabled afterwards.  Doesn't seem to have much to do with
> > freezing/thawing?
> 
> After all processing of a normal unplug event has completed, the port is left
> thawed (and the irq enabled) to allow future hotplug events on this port to be
> actioned.  In the case of the port lockups that I have seen the port appears
> to be left frozen, with the irq disabled, and I am hypothesising that this is
> due to a race between the freeze and the thaw.
> 
> Below is some additional tracing leading up to a port lockup, with some more
> DPRINTK statements added.  The port freeze appears to be happening after the
> port thaw.  The irq mask writes that are output by mv_set_main_irq_mask seem
> to show the irqs left masked off.

Does the following patch resolve the problem?

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bruce Stenning - April 26, 2011, 3:41 p.m.
Tejun wrote:
> Does the following patch resolve the problem?
>
> Thanks.

I have applied that patch, keeping Mark's mv_set_main_irq_mask spinlock patch
in place.  I would not like to commit absolutely, for obvious reasons, but I
have abused the port considerably and have not been able to coax it into
locking up.


Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - April 26, 2011, 3:52 p.m.
Hello,

On Tue, Apr 26, 2011 at 04:41:25PM +0100, Bruce Stenning wrote:
> Tejun wrote:
> > Does the following patch resolve the problem?
> 
> I have applied that patch, keeping Mark's mv_set_main_irq_mask spinlock patch
> in place.  I would not like to commit absolutely, for obvious reasons, but I
> have abused the port considerably and have not been able to coax it into
> locking up.

Yeah, it makes sense.  Hmm... it seems I wasn't thinking straight when
I added that work around.  Not sure how to fix it properly at this
moment.  I'll think about it.  Can you please keep me posted if you
find something while testing?

Thanks.
Bruce Stenning - April 26, 2011, 4:05 p.m.
> Yeah, it makes sense.  Hmm... it seems I wasn't thinking straight when
> I added that work around.  Not sure how to fix it properly at this
> moment.  I'll think about it.  Can you please keep me posted if you
> find something while testing?

I'm away for two and a bit weeks so I'm not sure what progress (if any)
I will make during that time.  But yes, I shall certainly keep you posted
as soon as I find anything else.  Thank you very much for your inputs.

Cheers,

Bruce.


Latest News at: http://www.indigovision.com/index.php/en/news.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - April 26, 2011, 6:37 p.m.
On 11-04-26 11:41 AM, Bruce Stenning wrote:
> Tejun wrote:
>> Does the following patch resolve the problem?
>>
>> Thanks.
> 
> I have applied that patch, keeping Mark's mv_set_main_irq_mask spinlock patch
> in place.  I would not like to commit absolutely, for obvious reasons, but I
> have abused the port considerably and have not been able to coax it into
> locking up.

I'm thinking perhaps I should dust off that spinlock patch
and send something more proper like that upstream.

With that mask register shared among the ports, it is really
hard to keep track of when we're locked and when not.
So having a lock just for the shared part of the chip
has got to be a less error-prone way to do it.

I'll pull down a recent -git to patch against first.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 88cd22f..73f0589 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2802,19 +2802,6 @@  int ata_eh_reset(struct ata_link *link, int classify,
 	}
 
 	/*
-	 * Some controllers can't be frozen very well and may set
-	 * spuruious error conditions during reset.  Clear accumulated
-	 * error information.  As reset is the final recovery action,
-	 * nothing is lost by doing this.
-	 */
-	spin_lock_irqsave(link->ap->lock, flags);
-	memset(&link->eh_info, 0, sizeof(link->eh_info));
-	if (slave)
-		memset(&slave->eh_info, 0, sizeof(link->eh_info));
-	ap->pflags &= ~ATA_PFLAG_EH_PENDING;
-	spin_unlock_irqrestore(link->ap->lock, flags);
-
-	/*
 	 * Make sure onlineness and classification result correspond.
 	 * Hotplug could have happened during reset and some
 	 * controllers fail to wait while a drive is spinning up after