diff mbox

[git,patches] libata fixes

Message ID 20100323134226.GA14752@havoc.gtf.org
State Not Applicable
Delegated to: David Miller
Headers show

Commit Message

Jeff Garzik March 23, 2010, 1:42 p.m. UTC
This push includes a fix for the ata_piix timeouts people have been
seeing...  it has seen several successful test reports, and is ready for
wider testing in 2.6.34-rc.

Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream-linus

to receive the following updates:

 drivers/ata/libata-sff.c |   43 ++++++++++++++++++++++++++++++++++++-------
 drivers/ata/pata_via.c   |    1 +
 2 files changed, 37 insertions(+), 7 deletions(-)

JosephChan@via.com.tw (1):
      pata_via: Add VIA VX900 support

Tejun Heo (1):
      libata-sff: fix spurious IRQ handling

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Pavel Roskin March 23, 2010, 6:06 p.m. UTC | #1
On Tue, 2010-03-23 at 09:42 -0400, Jeff Garzik wrote:
> This push includes a fix for the ata_piix timeouts people have been
> seeing...  it has seen several successful test reports, and is ready for
> wider testing in 2.6.34-rc.

Tested-by: Pavel Roskin <proski@gnu.org>

My OCZ Solid was giving me timeouts.  I thought it was about to die, so
I bought an Intel X-25M urgently, copied my data, but the timeouts
continued.

After some googling I found that a patch for that was sent this very
morning!  After applying the patch, the timeouts have stopped.  They
would appear minutes after reboot, but it has been over an hour since I
rebooted and no timeouts.

My motherboard is ECS G31T-M, which is not AHCI capable, so I was using
ata_piix, which explains why I was affected.  My kernel is 2.6.34-rc2
(actually, wireless-testing.git).

By all means, please push this patch to all stable kernels, or more
money will be wasted on hard drives ;-)
Tejun Heo March 25, 2010, 4:21 p.m. UTC | #2
On 03/24/2010 03:06 AM, Pavel Roskin wrote:
> By all means, please push this patch to all stable kernels, or more
> money will be wasted on hard drives ;-)

Oh, the original patch which caused the problem never saw the light of
the day before 2.6.34-rc1 so there's no stable kernel to push the
patch into.

Thanks.
Pavel Roskin March 26, 2010, 8:17 p.m. UTC | #3
On Fri, 2010-03-26 at 01:21 +0900, Tejun Heo wrote:
> On 03/24/2010 03:06 AM, Pavel Roskin wrote:
> > By all means, please push this patch to all stable kernels, or more
> > money will be wasted on hard drives ;-)
> 
> Oh, the original patch which caused the problem never saw the light of
> the day before 2.6.34-rc1 so there's no stable kernel to push the
> patch into.

It looks like I'm still getting "clearing spurious IRQ".  In most cases,
it's not followed by anything, but in one case, there was a lost
interrupt followed by a link reset.

[ 5030.881191] ata2: clearing spurious IRQ
[ 5060.950036] ata2: lost interrupt (Status 0x50)
[ 5060.950055] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
frozen
[ 5060.950059] ata2.01: failed command: WRITE DMA
[ 5060.950066] ata2.01: cmd ca/00:08:b7:11:a9/00:00:00:00:00/f0 tag 0
dma 4096 out
[ 5060.950068]          res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask
0x4 (timeout)
[ 5060.950071] ata2.01: status: { DRDY }
[ 5060.950080] ata2: soft resetting link
[ 5061.160268] ata2.01: configured for UDMA/133
[ 5061.160275] ata2.01: device reported invalid CHS sector 0
[ 5061.160283] ata2: EH complete
[80540.010652] ata2: clearing spurious IRQ

That last line is apparently a separate event.
Pavel Roskin March 26, 2010, 9:59 p.m. UTC | #4
On Fri, 2010-03-26 at 16:17 -0400, Pavel Roskin wrote:

> It looks like I'm still getting "clearing spurious IRQ".

Sorry, my mistake.  I reverted the patch by accident.  Please ignore.
diff mbox

Patch

diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 561dec2..2774772 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -1667,6 +1667,7 @@  unsigned int ata_sff_host_intr(struct ata_port *ap,
 {
 	struct ata_eh_info *ehi = &ap->link.eh_info;
 	u8 status, host_stat = 0;
+	bool bmdma_stopped = false;
 
 	VPRINTK("ata%u: protocol %d task_state %d\n",
 		ap->print_id, qc->tf.protocol, ap->hsm_task_state);
@@ -1699,6 +1700,7 @@  unsigned int ata_sff_host_intr(struct ata_port *ap,
 
 			/* before we do anything else, clear DMA-Start bit */
 			ap->ops->bmdma_stop(qc);
+			bmdma_stopped = true;
 
 			if (unlikely(host_stat & ATA_DMA_ERR)) {
 				/* error when transfering data to/from memory */
@@ -1716,8 +1718,14 @@  unsigned int ata_sff_host_intr(struct ata_port *ap,
 
 	/* check main status, clearing INTRQ if needed */
 	status = ata_sff_irq_status(ap);
-	if (status & ATA_BUSY)
-		goto idle_irq;
+	if (status & ATA_BUSY) {
+		if (bmdma_stopped) {
+			/* BMDMA engine is already stopped, we're screwed */
+			qc->err_mask |= AC_ERR_HSM;
+			ap->hsm_task_state = HSM_ST_ERR;
+		} else
+			goto idle_irq;
+	}
 
 	/* ack bmdma irq events */
 	ap->ops->sff_irq_clear(ap);
@@ -1762,13 +1770,16 @@  EXPORT_SYMBOL_GPL(ata_sff_host_intr);
 irqreturn_t ata_sff_interrupt(int irq, void *dev_instance)
 {
 	struct ata_host *host = dev_instance;
+	bool retried = false;
 	unsigned int i;
-	unsigned int handled = 0, polling = 0;
+	unsigned int handled, idle, polling;
 	unsigned long flags;
 
 	/* TODO: make _irqsave conditional on x86 PCI IDE legacy mode */
 	spin_lock_irqsave(&host->lock, flags);
 
+retry:
+	handled = idle = polling = 0;
 	for (i = 0; i < host->n_ports; i++) {
 		struct ata_port *ap = host->ports[i];
 		struct ata_queued_cmd *qc;
@@ -1782,7 +1793,8 @@  irqreturn_t ata_sff_interrupt(int irq, void *dev_instance)
 				handled |= ata_sff_host_intr(ap, qc);
 			else
 				polling |= 1 << i;
-		}
+		} else
+			idle |= 1 << i;
 	}
 
 	/*
@@ -1790,7 +1802,9 @@  irqreturn_t ata_sff_interrupt(int irq, void *dev_instance)
 	 * asserting IRQ line, nobody cared will ensue.  Check IRQ
 	 * pending status if available and clear spurious IRQ.
 	 */
-	if (!handled) {
+	if (!handled && !retried) {
+		bool retry = false;
+
 		for (i = 0; i < host->n_ports; i++) {
 			struct ata_port *ap = host->ports[i];
 
@@ -1805,8 +1819,23 @@  irqreturn_t ata_sff_interrupt(int irq, void *dev_instance)
 				ata_port_printk(ap, KERN_INFO,
 						"clearing spurious IRQ\n");
 
-			ap->ops->sff_check_status(ap);
-			ap->ops->sff_irq_clear(ap);
+			if (idle & (1 << i)) {
+				ap->ops->sff_check_status(ap);
+				ap->ops->sff_irq_clear(ap);
+			} else {
+				/* clear INTRQ and check if BUSY cleared */
+				if (!(ap->ops->sff_check_status(ap) & ATA_BUSY))
+					retry |= true;
+				/*
+				 * With command in flight, we can't do
+				 * sff_irq_clear() w/o racing with completion.
+				 */
+			}
+		}
+
+		if (retry) {
+			retried = true;
+			goto retry;
 		}
 	}
 
diff --git a/drivers/ata/pata_via.c b/drivers/ata/pata_via.c
index 3059ec0..95d39c3 100644
--- a/drivers/ata/pata_via.c
+++ b/drivers/ata/pata_via.c
@@ -677,6 +677,7 @@  static const struct pci_device_id via[] = {
 	{ PCI_VDEVICE(VIA, 0x3164), },
 	{ PCI_VDEVICE(VIA, 0x5324), },
 	{ PCI_VDEVICE(VIA, 0xC409), VIA_IDFLAG_SINGLE },
+	{ PCI_VDEVICE(VIA, 0x9001), VIA_IDFLAG_SINGLE },
 
 	{ },
 };