Patchwork [git,patches] libata updates for 2.6.37

Submitter Tejun Heo
Date Nov. 30, 2010, 4:29 p.m.
Message ID <4CF52652.4030802@kernel.org>
Permalink /patch/73624/
State Not Applicable
Delegated to: David Miller

Comments

Tejun Heo - Nov. 30, 2010, 4:29 p.m.
On 11/30/2010 04:38 PM, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 03:13:58PM +0100, Tejun Heo wrote:
>>> Tejun, any ideas how I can debug this?
>>
>> Hmm... DIPM commands are failing with AC_ERR_OTHER.  Other than DIPM
>> not being configured and speed capped at 1.5Gbps, the machine works
>> fine afterwards, right?  Are you up for applying debug patches?
>>
> 
> Yup, it's chugging along happily, though I notice kernel builds are
> taking a minute or two longer (but I don't bother to time them so it's
> purely subjective.)
> 
> Be happy to apply any debug patches, but I plan on replacing the drive
> with a faster one at some point in the next few weeks.

Can you please apply the following patch and report the resulting
kernel log?  You're on ahci, right?

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kyle McMartin - Nov. 30, 2010, 4:31 p.m.
On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > Be happy to apply any debug patches, but I plan on replacing the drive
> > with a faster one at some point in the next few weeks.
> 
> Can you please apply the following patch and report the resulting
> kernel log?  You're on ahci, right?
> 

Yup, building now.

> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index ebc08d6..b1c39db 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -1560,6 +1560,10 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
>  	}
> 
>  	/* okay, let's hand over to EH */
> +	if (active_qc && ata_tag_internal(active_qc->tag))
> +		ata_dev_printk(active_qc->dev, KERN_WARNING,
> +			       "ahci: internal command failure, irq_stat=0x%x\n",
> +			       irq_stat);
> 
>  	if (irq_stat & PORT_IRQ_FREEZE)
>  		ata_port_freeze(ap);
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 7f77c67..7cf236b 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -1668,6 +1668,10 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
> 
>  	/* perform minimal error analysis */
>  	if (qc->flags & ATA_QCFLAG_FAILED) {
> +		ata_dev_printk(dev, KERN_WARNING,
> +			       "internal command failure: stat=0x%x ehi_desc=\"%s\"\n",
> +			       qc->tf.command, dev->link->eh_info.desc);
> +
>  		if (qc->result_tf.command & (ATA_ERR | ATA_DF))
>  			qc->err_mask |= AC_ERR_DEV;
> 
> 
Kyle McMartin - Nov. 30, 2010, 5:53 p.m.
On Tue, Nov 30, 2010 at 11:31:50AM -0500, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > > Be happy to apply any debug patches, but I plan on replacing the drive
> > > with a faster one at some point in the next few weeks.
> > 
> > Can you please apply the following patch and report the resulting
> > kernel log?  You're on ahci, right?
> > 
> 
> Yup, building now.
> 

Ok, running it, I haven't seen the issue across three suspends... Will
update when it inevitably happens.

--Kyle
Kyle McMartin - Nov. 30, 2010, 9:09 p.m.
On Tue, Nov 30, 2010 at 12:53:17PM -0500, Kyle McMartin wrote:
> On Tue, Nov 30, 2010 at 11:31:50AM -0500, Kyle McMartin wrote:
> > On Tue, Nov 30, 2010 at 05:29:06PM +0100, Tejun Heo wrote:
> > > > Be happy to apply any debug patches, but I plan on replacing the drive
> > > > with a faster one at some point in the next few weeks.
> > > 
> > > Can you please apply the following patch and report the resulting
> > > kernel log?  You're on ahci, right?
> > > 
> > 
> > Yup, building now.
> > 
> 
> Ok, running it, I haven't seen the issue across three suspends... Will
> update when it inevitably happens.
> 

[10272.544858] ata1.00: ahci: internal command failure, irq_stat=0x400001
[10272.544943] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed"
[10272.544951] ata1.00: failed to enable DIPM, Emask 0x100
[10272.544960] ata1: hard resetting link
[10273.004718] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[10273.006108] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[10273.006116] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[10273.006410] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[10273.006417] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[10273.009337] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[10273.009344] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[10273.009611] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[10273.009631] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[10273.010945] ata1.00: configured for UDMA/100
[10273.011206] ata1.00: ahci: internal command failure, irq_stat=0x400001
[10273.011241] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed"
[10273.011247] ata1.00: failed to enable DIPM, Emask 0x100
[10273.011250] ata1: disabling LPM on the link
[10273.011255] ata1: limiting SATA link speed to 1.5 Gbps
[10273.011259] ata1.00: limiting speed to UDMA/100:PIO3
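
For reference, the numeric fields in the log above can be decoded by hand. The following is a minimal sketch, not kernel code: the AC_ERR_* values are the completion error bits as they appear in include/linux/libata.h to the best of my knowledge, the SStatus field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) follows the SATA SStatus register definition, and it assumes the log prints SStatus in hex without a 0x prefix. The helper names decode_emask() and sstatus_*() are made up for illustration.

```c
#include <stdio.h>

/* libata completion error bits (assumed to match include/linux/libata.h). */
enum {
	AC_ERR_DEV        = 1 << 0,	/* device reported error */
	AC_ERR_HSM        = 1 << 1,	/* host state machine violation */
	AC_ERR_TIMEOUT    = 1 << 2,	/* timeout */
	AC_ERR_MEDIA      = 1 << 3,	/* media error */
	AC_ERR_ATA_BUS    = 1 << 4,	/* ATA bus error */
	AC_ERR_HOST_BUS   = 1 << 5,	/* host bus error */
	AC_ERR_SYSTEM     = 1 << 6,	/* system error */
	AC_ERR_INVALID    = 1 << 7,	/* invalid argument */
	AC_ERR_OTHER      = 1 << 8,	/* unknown */
	AC_ERR_NODEV_HINT = 1 << 9,	/* polling device detection hint */
	AC_ERR_NCQ        = 1 << 10,	/* marker for offending NCQ qc */
};

struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name ac_err_names[] = {
	{ AC_ERR_DEV, "AC_ERR_DEV" },
	{ AC_ERR_HSM, "AC_ERR_HSM" },
	{ AC_ERR_TIMEOUT, "AC_ERR_TIMEOUT" },
	{ AC_ERR_MEDIA, "AC_ERR_MEDIA" },
	{ AC_ERR_ATA_BUS, "AC_ERR_ATA_BUS" },
	{ AC_ERR_HOST_BUS, "AC_ERR_HOST_BUS" },
	{ AC_ERR_SYSTEM, "AC_ERR_SYSTEM" },
	{ AC_ERR_INVALID, "AC_ERR_INVALID" },
	{ AC_ERR_OTHER, "AC_ERR_OTHER" },
	{ AC_ERR_NODEV_HINT, "AC_ERR_NODEV_HINT" },
	{ AC_ERR_NCQ, "AC_ERR_NCQ" },
};

/* Hypothetical helper: write the names of all set bits into buf, '|'-separated. */
static void decode_emask(unsigned int emask, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(ac_err_names) / sizeof(ac_err_names[0]); i++) {
		int w;

		if (!(emask & ac_err_names[i].bit))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", ac_err_names[i].name);
		if (w < 0 || (size_t)w >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)w;
	}
}

/* Hypothetical helpers: split an SStatus value into its three 4-bit fields. */
static unsigned int sstatus_det(unsigned int v) { return v & 0xf; }
static unsigned int sstatus_spd(unsigned int v) { return (v >> 4) & 0xf; }
static unsigned int sstatus_ipm(unsigned int v) { return (v >> 8) & 0xf; }
```

Called as decode_emask(0x100, buf, sizeof(buf)), this yields AC_ERR_OTHER, which matches the error class named earlier in the thread. Splitting SStatus 0x123 gives DET=3 (device present, PHY communication established), SPD=2 (Gen2, 3.0 Gbps) and IPM=1 (interface active), consistent with the "SATA link up 3.0 Gbps" line.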

regards, Kyle
Tejun Heo - Dec. 1, 2010, 11:17 a.m.
Hello,

On 11/30/2010 10:09 PM, Kyle McMartin wrote:
>> Ok, running it, I haven't seen the issue across three suspends... Will
>> update when it inevitably happens.
> 
> [10273.011241] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed"

Hmmm, so it's a timing problem.  For some reason, a PHY event is being
reported after reset is complete which unfortunately gets reported
while DIPM command is in progress.  Hmmm... what's broken is that the
PHY event happens after reset is complete.  Can you please attach the
output of "lspci -nn" and "hdparm -I /dev/sda"?  Looks a lot like a
timing issue.  Maybe we'll have to add a delay somewhere.  :-(
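
To make the diagnosis above easier to follow, here is a small sketch that names the bits set in irq_stat=0x400001. It is an illustration under assumptions, not driver code: the bit positions are the AHCI port interrupt status bits as named in drivers/ata/ahci.h to the best of my knowledge, and decode_port_irq() is a hypothetical helper. Note also that the debug patch prints qc->tf.command under the label "stat", so the 0xef in the log is the command opcode (SET FEATURES, as in the "ACPI cmd ef/..." lines) rather than a device status byte.

```c
#include <stdio.h>

struct flag_name {
	unsigned int bit;
	const char *name;
};

/* AHCI port interrupt status bits (assumed to match drivers/ata/ahci.h). */
static const struct flag_name port_irq_names[] = {
	{ 1u << 0,  "PORT_IRQ_D2H_REG_FIS" },	/* D2H Register FIS received */
	{ 1u << 1,  "PORT_IRQ_PIOS_FIS" },	/* PIO Setup FIS received */
	{ 1u << 2,  "PORT_IRQ_DMAS_FIS" },	/* DMA Setup FIS received */
	{ 1u << 3,  "PORT_IRQ_SDB_FIS" },	/* Set Device Bits FIS received */
	{ 1u << 4,  "PORT_IRQ_UNK_FIS" },	/* unknown FIS received */
	{ 1u << 5,  "PORT_IRQ_SG_DONE" },	/* descriptor processed */
	{ 1u << 6,  "PORT_IRQ_CONNECT" },	/* port connect change */
	{ 1u << 7,  "PORT_IRQ_DEV_ILCK" },	/* device interlock */
	{ 1u << 22, "PORT_IRQ_PHYRDY" },	/* PhyRdy changed */
	{ 1u << 23, "PORT_IRQ_BAD_PMP" },	/* incorrect port multiplier */
	{ 1u << 24, "PORT_IRQ_OVERFLOW" },	/* xfer exhausted available S/G */
	{ 1u << 26, "PORT_IRQ_IF_NONFATAL" },	/* interface non-fatal error */
	{ 1u << 27, "PORT_IRQ_IF_ERR" },	/* interface fatal error */
	{ 1u << 28, "PORT_IRQ_HBUS_DATA_ERR" },	/* host bus data error */
	{ 1u << 29, "PORT_IRQ_HBUS_ERR" },	/* host bus fatal error */
	{ 1u << 30, "PORT_IRQ_TF_ERR" },	/* task file error */
	{ 1u << 31, "PORT_IRQ_COLD_PRES" },	/* cold presence detect */
};

/* Hypothetical helper: write the names of all set bits into buf, '|'-separated. */
static void decode_port_irq(unsigned int irq_stat, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(port_irq_names) / sizeof(port_irq_names[0]); i++) {
		int w;

		if (!(irq_stat & port_irq_names[i].bit))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", port_irq_names[i].name);
		if (w < 0 || (size_t)w >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)w;
	}
}
```

For 0x400001 this produces PORT_IRQ_D2H_REG_FIS|PORT_IRQ_PHYRDY: a PhyRdy change (bit 22) reported together with a D2H Register FIS (bit 0), which agrees with the "PHY RDY changed" text in ehi_desc.
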
Kyle McMartin - Dec. 1, 2010, 12:44 p.m.
On Wed, Dec 01, 2010 at 12:17:52PM +0100, Tejun Heo wrote:
> > [10273.011241] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed"
> 
> Hmmm, so it's a timing problem.  For some reason, a PHY event is being
> reported after reset is complete which unfortunately gets reported
> while DIPM command is in progress.  Hmmm... what's broken is that the
> PHY event happens after reset is complete.  Can you please attach the
> output of "lspci -nn" and "hdparm -I /dev/sda"?  Looks a lot like a
> timing issue.  Maybe we'll have to add a delay somewhere.  :-(
> 
> 

Sure, np:

00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller [8086:3b2f] (rev 06) (prog-if 01 [AHCI 1.0])
	Subsystem: Lenovo Device [17aa:2168]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 42
	Region 0: I/O ports at 1860 [size=8]
	Region 1: I/O ports at 1814 [size=4]
	Region 2: I/O ports at 1818 [size=8]
	Region 3: I/O ports at 1810 [size=4]
	Region 4: I/O ports at 1840 [size=32]
	Region 5: Memory at f2727000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0100c  Data: 4161
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Capabilities: [b0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ahci

/dev/sda:

ATA device, with non-removable media
	Model Number:       HITACHI HTS725032A9A364                 
	Serial Number:      100407PCKC04VPHVB55K
	Firmware Revision:  PC3ZC70F
	Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6; Revision: ATA8-AST T13 Project D1697 Revision 0b
Standards:
	Used: unknown (minor revision code 0x0028) 
	Supported: 8 7 6 5 
	Likely used: 8
Configuration:
	Logical		max	current
	cylinders	16383	16383
	heads		16	16
	sectors/track	63	63
	--
	CHS current addressable sectors:   16514064
	LBA    user addressable sectors:  268435455
	LBA48  user addressable sectors:  625142448
	Logical  Sector size:                   512 bytes
	Physical Sector size:                   512 bytes
	device size with M = 1024*1024:      305245 MBytes
	device size with M = 1000*1000:      320072 MBytes (320 GB)
	cache/buffer size  = 15151 KBytes (type=DualPortCache)
	Form Factor: 2.5 inch
	Nominal Media Rotation Rate: 7200
Capabilities:
	LBA, IORDY(can be disabled)
	Queue depth: 32
	Standby timer values: spec'd by Vendor, no device specific minimum
	R/W multiple sector transfer: Max = 16	Current = ?
	Advanced power management level: 254
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4 
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	SMART feature set
	    	Security Mode feature set
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	Host Protected Area feature set
	   *	WRITE_BUFFER command
	   *	READ_BUFFER command
	   *	DOWNLOAD_MICROCODE
	   *	Advanced Power Management feature set
	    	SET_MAX security extension
	   *	48-bit Address feature set
	   *	Device Configuration Overlay feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	SMART error logging
	   *	SMART self-test
	   *	General Purpose Logging feature set
	   *	64-bit World wide name
	   *	IDLE_IMMEDIATE with UNLOAD
	   *	WRITE_UNCORRECTABLE_EXT command
	   *	{READ,WRITE}_DMA_EXT_GPL commands
	   *	Segmented DOWNLOAD_MICROCODE
	   *	Gen1 signaling speed (1.5Gb/s)
	   *	Gen2 signaling speed (3.0Gb/s)
	   *	Native Command Queueing (NCQ)
	   *	Host-initiated interface power management
	   *	Phy event counters
	   *	Idle-Unload when NCQ is active
	   *	NCQ priority information
	   *	DMA Setup Auto-Activate optimization
	    	Device-initiated interface power management
	   *	Software settings preservation
	   *	SMART Command Transport (SCT) feature set
	   *	SCT LBA Segment Access (AC2)
	   *	SCT Error Recovery Control (AC3)
	   *	SCT Features Control (AC4)
	   *	SCT Data Tables (AC5)
Security: 
	Master password revision code = 65534
		supported
	not	enabled
	not	locked
	not	frozen
	not	expired: security count
		supported: enhanced erase
	88min for SECURITY ERASE UNIT. 88min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000cca5b7da1312
	NAA		: 5
	IEEE OUI	: 000cca
	Unique ID	: 5b7da1312
Checksum: correct

regards, Kyle

Patch

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ebc08d6..b1c39db 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1560,6 +1560,10 @@  static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
 	}

 	/* okay, let's hand over to EH */
+	if (active_qc && ata_tag_internal(active_qc->tag))
+		ata_dev_printk(active_qc->dev, KERN_WARNING,
+			       "ahci: internal command failure, irq_stat=0x%x\n",
+			       irq_stat);

 	if (irq_stat & PORT_IRQ_FREEZE)
 		ata_port_freeze(ap);
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7f77c67..7cf236b 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1668,6 +1668,10 @@  unsigned ata_exec_internal_sg(struct ata_device *dev,

 	/* perform minimal error analysis */
 	if (qc->flags & ATA_QCFLAG_FAILED) {
+		ata_dev_printk(dev, KERN_WARNING,
+			       "internal command failure: stat=0x%x ehi_desc=\"%s\"\n",
+			       qc->tf.command, dev->link->eh_info.desc);
+
 		if (qc->result_tf.command & (ATA_ERR | ATA_DF))
 			qc->err_mask |= AC_ERR_DEV;