Patchwork Hotplug borked after suspend/resume in Linux-3.3 ?

login
register
mail settings
Submitter Lin Ming
Date April 19, 2012, 8:16 a.m.
Message ID <1334823362.11188.172.camel@minggr>
Download mbox | patch
Permalink /patch/153680/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Lin Ming - April 19, 2012, 8:16 a.m.
On Thu, Apr 19, 2012 at 2:32 AM, Martin Mokrejs <mmokrejs@fold.natur.cuni.cz> wrote:
> Martin Mokrejs wrote:
>>
>>
>> Jeff Garzik wrote:
>>> On 04/18/2012 01:10 PM, Martin Mokrejs wrote:
>>>> Fix: I got my 3TB disk detected by this single command:
>>>>
>>>> # echo on>   /sys/devices/pci0000:00/0000:00:1f.2/ata6/power/control
>>>> #
>>>>
>>>> This is a Dell Vostro 3550 with A09 BIOS. Same happend with 3.4-rc3 kernel.
>>>>
>>>> I can do some more testing if you want me to.
>>>> Best,
>>>> Martin
>>>
>>>
>>> Can you test this one-line patch from Lin Ming?  Hopefully there is zero sysfs twiddling required with this one...
>>>
>>> --- a/drivers/ata/libata-transport.c
>>> +++ b/drivers/ata/libata-transport.c
>>> @@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
>>>      device_enable_async_suspend(dev);
>>>      pm_runtime_set_active(dev);
>>>      pm_runtime_enable(dev);
>>> +    pm_runtime_forbid(dev);
>>>
>>>      transport_add_device(dev);
>>>      transport_configure_device(dev);
>
>
> There is one more minor issue. I cannot get my disk re-dectected at 3Gbps. Here is when I plugged it in
> for the very first time after bootup (plain 3.4-rc3 with the above one-line fix):

I did bisect and found that this is a really old regression introduced
in 2.6.37-rc1 with below commit.

commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
Author: Gwendal Grignou <gwendal@google.com>
Date:   Tue May 25 12:31:38 2010 -0700

    [libata] Add ATA transport class

    This is a scheleton for libata transport class.
    All information is read only, exporting information from libata:
    - ata_port class: one per ATA port
    - ata_link class: one per ATA port or 15 for SATA Port Multiplier
    - ata_device class: up to 2 for PATA link, usually one for SATA.

    Signed-off-by: Gwendal Grignou <gwendal@google.com>
    Reviewed-by: Grant Grundler <grundler@google.com>
    Signed-off-by: Jeff Garzik <jgarzik@redhat.com>


Here is the patch to fix it.

Gwendal and Grant,

Would you help to review it?


From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
From: Lin Ming <ming.m.lin@intel.com>
Date: Thu, 19 Apr 2012 15:45:51 +0800
Subject: [PATCH] libata: clear error mask of old error history

The old error history was cleared in ata_ering_clear().
It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
So ata_ering_map() still iterates the old error history.

This causes problem, for example, wrong probe trials count were returned in
ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
to 1.5Gbps.

Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 drivers/ata/libata-eh.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)
Martin Mokrejs - April 19, 2012, 5:28 p.m.
Could be the 3Gbps vs. 1.5Gbps link speed negotation be related to the
"Intel® 6 Series Express Chipset B2 stepping" chips issue? Applies to those
having enabled SATA ports 2-5. And this is my case:

[    3.037832] ahci 0000:00:1f.2: version 3.0
[    3.037880] ahci 0000:00:1f.2: irq 44 for MSI/MSI-X
[    3.037906] ahci: SSS flag set, parallel bus scan disabled
[    3.048233] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x31 impl SATA mode
[    3.048335] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pio slum part ems sxs apst 
[    3.048371] ahci 0000:00:1f.2: setting latency timer to 64
[    3.088902] scsi0 : ahci
[    3.089010] scsi1 : ahci
[    3.089098] scsi2 : ahci
[    3.089185] scsi3 : ahci
[    3.089272] scsi4 : ahci
[    3.089361] scsi5 : ahci
[    3.090528] ata1: SATA max UDMA/133 abar m2048@0xf7f06000 port 0xf7f06100 irq 44
[    3.091400] ata2: DUMMY
[    3.092262] ata3: DUMMY
[    3.093112] ata4: DUMMY
[    3.093952] ata5: SATA max UDMA/133 abar m2048@0xf7f06000 port 0xf7f06300 irq 44
[    3.094803] ata6: SATA max UDMA/133 abar m2048@0xf7f06000 port 0xf7f06380 irq 44

http://support.dell.com/support/topics/global.aspx/support/kcs/document?c=us&cs=04&docid=389728&l=en&s=bsd
http://www.intel.com/support/chipsets/6/sb/CS-032521.htm
http://www.intel.com/support/chipsets/6/sb/CS-032521.htm

Martin



Lin Ming wrote:
> On Thu, Apr 19, 2012 at 2:32 AM, Martin Mokrejs <mmokrejs@fold.natur.cuni.cz> wrote:
>> Martin Mokrejs wrote:
>>>
>>>
>>> Jeff Garzik wrote:
>>>> On 04/18/2012 01:10 PM, Martin Mokrejs wrote:
>>>>> Fix: I got my 3TB disk detected by this single command:
>>>>>
>>>>> # echo on>   /sys/devices/pci0000:00/0000:00:1f.2/ata6/power/control
>>>>> #
>>>>>
>>>>> This is a Dell Vostro 3550 with A09 BIOS. Same happend with 3.4-rc3 kernel.
>>>>>
>>>>> I can do some more testing if you want me to.
>>>>> Best,
>>>>> Martin
>>>>
>>>>
>>>> Can you test this one-line patch from Lin Ming?  Hopefully there is zero sysfs twiddling required with this one...
>>>>
>>>> --- a/drivers/ata/libata-transport.c
>>>> +++ b/drivers/ata/libata-transport.c
>>>> @@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
>>>>      device_enable_async_suspend(dev);
>>>>      pm_runtime_set_active(dev);
>>>>      pm_runtime_enable(dev);
>>>> +    pm_runtime_forbid(dev);
>>>>
>>>>      transport_add_device(dev);
>>>>      transport_configure_device(dev);
>>
>>
>> There is one more minor issue. I cannot get my disk re-dectected at 3Gbps. Here is when I plugged it in
>> for the very first time after bootup (plain 3.4-rc3 with the above one-line fix):
> 
> I did bisect and found that this is a really old regression introduced
> in 2.6.37-rc1 with below commit.
> 
> commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
> Author: Gwendal Grignou <gwendal@google.com>
> Date:   Tue May 25 12:31:38 2010 -0700
> 
>     [libata] Add ATA transport class
> 
>     This is a scheleton for libata transport class.
>     All information is read only, exporting information from libata:
>     - ata_port class: one per ATA port
>     - ata_link class: one per ATA port or 15 for SATA Port Multiplier
>     - ata_device class: up to 2 for PATA link, usually one for SATA.
> 
>     Signed-off-by: Gwendal Grignou <gwendal@google.com>
>     Reviewed-by: Grant Grundler <grundler@google.com>
>     Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
> 
> 
> Here is the patch to fix it.
> 
> Gwendal and Grant,
> 
> Would you help to review it?
> 
> 
>>From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
> From: Lin Ming <ming.m.lin@intel.com>
> Date: Thu, 19 Apr 2012 15:45:51 +0800
> Subject: [PATCH] libata: clear error mask of old error history
> 
> The old error history was cleared in ata_ering_clear().
> It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
> So ata_ering_map() still iterates the old error history.
> 
> This causes problem, for example, wrong probe trials count were returned in
> ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
> to 1.5Gbps.
> 
> Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
>  drivers/ata/libata-eh.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index c61316e..4c6f49b 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -419,9 +419,10 @@ int ata_ering_map(struct ata_ering *ering,
>  	return rc;
>  }
>  
> -int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
> +static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
>  {
>  	ent->eflags |= ATA_EFLAG_OLD_ER;
> +	ent->err_mask = 0;
>  	return 0;
>  }
>  
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin Mokrejs - April 19, 2012, 6:22 p.m.
Hi Lin,

Lin Ming wrote:
> On Thu, Apr 19, 2012 at 2:32 AM, Martin Mokrejs <mmokrejs@fold.natur.cuni.cz> wrote:
>> Martin Mokrejs wrote:
>>>
>>>
>>> Jeff Garzik wrote:
>>>> On 04/18/2012 01:10 PM, Martin Mokrejs wrote:
>>>>> Fix: I got my 3TB disk detected by this single command:
>>>>>
>>>>> # echo on>   /sys/devices/pci0000:00/0000:00:1f.2/ata6/power/control
>>>>> #
>>>>>
>>>>> This is a Dell Vostro 3550 with A09 BIOS. Same happend with 3.4-rc3 kernel.
>>>>>
>>>>> I can do some more testing if you want me to.
>>>>> Best,
>>>>> Martin
>>>>
>>>>
>>>> Can you test this one-line patch from Lin Ming?  Hopefully there is zero sysfs twiddling required with this one...
>>>>
>>>> --- a/drivers/ata/libata-transport.c
>>>> +++ b/drivers/ata/libata-transport.c
>>>> @@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
>>>>      device_enable_async_suspend(dev);
>>>>      pm_runtime_set_active(dev);
>>>>      pm_runtime_enable(dev);
>>>> +    pm_runtime_forbid(dev);
>>>>
>>>>      transport_add_device(dev);
>>>>      transport_configure_device(dev);
>>
>>
>> There is one more minor issue. I cannot get my disk re-dectected at 3Gbps. Here is when I plugged it in
>> for the very first time after bootup (plain 3.4-rc3 with the above one-line fix):
> 
> I did bisect and found that this is a really old regression introduced
> in 2.6.37-rc1 with below commit.
> 
> commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
> Author: Gwendal Grignou <gwendal@google.com>
> Date:   Tue May 25 12:31:38 2010 -0700
> 
>     [libata] Add ATA transport class
> 
>     This is a scheleton for libata transport class.
>     All information is read only, exporting information from libata:
>     - ata_port class: one per ATA port
>     - ata_link class: one per ATA port or 15 for SATA Port Multiplier
>     - ata_device class: up to 2 for PATA link, usually one for SATA.
> 
>     Signed-off-by: Gwendal Grignou <gwendal@google.com>
>     Reviewed-by: Grant Grundler <grundler@google.com>
>     Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
> 
> 
> Here is the patch to fix it.
> 
> Gwendal and Grant,
> 
> Would you help to review it?
> 
> 
>>From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
> From: Lin Ming <ming.m.lin@intel.com>
> Date: Thu, 19 Apr 2012 15:45:51 +0800
> Subject: [PATCH] libata: clear error mask of old error history
> 
> The old error history was cleared in ata_ering_clear().
> It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
> So ata_ering_map() still iterates the old error history.
> 
> This causes problem, for example, wrong probe trials count were returned in
> ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
> to 1.5Gbps.
> 
> Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
>  drivers/ata/libata-eh.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index c61316e..4c6f49b 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -419,9 +419,10 @@ int ata_ering_map(struct ata_ering *ering,
>  	return rc;
>  }
>  
> -int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
> +static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
>  {
>  	ent->eflags |= ATA_EFLAG_OLD_ER;
> +	ent->err_mask = 0;
>  	return 0;
>  }
>  

Confirming this patch fixed my problem. If you would like to improve something, add
some 1.5Gbp to 3.0Gbps re-negotiation. As you can see belo, I was desperate and plugged
in the cable just when kernel lowered the speed to 1.5Gbps. It seems it cannot re-negotiate
to go back up again. Tested 3.4-rc3 with some patches for pciehp from Yinghai, SATA
hotplug fix (lin Min), and for some reason have still reverted patch
486b10b9f43500741cd63a878d0ef23cd87fc66d (just for completeness).


I plugged in the cable.

[   75.818463] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
[   75.818472] ata6: irq_stat 0x00400040, connection status changed
[   75.818481] ata6: SError: { PHYRdyChg CommWake DevExch }
[   75.818500] ata6: hard resetting link
[   76.557738] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   76.558554] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
[   76.558564] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[   76.559365] ata6.00: configured for UDMA/133
[   76.577645] ata6: EH complete
[   76.577802] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
[   76.577926] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[   76.577929] sd 5:0:0:0: [sdc] 4096-byte physical blocks
[   76.577949] sd 5:0:0:0: Attached scsi generic sg3 type 0
[   76.577983] sd 5:0:0:0: [sdc] Write Protect is off
[   76.577986] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   76.578010] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   76.618312]  sdc: sdc1
[   76.618641] sd 5:0:0:0: [sdc] Attached SCSI disk

Unplugged the cable.

[   80.966608] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[   80.966617] ata6: irq_stat 0x00400040, connection status changed
[   80.966625] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
[   80.966643] ata6: hard resetting link
[   81.710063] ata6: SATA link down (SStatus 0 SControl 300)
[   84.575097] ata6: hard resetting link
[   84.915290] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   84.916778] ata6.00: configured for UDMA/133
[   84.935231] ata6: EH complete
[   90.042553] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
[   90.042562] ata6: irq_stat 0x00400040, connection status changed
[   90.042570] ata6: SError: { PHYRdyChg 10B8B DevExch }
[   90.042587] ata6: hard resetting link
[   90.786545] ata6: SATA link down (SStatus 0 SControl 300)
[   95.779093] ata6: hard resetting link
[   96.128574] ata6: SATA link down (SStatus 0 SControl 300)
[   96.128593] ata6: limiting SATA link speed to 1.5 Gbps

I plugged in the cable before the "SCSI" was disabled.

[   98.697321] ata6: hard resetting link
[   99.044242] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   99.045842] ata6.00: configured for UDMA/133
[   99.064156] ata6: EH complete
[  141.857958] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)

I unounted the disk and unplugged the cable.

[  150.645109] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
[  150.645119] ata6: irq_stat 0x00400040, connection status changed
[  150.645127] ata6: SError: { PHYRdyChg 10B8B DevExch }
[  150.645144] ata6: hard resetting link
[  151.386278] ata6: SATA link down (SStatus 0 SControl 310)
[  156.378946] ata6: hard resetting link
[  156.728320] ata6: SATA link down (SStatus 0 SControl 310)
[  161.720864] ata6: hard resetting link
[  162.070361] ata6: SATA link down (SStatus 0 SControl 310)
[  162.070380] ata6.00: disabled
[  162.090321] ata6: EH complete
[  162.090345] ata6.00: detaching (SCSI 5:0:0:0)
[  162.090801] sd 5:0:0:0: [sdc] Synchronizing SCSI cache
[  162.090834] sd 5:0:0:0: [sdc]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  162.090837] sd 5:0:0:0: [sdc] Stopping disk
[  162.090843] sd 5:0:0:0: [sdc] START_STOP FAILED
[  162.090844] sd 5:0:0:0: [sdc]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

I plugged in the cable.

[  169.586995] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
[  169.587005] ata6: irq_stat 0x00400040, connection status changed
[  169.587012] ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
[  169.587032] ata6: hard resetting link
[  170.328067] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  170.328913] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
[  170.328923] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[  170.329731] ata6.00: configured for UDMA/133
[  170.347972] ata6: EH complete
[  170.348131] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
[  170.348243] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[  170.348246] sd 5:0:0:0: [sdc] 4096-byte physical blocks
[  170.348272] sd 5:0:0:0: Attached scsi generic sg3 type 0
[  170.348300] sd 5:0:0:0: [sdc] Write Protect is off
[  170.348302] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[  170.348324] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  170.390492]  sdc: sdc1
[  170.390860] sd 5:0:0:0: [sdc] Attached SCSI disk

Unplugged the cable.

[  180.819680] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[  180.819689] ata6: irq_stat 0x00400040, connection status changed
[  180.819697] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
[  180.819714] ata6: hard resetting link
[  181.561332] ata6: SATA link down (SStatus 0 SControl 300)
[  186.553880] ata6: hard resetting link
[  186.903376] ata6: SATA link down (SStatus 0 SControl 300)
[  186.903395] ata6: limiting SATA link speed to 1.5 Gbps
[  191.896050] ata6: hard resetting link
[  192.245416] ata6: SATA link down (SStatus 0 SControl 310)
[  192.245434] ata6.00: disabled
[  192.265374] ata6: EH complete
[  192.265398] ata6.00: detaching (SCSI 5:0:0:0)
[  192.265878] sd 5:0:0:0: [sdc] Synchronizing SCSI cache
[  192.265906] sd 5:0:0:0: [sdc]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  192.265909] sd 5:0:0:0: [sdc] Stopping disk
[  192.265916] sd 5:0:0:0: [sdc] START_STOP FAILED
[  192.265917] sd 5:0:0:0: [sdc]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

I plugged in the cable.

[  194.436542] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
[  194.436551] ata6: irq_stat 0x00400040, connection status changed
[  194.436559] ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
[  194.436578] ata6: hard resetting link
[  195.181045] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  195.181795] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
[  195.181805] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[  195.182571] ata6.00: configured for UDMA/133
[  195.200953] ata6: EH complete
[  195.201110] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
[  195.201243] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[  195.201246] sd 5:0:0:0: [sdc] 4096-byte physical blocks
[  195.201270] sd 5:0:0:0: Attached scsi generic sg3 type 0
[  195.201314] sd 5:0:0:0: [sdc] Write Protect is off
[  195.201317] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[  195.201344] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  195.239595]  sdc: sdc1
[  195.240020] sd 5:0:0:0: [sdc] Attached SCSI disk



--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lin Ming - April 20, 2012, 1:46 a.m.
On Thu, 2012-04-19 at 20:22 +0200, Martin Mokrejs wrote:
> Hi Lin,
> 
> Lin Ming wrote:
> > On Thu, Apr 19, 2012 at 2:32 AM, Martin Mokrejs <mmokrejs@fold.natur.cuni.cz> wrote:
> >> Martin Mokrejs wrote:
> >>>
> >>>
> >>> Jeff Garzik wrote:
> >>>> On 04/18/2012 01:10 PM, Martin Mokrejs wrote:
> >>>>> Fix: I got my 3TB disk detected by this single command:
> >>>>>
> >>>>> # echo on>   /sys/devices/pci0000:00/0000:00:1f.2/ata6/power/control
> >>>>> #
> >>>>>
> >>>>> This is a Dell Vostro 3550 with A09 BIOS. Same happend with 3.4-rc3 kernel.
> >>>>>
> >>>>> I can do some more testing if you want me to.
> >>>>> Best,
> >>>>> Martin
> >>>>
> >>>>
> >>>> Can you test this one-line patch from Lin Ming?  Hopefully there is zero sysfs twiddling required with this one...
> >>>>
> >>>> --- a/drivers/ata/libata-transport.c
> >>>> +++ b/drivers/ata/libata-transport.c
> >>>> @@ -294,6 +294,7 @@ int ata_tport_add(struct device *parent,
> >>>>      device_enable_async_suspend(dev);
> >>>>      pm_runtime_set_active(dev);
> >>>>      pm_runtime_enable(dev);
> >>>> +    pm_runtime_forbid(dev);
> >>>>
> >>>>      transport_add_device(dev);
> >>>>      transport_configure_device(dev);
> >>
> >>
> >> There is one more minor issue. I cannot get my disk re-dectected at 3Gbps. Here is when I plugged it in
> >> for the very first time after bootup (plain 3.4-rc3 with the above one-line fix):
> > 
> > I did bisect and found that this is a really old regression introduced
> > in 2.6.37-rc1 with below commit.
> > 
> > commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
> > Author: Gwendal Grignou <gwendal@google.com>
> > Date:   Tue May 25 12:31:38 2010 -0700
> > 
> >     [libata] Add ATA transport class
> > 
> >     This is a scheleton for libata transport class.
> >     All information is read only, exporting information from libata:
> >     - ata_port class: one per ATA port
> >     - ata_link class: one per ATA port or 15 for SATA Port Multiplier
> >     - ata_device class: up to 2 for PATA link, usually one for SATA.
> > 
> >     Signed-off-by: Gwendal Grignou <gwendal@google.com>
> >     Reviewed-by: Grant Grundler <grundler@google.com>
> >     Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
> > 
> > 
> > Here is the patch to fix it.
> > 
> > Gwendal and Grant,
> > 
> > Would you help to review it?
> > 
> > 
> >>From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
> > From: Lin Ming <ming.m.lin@intel.com>
> > Date: Thu, 19 Apr 2012 15:45:51 +0800
> > Subject: [PATCH] libata: clear error mask of old error history
> > 
> > The old error history was cleared in ata_ering_clear().
> > It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
> > So ata_ering_map() still iterates the old error history.
> > 
> > This causes problem, for example, wrong probe trials count were returned in
> > ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
> > to 1.5Gbps.
> > 
> > Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
> > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > ---
> >  drivers/ata/libata-eh.c |    3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> > index c61316e..4c6f49b 100644
> > --- a/drivers/ata/libata-eh.c
> > +++ b/drivers/ata/libata-eh.c
> > @@ -419,9 +419,10 @@ int ata_ering_map(struct ata_ering *ering,
> >  	return rc;
> >  }
> >  
> > -int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
> > +static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
> >  {
> >  	ent->eflags |= ATA_EFLAG_OLD_ER;
> > +	ent->err_mask = 0;
> >  	return 0;
> >  }
> >  
> 
> Confirming this patch fixed my problem. If you would like to improve something, add
> some 1.5Gbp to 3.0Gbps re-negotiation. As you can see belo, I was desperate and plugged
> in the cable just when kernel lowered the speed to 1.5Gbps. It seems it cannot re-negotiate
> to go back up again. Tested 3.4-rc3 with some patches for pciehp from Yinghai, SATA
> hotplug fix (lin Min), and for some reason have still reverted patch
> 486b10b9f43500741cd63a878d0ef23cd87fc66d (just for completeness).
> 
> 
> I plugged in the cable.
> 
> [   75.818463] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
> [   75.818472] ata6: irq_stat 0x00400040, connection status changed
> [   75.818481] ata6: SError: { PHYRdyChg CommWake DevExch }
> [   75.818500] ata6: hard resetting link
> [   76.557738] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   76.558554] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
> [   76.558564] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
> [   76.559365] ata6.00: configured for UDMA/133
> [   76.577645] ata6: EH complete
> [   76.577802] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
> [   76.577926] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
> [   76.577929] sd 5:0:0:0: [sdc] 4096-byte physical blocks
> [   76.577949] sd 5:0:0:0: Attached scsi generic sg3 type 0
> [   76.577983] sd 5:0:0:0: [sdc] Write Protect is off
> [   76.577986] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [   76.578010] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [   76.618312]  sdc: sdc1
> [   76.618641] sd 5:0:0:0: [sdc] Attached SCSI disk
> 
> Unplugged the cable.
> 
> [   80.966608] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
> [   80.966617] ata6: irq_stat 0x00400040, connection status changed
> [   80.966625] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
> [   80.966643] ata6: hard resetting link
> [   81.710063] ata6: SATA link down (SStatus 0 SControl 300)
> [   84.575097] ata6: hard resetting link
> [   84.915290] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   84.916778] ata6.00: configured for UDMA/133
> [   84.935231] ata6: EH complete
> [   90.042553] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4090000 action 0xe frozen
> [   90.042562] ata6: irq_stat 0x00400040, connection status changed
> [   90.042570] ata6: SError: { PHYRdyChg 10B8B DevExch }
> [   90.042587] ata6: hard resetting link
> [   90.786545] ata6: SATA link down (SStatus 0 SControl 300)
> [   95.779093] ata6: hard resetting link
> [   96.128574] ata6: SATA link down (SStatus 0 SControl 300)
> [   96.128593] ata6: limiting SATA link speed to 1.5 Gbps
> 
> I plugged in the cable before the "SCSI" was disabled.
> 
> [   98.697321] ata6: hard resetting link
> [   99.044242] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

SATA link can't go back up to 3.0 Gbps.

I can reproduce this issue too.
Will try to fix it.

I have sent out patch "libata: clear error mask of old error history".
Thanks for your test.

Lin Ming

> [   99.045842] ata6.00: configured for UDMA/133
> [   99.064156] ata6: EH complete
> [  141.857958] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grant Grundler - April 20, 2012, 2:37 a.m.
On Thu, Apr 19, 2012 at 1:16 AM, Lin Ming <ming.m.lin@intel.com> wrote:
...
> I did bisect and found that this is a really old regression introduced
> in 2.6.37-rc1 with below commit.
>
> commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
> Author: Gwendal Grignou <gwendal@google.com>
> Date:   Tue May 25 12:31:38 2010 -0700
>
>    [libata] Add ATA transport class
>
>    This is a scheleton for libata transport class.
>    All information is read only, exporting information from libata:
>    - ata_port class: one per ATA port
>    - ata_link class: one per ATA port or 15 for SATA Port Multiplier
>    - ata_device class: up to 2 for PATA link, usually one for SATA.
>
>    Signed-off-by: Gwendal Grignou <gwendal@google.com>
>    Reviewed-by: Grant Grundler <grundler@google.com>
>    Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
>
>
> Here is the patch to fix it.
>
> Gwendal and Grant,
>
> Would you help to review it?
>
>
> From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
> From: Lin Ming <ming.m.lin@intel.com>
> Date: Thu, 19 Apr 2012 15:45:51 +0800
> Subject: [PATCH] libata: clear error mask of old error history
>
> The old error history was cleared in ata_ering_clear().
> It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
> So ata_ering_map() still iterates the old error history.
>
> This causes problem, for example, wrong probe trials count were returned in
> ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
> to 1.5Gbps.
>
> Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>

Reviewed-by: Grant Grundler <grundler@google.com>

LGTM. Caveat is I have looked at libata code only once or twice since
reviewing this patch for Gwendal.

cheers,
grant

> ---
>  drivers/ata/libata-eh.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index c61316e..4c6f49b 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -419,9 +419,10 @@ int ata_ering_map(struct ata_ering *ering,
>        return rc;
>  }
>
> -int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
> +static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
>  {
>        ent->eflags |= ATA_EFLAG_OLD_ER;
> +       ent->err_mask = 0;
>        return 0;
>  }
>
> --
> 1.7.2.5
>
>
>
>>
>> [  146.876489] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
>> [  146.876499] ata6: irq_stat 0x00400040, connection status changed
>> [  146.876508] ata6: SError: { PHYRdyChg CommWake DevExch }
>> [  146.876527] ata6: hard resetting link
>> [  147.619956] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> [  147.869349] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
>> [  147.869360] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>> [  147.870126] ata6.00: configured for UDMA/133
>> [  147.870131] ata6: EH complete
>> [  147.870220] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
>> [  147.870391] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
>> [  147.870393] sd 5:0:0:0: [sdc] 4096-byte physical blocks
>> [  147.870396] sd 5:0:0:0: Attached scsi generic sg3 type 0
>> [  147.870434] sd 5:0:0:0: [sdc] Write Protect is off
>> [  147.870436] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>> [  147.870460] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [  147.904848]  sdc: sdc1
>> [  147.905196] sd 5:0:0:0: [sdc] Attached SCSI disk
>>
>>
>> Here is what happens on re-plug of the device. It is a 3.5" drive and the line
>> [  617.838013] ata6: hard resetting link
>> happens too early. I can hear the drive is still spinning up, it can't be ready yet.
>> I think the delay should be increased.
>>
>>
>> [  617.837966] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
>> [  617.837976] ata6: irq_stat 0x00400040, connection status changed
>> [  617.837984] ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
>> [  617.838004] ata6: limiting SATA link speed to 1.5 Gbps
>> [  617.838013] ata6: hard resetting link
>> [  623.610941] ata6: link is slow to respond, please be patient (ready=0)
>> [  627.864604] ata6: COMRESET failed (errno=-16)
>> [  627.864615] ata6: hard resetting link
>> [  629.931538] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>> [  629.932355] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
>> [  629.932365] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>> [  629.933170] ata6.00: configured for UDMA/133
>> [  629.951629] ata6: EH complete
>> [  629.951700] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
>> [  629.951816] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
>> [  629.951819] sd 5:0:0:0: [sdc] 4096-byte physical blocks
>> [  629.951842] sd 5:0:0:0: Attached scsi generic sg3 type 0
>> [  629.951875] sd 5:0:0:0: [sdc] Write Protect is off
>> [  629.951877] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>> [  629.951901] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>> [  629.995970]  sdc: sdc1
>> [  629.996359] sd 5:0:0:0: [sdc] Attached SCSI disk
>>
>>
>> Martin
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gwendal Grignou - April 26, 2012, 9:29 a.m.
Lin,

In the patch d9027470b88631d0956ac37cdadfdeb9cdcf2c99, I did limit the
amount of data cleaning in some of the ata objects.

If setting err_mask to 0 masks the regression I introduced in the
patch, I may have altered too much how ata_device object is
reinitialized when a device is found. I am digging deeper, I may have
change the code to try to preserve the ering as much as possible.

Concerning your patch, isn't adding a test (ent->eflags &
ATA_EFLAG_OLD_ER) in ata_count_probe_trials_cb() more in line with
speed_down_verdict_cb() code?

Gwendal.

On Fri, Apr 20, 2012 at 10:37 AM, Grant Grundler <grundler@google.com> wrote:
> On Thu, Apr 19, 2012 at 1:16 AM, Lin Ming <ming.m.lin@intel.com> wrote:
> ...
>> I did bisect and found that this is a really old regression introduced
>> in 2.6.37-rc1 with below commit.
>>
>> commit d9027470b88631d0956ac37cdadfdeb9cdcf2c99
>> Author: Gwendal Grignou <gwendal@google.com>
>> Date:   Tue May 25 12:31:38 2010 -0700
>>
>>    [libata] Add ATA transport class
>>
>>    This is a scheleton for libata transport class.
>>    All information is read only, exporting information from libata:
>>    - ata_port class: one per ATA port
>>    - ata_link class: one per ATA port or 15 for SATA Port Multiplier
>>    - ata_device class: up to 2 for PATA link, usually one for SATA.
>>
>>    Signed-off-by: Gwendal Grignou <gwendal@google.com>
>>    Reviewed-by: Grant Grundler <grundler@google.com>
>>    Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
>>
>>
>> Here is the patch to fix it.
>>
>> Gwendal and Grant,
>>
>> Would you help to review it?
>>
>>
>> From f696daec7ff63e9b3697e8f7ef8f985152667965 Mon Sep 17 00:00:00 2001
>> From: Lin Ming <ming.m.lin@intel.com>
>> Date: Thu, 19 Apr 2012 15:45:51 +0800
>> Subject: [PATCH] libata: clear error mask of old error history
>>
>> The old error history was cleared in ata_ering_clear().
>> It only sets ATA_EFLAG_OLD_ER eflags, but the err_mask was not cleared.
>> So ata_ering_map() still iterates the old error history.
>>
>> This causes problem, for example, wrong probe trials count were returned in
>> ata_eh_schedule_probe(), which in turn causes SATA link speed to be slowed down
>> to 1.5Gbps.
>>
>> Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
>> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
>
> Reviewed-by: Grant Grundler <grundler@google.com>
>
> LGTM. Caveat is I have looked at libata code only once or twice since
> reviewing this patch for Gwendal.
>
> cheers,
> grant
>
>> ---
>>  drivers/ata/libata-eh.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
>> index c61316e..4c6f49b 100644
>> --- a/drivers/ata/libata-eh.c
>> +++ b/drivers/ata/libata-eh.c
>> @@ -419,9 +419,10 @@ int ata_ering_map(struct ata_ering *ering,
>>        return rc;
>>  }
>>
>> -int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
>> +static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
>>  {
>>        ent->eflags |= ATA_EFLAG_OLD_ER;
>> +       ent->err_mask = 0;
>>        return 0;
>>  }
>>
>> --
>> 1.7.2.5
>>
>>
>>
>>>
>>> [  146.876489] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
>>> [  146.876499] ata6: irq_stat 0x00400040, connection status changed
>>> [  146.876508] ata6: SError: { PHYRdyChg CommWake DevExch }
>>> [  146.876527] ata6: hard resetting link
>>> [  147.619956] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>> [  147.869349] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
>>> [  147.869360] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>>> [  147.870126] ata6.00: configured for UDMA/133
>>> [  147.870131] ata6: EH complete
>>> [  147.870220] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
>>> [  147.870391] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
>>> [  147.870393] sd 5:0:0:0: [sdc] 4096-byte physical blocks
>>> [  147.870396] sd 5:0:0:0: Attached scsi generic sg3 type 0
>>> [  147.870434] sd 5:0:0:0: [sdc] Write Protect is off
>>> [  147.870436] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> [  147.870460] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>> [  147.904848]  sdc: sdc1
>>> [  147.905196] sd 5:0:0:0: [sdc] Attached SCSI disk
>>>
>>>
>>> Here is what happens on re-plug of the device. It is a 3.5" drive and the line
>>> [  617.838013] ata6: hard resetting link
>>> happens too early. I can hear the drive is still spinning up, it can't be ready yet.
>>> I think the delay should be increased.
>>>
>>>
>>> [  617.837966] ata6: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
>>> [  617.837976] ata6: irq_stat 0x00400040, connection status changed
>>> [  617.837984] ata6: SError: { RecovComm PHYRdyChg CommWake DevExch }
>>> [  617.838004] ata6: limiting SATA link speed to 1.5 Gbps
>>> [  617.838013] ata6: hard resetting link
>>> [  623.610941] ata6: link is slow to respond, please be patient (ready=0)
>>> [  627.864604] ata6: COMRESET failed (errno=-16)
>>> [  627.864615] ata6: hard resetting link
>>> [  629.931538] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>>> [  629.932355] ata6.00: ATA-8: ST3000DM001-9YN166, CC4C, max UDMA/133
>>> [  629.932365] ata6.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>>> [  629.933170] ata6.00: configured for UDMA/133
>>> [  629.951629] ata6: EH complete
>>> [  629.951700] scsi 5:0:0:0: Direct-Access     ATA      ST3000DM001-9YN1 CC4C PQ: 0 ANSI: 5
>>> [  629.951816] sd 5:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
>>> [  629.951819] sd 5:0:0:0: [sdc] 4096-byte physical blocks
>>> [  629.951842] sd 5:0:0:0: Attached scsi generic sg3 type 0
>>> [  629.951875] sd 5:0:0:0: [sdc] Write Protect is off
>>> [  629.951877] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> [  629.951901] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>> [  629.995970]  sdc: sdc1
>>> [  629.996359] sd 5:0:0:0: [sdc] Attached SCSI disk
>>>
>>>
>>> Martin
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lin Ming - April 26, 2012, 1:06 p.m.
On Thu, Apr 26, 2012 at 5:29 PM, Gwendal Grignou <gwendal@google.com> wrote:
> Lin,

Hi Gwendal,

>
> In the patch d9027470b88631d0956ac37cdadfdeb9cdcf2c99, I did limit the
> amount of data cleaning in some of the ata objects.
>
> If setting err_mask to 0 masks the regression I introduced in the
> patch, I may have altered too much how ata_device object is
> reinitialized when a device is found. I am digging deeper, I may have
> change the code to try to preserve the ering as much as possible.
>
> Concerning your patch, isn't adding a test (ent->eflags &
> ATA_EFLAG_OLD_ER) in ata_count_probe_trials_cb() more in line with
> speed_down_verdict_cb() code?

This could also fix the regression.

But the fundamental problem is should ata_ering_map still iterate the old
error history which were cleared already?

Lin Ming

>
> Gwendal.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gwendal Grignou - April 26, 2012, 4:50 p.m.
On Thu, Apr 26, 2012 at 6:06 AM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Thu, Apr 26, 2012 at 5:29 PM, Gwendal Grignou <gwendal@google.com> wrote:
>> Lin,
>
> Hi Gwendal,
>
>>
>> In the patch d9027470b88631d0956ac37cdadfdeb9cdcf2c99, I did limit the
>> amount of data cleaning in some of the ata objects.
>>
>> If setting err_mask to 0 masks the regression I introduced in the
>> patch, I may have altered too much how ata_device object is
>> reinitialized when a device is found. I am digging deeper, I may have
>> change the code to try to preserve the ering as much as possible.
>>
>> Concerning your patch, isn't adding a test (ent->eflags &
>> ATA_EFLAG_OLD_ER) in ata_count_probe_trials_cb() more in line with
>> speed_down_verdict_cb() code?
>
> This could also fix the regression.
>
> But the fundamental problem is should ata_ering_map still iterate the old
> error history which were cleared already?
ATA_EFLAG_OLD_ER marks the entry as irrelevant for the current error
handler. It can still be interesting to be able to see the history of
errors for a particular device without going through dmesg output.
Especially if you want a script to do it.

Gwendal.
>
> Lin Ming
>
>>
>> Gwendal.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lin Ming - May 3, 2012, 1:26 a.m.
On Thu, Apr 26, 2012 at 5:29 PM, Gwendal Grignou <gwendal@google.com> wrote:
> Lin,
>
> In the patch d9027470b88631d0956ac37cdadfdeb9cdcf2c99, I did limit the
> amount of data cleaning in some of the ata objects.
>
> If setting err_mask to 0 masks the regression I introduced in the
> patch, I may have altered too much how ata_device object is
> reinitialized when a device is found. I am digging deeper, I may have
> change the code to try to preserve the ering as much as possible.
>
> Concerning your patch, isn't adding a test (ent->eflags &
> ATA_EFLAG_OLD_ER) in ata_count_probe_trials_cb() more in line with
> speed_down_verdict_cb() code?

Hi,

I have updated the patch.

[PATCH] libata: skip old error history when counting probe trials
http://marc.info/?l=linux-ide&m=133600832126645&w=2

Regards,
Lin Ming
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index c61316e..4c6f49b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -419,9 +419,10 @@  int ata_ering_map(struct ata_ering *ering,
 	return rc;
 }
 
-int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
+static int ata_ering_clear_cb(struct ata_ering_entry *ent, void *void_arg)
 {
 	ent->eflags |= ATA_EFLAG_OLD_ER;
+	ent->err_mask = 0;
 	return 0;
 }