diff mbox series

ata: libata-eh: avoid needless hard reset when revalidating link

Message ID 20220921155753.231770-1-niklas.cassel@wdc.com
State New
Headers show
Series ata: libata-eh: avoid needless hard reset when revalidating link | expand

Commit Message

Niklas Cassel Sept. 21, 2022, 3:57 p.m. UTC
Performing a revalidation on a AHCI controller supporting LPM,
while using a lpm mode of e.g. med_power_with_dip (hipm + dipm) or
medium_power (hipm), will currently always lead to a hard reset.

The expected behavior is that a hard reset is only performed when
revalidate fails, because the properties of the drive has changed.

A revalidate performed after e.g. a NCQ error, or such a simple thing
as disabling write-caching (hdparm -W 0 /dev/sda), should succeed on
the first try (and should therefore not cause the link to be reset).

This unwarranted hard reset happens because ata_phys_link_offline()
returns true for a link that is in deep sleep. Thus the call to
ata_phys_link_offline() in ata_eh_revalidate_and_attach() will cause
the revalidation to fail, which causes ata_eh_handle_dev_fail() to be
called, which will set ehc->i.action |= ATA_EH_RESET, such that the
link is reset before retrying revalidation.

When the link is reset, the link is reestablished, so when
ata_eh_revalidate_and_attach() is called the second time, directly
after the link has been reset, ata_phys_link_offline() will return
false, and the revalidation will succeed.

Looking at "8.3.1.3 HBA Initiated" in the AHCI 1.3.1 specification,
it is clear the when host software writes a new command to memory,
by setting a bit in the PxCI/PxSACT HBA port registers, the HBA will
automatically bring back the link before sending out the Command FIS.

However, simply reading a SCR (like ata_phys_link_offline() does),
will not cause the HBA to automatically bring back the link.

As long as hipm is enabled, the HBA will put an idle link into deep
sleep. Avoid this needless hard reset on revalidation by temporarily
disabling hipm, by setting the LPM mode to ATA_LPM_MAX_POWER.

After revalidation is complete, ata_eh_recover() will restore the link
policy by setting the LPM mode to ap->target_lpm_policy.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
 drivers/ata/libata-eh.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Damien Le Moal Sept. 25, 2022, 11:11 p.m. UTC | #1
On 9/22/22 00:57, Niklas Cassel wrote:
> Performing a revalidation on a AHCI controller supporting LPM,
> while using a lpm mode of e.g. med_power_with_dip (hipm + dipm) or
> medium_power (hipm), will currently always lead to a hard reset.
> 
> The expected behavior is that a hard reset is only performed when
> revalidate fails, because the properties of the drive has changed.
> 
> A revalidate performed after e.g. a NCQ error, or such a simple thing
> as disabling write-caching (hdparm -W 0 /dev/sda), should succeed on
> the first try (and should therefore not cause the link to be reset).
> 
> This unwarranted hard reset happens because ata_phys_link_offline()
> returns true for a link that is in deep sleep. Thus the call to
> ata_phys_link_offline() in ata_eh_revalidate_and_attach() will cause
> the revalidation to fail, which causes ata_eh_handle_dev_fail() to be
> called, which will set ehc->i.action |= ATA_EH_RESET, such that the
> link is reset before retrying revalidation.
> 
> When the link is reset, the link is reestablished, so when
> ata_eh_revalidate_and_attach() is called the second time, directly
> after the link has been reset, ata_phys_link_offline() will return
> false, and the revalidation will succeed.
> 
> Looking at "8.3.1.3 HBA Initiated" in the AHCI 1.3.1 specification,
> it is clear the when host software writes a new command to memory,
> by setting a bit in the PxCI/PxSACT HBA port registers, the HBA will
> automatically bring back the link before sending out the Command FIS.
> 
> However, simply reading a SCR (like ata_phys_link_offline() does),
> will not cause the HBA to automatically bring back the link.
> 
> As long as hipm is enabled, the HBA will put an idle link into deep
> sleep. Avoid this needless hard reset on revalidation by temporarily
> disabling hipm, by setting the LPM mode to ATA_LPM_MAX_POWER.
> 
> After revalidation is complete, ata_eh_recover() will restore the link
> policy by setting the LPM mode to ap->target_lpm_policy.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

Applied to for-6.1. Thanks !

> ---
>  drivers/ata/libata-eh.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 7c128c89b454..1f83ae8690ee 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -151,6 +151,8 @@ ata_eh_cmd_timeout_table[ATA_EH_CMD_TIMEOUT_TABLE_SIZE] = {
>  #undef CMDS
>  
>  static void __ata_port_freeze(struct ata_port *ap);
> +static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
> +			  struct ata_device **r_failed_dev);
>  #ifdef CONFIG_PM
>  static void ata_eh_handle_port_suspend(struct ata_port *ap);
>  static void ata_eh_handle_port_resume(struct ata_port *ap);
> @@ -2940,6 +2942,23 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
>  		if ((action & ATA_EH_REVALIDATE) && ata_dev_enabled(dev)) {
>  			WARN_ON(dev->class == ATA_DEV_PMP);
>  
> +			/*
> +			 * The link may be in a deep sleep, wake it up.
> +			 *
> +			 * If the link is in deep sleep, ata_phys_link_offline()
> +			 * will return true, causing the revalidation to fail,
> +			 * which leads to a (potentially) needless hard reset.
> +			 *
> +			 * ata_eh_recover() will later restore the link policy
> +			 * to ap->target_lpm_policy after revalidation is done.
> +			 */
> +			if (link->lpm_policy > ATA_LPM_MAX_POWER) {
> +				rc = ata_eh_set_lpm(link, ATA_LPM_MAX_POWER,
> +						    r_failed_dev);
> +				if (rc)
> +					goto err;
> +			}
> +
>  			if (ata_phys_link_offline(ata_dev_phys_link(dev))) {
>  				rc = -EIO;
>  				goto err;
diff mbox series

Patch

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 7c128c89b454..1f83ae8690ee 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -151,6 +151,8 @@  ata_eh_cmd_timeout_table[ATA_EH_CMD_TIMEOUT_TABLE_SIZE] = {
 #undef CMDS
 
 static void __ata_port_freeze(struct ata_port *ap);
+static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
+			  struct ata_device **r_failed_dev);
 #ifdef CONFIG_PM
 static void ata_eh_handle_port_suspend(struct ata_port *ap);
 static void ata_eh_handle_port_resume(struct ata_port *ap);
@@ -2940,6 +2942,23 @@  static int ata_eh_revalidate_and_attach(struct ata_link *link,
 		if ((action & ATA_EH_REVALIDATE) && ata_dev_enabled(dev)) {
 			WARN_ON(dev->class == ATA_DEV_PMP);
 
+			/*
+			 * The link may be in a deep sleep, wake it up.
+			 *
+			 * If the link is in deep sleep, ata_phys_link_offline()
+			 * will return true, causing the revalidation to fail,
+			 * which leads to a (potentially) needless hard reset.
+			 *
+			 * ata_eh_recover() will later restore the link policy
+			 * to ap->target_lpm_policy after revalidation is done.
+			 */
+			if (link->lpm_policy > ATA_LPM_MAX_POWER) {
+				rc = ata_eh_set_lpm(link, ATA_LPM_MAX_POWER,
+						    r_failed_dev);
+				if (rc)
+					goto err;
+			}
+
 			if (ata_phys_link_offline(ata_dev_phys_link(dev))) {
 				rc = -EIO;
 				goto err;