Patchwork libata-eh don't waste time retrying media errors (v3)

login
register
mail settings
Submitter Mark Lord
Date May 2, 2012, 7:22 p.m.
Message ID <4FA1898C.5070108@teksavvy.com>
Download mbox | patch
Permalink /patch/156540/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Mark Lord - May 2, 2012, 7:22 p.m.
ATA and SATA drives have had built-in retries for media errors
for as long as they've been commonplace in computers (early 1990s).

When libata stumbles across a bad sector, it can waste minutes
sitting there doing retry after retry before finally giving up
and letting the higher layers deal with it.

This patch removes retries for media errors only.

Signed-off-by: Mark Lord <mlord@pobox.com>
---
version 3: try to improve readability.
Tejun Heo - May 2, 2012, 7:33 p.m.
On Wed, May 02, 2012 at 03:22:52PM -0400, Mark Lord wrote:
> ATA and SATA drives have had built-in retries for media errors
> for as long as they've been commonplace in computers (early 1990s).
> 
> When libata stumbles across a bad sector, it can waste minutes
> sitting there doing retry after retry before finally giving up
> and letting the higher layers deal with it.
> 
> This patch removes retries for media errors only.
> 
> Signed-off-by: Mark Lord <mlord@pobox.com>
> ---
> version 3: try to improve readability.
> 
> --- old/drivers/ata/libata-eh.c	2012-04-27 13:17:35.000000000 -0400
> +++ linux/drivers/ata/libata-eh.c	2012-05-02 15:20:19.946827031 -0400
> @@ -2046,6 +2046,26 @@
>  }
> 
>  /**
> + *	ata_eh_worth_retry - analyze error and decide whether to retry
> + *	@qc: qc to possibly retry
> + *
> + *	Look at the cause of the error and decide if a retry
> + * 	might be useful or not.  We don't want to retry media errors
> + *	because the drive itself has probably already taken 10-30 seconds
> + *	doing its own internal retries before reporting the failure.
> + */
> +static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)

Return bool? && maybe split the patch into two - the first separating
out the logic into a function, the latter changing emedia handling?

> +{
> +	if (qc->flags & AC_ERR_MEDIA)
> +		return 0;	/* don't retry media errors */
> +	if (qc->flags & ATA_QCFLAG_IO)
> +		return 1;	/* otherwise retry anything from fs stack */
> +	if (qc->err_mask & AC_ERR_INVALID)
> +		return 0;	/* don't retry these */
> +	return qc->err_mask != AC_ERR_DEV;  /* retry if not dev error */
> +}

Other than that,

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.
Mark Lord - May 2, 2012, 7:43 p.m.
On 12-05-02 03:33 PM, Tejun Heo wrote:
> On Wed, May 02, 2012 at 03:22:52PM -0400, Mark Lord wrote:
>> ATA and SATA drives have had built-in retries for media errors
>> for as long as they've been commonplace in computers (early 1990s).
>>
>> When libata stumbles across a bad sector, it can waste minutes
>> sitting there doing retry after retry before finally giving up
>> and letting the higher layers deal with it.
>>
>> This patch removes retries for media errors only.
>>
>> Signed-off-by: Mark Lord <mlord@pobox.com>
>> ---
>> version 3: try to improve readability.
>>
>> --- old/drivers/ata/libata-eh.c	2012-04-27 13:17:35.000000000 -0400
>> +++ linux/drivers/ata/libata-eh.c	2012-05-02 15:20:19.946827031 -0400
>> @@ -2046,6 +2046,26 @@
>>  }
>>
>>  /**
>> + *	ata_eh_worth_retry - analyze error and decide whether to retry
>> + *	@qc: qc to possibly retry
>> + *
>> + *	Look at the cause of the error and decide if a retry
>> + * 	might be useful or not.  We don't want to retry media errors
>> + *	because the drive itself has probably already taken 10-30 seconds
>> + *	doing its own internal retries before reporting the failure.
>> + */
>> +static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
> 
> Return bool? && maybe split the patch into two - the first separating
> out the logic into a function, the latter changing emedia handling?

I think the two-liner from v2 is better.

Cheers
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - May 2, 2012, 7:46 p.m.
On Wed, May 02, 2012 at 03:43:05PM -0400, Mark Lord wrote:
> >> +static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
> > 
> > Return bool? && maybe split the patch into two - the first separating
> > out the logic into a function, the latter changing emedia handling?
> 
> I think the two-liner from v2 is better.

Heh, I don't know.  It probably doesn't matter all that much either
way.  Let's let Jeff decide. ;)

Thanks.
Jeff Garzik - May 4, 2012, 2:25 a.m.
On 05/02/2012 03:46 PM, Tejun Heo wrote:
> On Wed, May 02, 2012 at 03:43:05PM -0400, Mark Lord wrote:
>>>> +static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
>>>
>>> Return bool?&&  maybe split the patch into two - the first separating
>>> out the logic into a function, the latter changing emedia handling?
>>
>> I think the two-liner from v2 is better.
>
> Heh, I don't know.  It probably doesn't matter all that much either
> way.  Let's let Jeff decide. ;)

Queued v3.  If anyone is motivated to make additional changes, base them 
on top of v3...



--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Lord - May 4, 2012, 3:04 a.m.
On 12-05-03 10:25 PM, Jeff Garzik wrote:
> On 05/02/2012 03:46 PM, Tejun Heo wrote:
>> On Wed, May 02, 2012 at 03:43:05PM -0400, Mark Lord wrote:
>>>>> +static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
>>>>
>>>> Return bool?&&  maybe split the patch into two - the first separating
>>>> out the logic into a function, the latter changing emedia handling?
>>>
>>> I think the two-liner from v2 is better.
>>
>> Heh, I don't know.  It probably doesn't matter all that much either
>> way.  Let's let Jeff decide. ;)
> 
> Queued v3.  If anyone is motivated to make additional changes, base them on top of v3...


Peachy.  Thanks, Jeff!


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

--- old/drivers/ata/libata-eh.c	2012-04-27 13:17:35.000000000 -0400
+++ linux/drivers/ata/libata-eh.c	2012-05-02 15:20:19.946827031 -0400
@@ -2046,6 +2046,26 @@ 
 }
 
 /**
+ *	ata_eh_worth_retry - analyze error and decide whether to retry
+ *	@qc: qc to possibly retry
+ *
+ *	Look at the cause of the error and decide if a retry
+ * 	might be useful or not.  We don't want to retry media errors
+ *	because the drive itself has probably already taken 10-30 seconds
+ *	doing its own internal retries before reporting the failure.
+ */
+static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
+{
+	if (qc->flags & AC_ERR_MEDIA)
+		return 0;	/* don't retry media errors */
+	if (qc->flags & ATA_QCFLAG_IO)
+		return 1;	/* otherwise retry anything from fs stack */
+	if (qc->err_mask & AC_ERR_INVALID)
+		return 0;	/* don't retry these */
+	return qc->err_mask != AC_ERR_DEV;  /* retry if not dev error */
+}
+
+/**
  *	ata_eh_link_autopsy - analyze error and determine recovery action
  *	@link: host link to perform autopsy on
  *
@@ -2119,9 +2139,7 @@ 
 			qc->err_mask &= ~(AC_ERR_DEV | AC_ERR_OTHER);
 
 		/* determine whether the command is worth retrying */
-		if (qc->flags & ATA_QCFLAG_IO ||
-		    (!(qc->err_mask & AC_ERR_INVALID) &&
-		     qc->err_mask != AC_ERR_DEV))
+		if (ata_eh_worth_retry(qc))
 			qc->flags |= ATA_QCFLAG_RETRY;
 
 		/* accumulate error info */