[v3] ahci_xgene: Fix the dma state machine lockup for the ATA_CMD_SMART PIO mode command.

Message ID: 1422900439-5541-2-git-send-email-stripathi@apm.com
State: Not Applicable
Delegated to: David Miller

Commit Message

Suman Tripathi Feb. 2, 2015, 6:07 p.m. UTC
This patch addresses an issue with the ATA_CMD_SMART PIO mode
command during enumeration and device detection of ATA devices.
The X-Gene AHCI controller has an erratum in which it cannot clear
the BSY bit after the PIO setup FIS. The DMA state machine then enters
the CMFatalErrorUpdate state and locks up. It is the same issue as
in the commit 2a0bdff6b958d1b2523d2754b6cd5e0ea4053016 (ahci-xgene:
fix the dma state machine lockup for the IDENTIFY DEVICE PIO mode
command).

For example, without this patch a subsequent READ DMA command fails
as shown below:

 [  126.700072] ata2.00: exception Emask 0x0 SAct 0x0
		SErr 0x0 action 0x6 frozen
 [  126.707089] ata2.00: failed command: READ DMA
 [  126.711426] ata2.00: cmd c8/00:08:00:55:57/00:00:00:00:00/e1 tag 1
                dma 4096 in
 [  126.711426]  res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask
		0x4 (timeout)
 [  126.725956] ata2.00: status: { DRDY }

Signed-off-by: Suman Tripathi <stripathi@apm.com>
Reported-by:   Mark Langsdorf <mlangsdo@redhat.com>
---
 drivers/ata/ahci_xgene.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
1.8.2.1

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Sergei Shtylyov Feb. 2, 2015, 7:12 p.m. UTC | #1
Hello.

On 02/02/2015 09:07 PM, Suman Tripathi wrote:

> This patch addresses the issue with ATA_CMD_SMART pio mode
> command for enumeration and device detection with ATA devices.
> The X-Gene AHCI controller has an errata in which it cannot clear
> the BSY bit after the PIO setup FIS. The dma state machine enters

    Hum, if this happens after every PIO command (PIO setup FISes are not 
specific to the command, right?), perhaps it would make more sense to record 
the *protocol* used by the last command?
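
To make the question above concrete, here is a minimal sketch of what tracking the protocol rather than the opcode could look like. It is not driver code: the `model_*` names and the enum are hypothetical stand-ins for libata's taskfile protocol values, and the author states later in the thread that the erratum affects only three specific commands, so this only illustrates the alternative being asked about.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the taskfile protocol of a queued command. */
enum model_prot {
	MODEL_PROT_NODATA,
	MODEL_PROT_PIO,
	MODEL_PROT_DMA,
	MODEL_PROT_NCQ,
};

/* Hypothetical per-port state: remembers the protocol of the last command. */
struct model_ctx {
	enum model_prot last_prot;
};

/* Record the protocol of the command that was just issued. */
static void model_record_protocol(struct model_ctx *ctx, enum model_prot prot)
{
	ctx->last_prot = prot;
}

/*
 * Assumption for illustration only: every PIO command leaves BSY stuck,
 * so the engine would be restarted after any PIO command.
 */
static bool model_restart_needed_by_protocol(const struct model_ctx *ctx)
{
	return ctx->last_prot == MODEL_PROT_PIO;
}
```

With this scheme a restart would follow every PIO command, which is broader than the opcode list the patch extends.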

> CMFatalErrorUpdate state and locks up. It is the same issue as
> in the commit 2a0bdff6b958d1b2523d2754b6cd5e0ea4053016 (ahci-xgene:
> fix the dma state machine lockup for the IDENTIFY DEVICE PIO mode
> command).

> For example :  without this patch it results in READ DMA command failure
> as shown below :

[...]

> Signed-off-by: Suman Tripathi <stripathi@apm.com>
> Reported-by:   Mark Langsdorf <mlangsdo@redhat.com>

MBR, Sergei

Sergei Shtylyov Feb. 2, 2015, 7:44 p.m. UTC | #2
On 02/02/2015 10:37 PM, Suman Tripathi wrote:

    Ugh, please avoid using HTML when posting to the lists hosted on 
vger.kernel.org -- it's configured to ignore such mails AFAIK.

>>>     This patch addresses the issue with ATA_CMD_SMART pio mode
>>>     command for enumeration and device detection with ATA devices.
>>>     The X-Gene AHCI controller has an errata in which it cannot clear
>>>     the BSY bit after the PIO setup FIS. The dma state machine enters

>>     Hum, if this happens after every PIO command (PIO setup FISes are not
>> specific to the command, right?), perhaps it would make more sense to record
>> the *protocol* used by the last command?

> No it happens for IDENTIFY DEVICE, ATA_CMD_PACKET and ATA_CMD_SMART commands .
> It is actually the commands associated with a BSY bit clearing.

    I don't understand that -- BSY bit is cleared for *every* command, either 
at the end of it, or along with setting the DRQ bit for PIO data transfer.

>>>     CMFatalErrorUpdate state and locks up. It is the same issue as
>>>     in the commit 2a0bdff6b958d1b2523d2754b6cd5e0ea4053016 (ahci-xgene:
>>>     fix the dma state machine lockup for the IDENTIFY DEVICE PIO mode
>>>     command).

>> [...]

>>>     Signed-off-by: Suman Tripathi <stripathi@apm.com>
>>>     Reported-by:   Mark Langsdorf <mlangsdo@redhat.com>

MBR, Sergei

Suman Tripathi Feb. 2, 2015, 7:48 p.m. UTC | #3
> Ugh, please avoid using HTML when posting to the lists hosted on
> vger.kernel.org -- it's configured to ignore such mails AFAIK.

Yeah, I forgot to set plain text mode. Sorry about that.

>>>     This patch addresses the issue with ATA_CMD_SMART pio mode
>>>     command for enumeration and device detection with ATA devices.
>>>     The X-Gene AHCI controller has an errata in which it cannot clear
>>>     the BSY bit after the PIO setup FIS. The dma state machine enters


>>     Hum, if this happens after every PIO command (PIO setup FISes are not
>> specific to the command, right?), perhaps it would make more sense to record
>> the *protocol* used by the last command?


> No it happens for IDENTIFY DEVICE, ATA_CMD_PACKET and ATA_CMD_SMART commands .
> It is actually the commands associated with a BSY bit clearing.


> I don't understand that -- BSY bit is cleared for *every* command,
> either at the end of it, or along with setting the DRQ bit for PIO
> data transfer.

The controller is supposed to clear the BSY bit at the end of these
commands, but because of the bug it does not. When a DMA command is
then issued, the controller checks whether BSY has been cleared, finds
it still set, enters CMFatalErrorUpdate and hangs.
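
The sequence described above can be sketched as a small software model. This is only an illustration of the explanation in this mail, not of the hardware itself: the `model_*` names are hypothetical, and the model assumes that restarting the engine discards the stale BSY state, which is what the workaround relies on.

```c
#include <stdbool.h>

/* Hypothetical model of one port as seen by the DMA state machine. */
struct model_port {
	bool bsy;       /* BSY bit as the controller sees it */
	bool locked_up; /* stands in for the CMFatalErrorUpdate state */
};

/* A PIO command finishes; with the erratum, BSY is left set. */
static void model_complete_pio(struct model_port *p, bool has_erratum)
{
	p->bsy = has_erratum;
}

/* Assumed effect of the workaround: the restart drops the stale BSY. */
static void model_restart_engine(struct model_port *p)
{
	p->bsy = false;
}

/* Issue a DMA command; returns true if it can proceed. */
static bool model_issue_dma(struct model_port *p)
{
	if (p->locked_up)
		return false;
	if (p->bsy) {
		/* Stale BSY: the state machine hits the fatal state and hangs. */
		p->locked_up = true;
		return false;
	}
	return true;
}
```

In this model, a DMA command issued right after an affected PIO command fails and leaves the port locked up, while the same command issued after `model_restart_engine()` goes through.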

>>>     CMFatalErrorUpdate state and locks up. It is the same issue as
>>>     in the commit 2a0bdff6b958d1b2523d2754b6cd5e0ea4053016 (ahci-xgene:
>>>     fix the dma state machine lockup for the IDENTIFY DEVICE PIO mode
>>>     command).


>> [...]


>>>     Signed-off-by: Suman Tripathi <stripathi@apm.com>
>>>     Reported-by:   Mark Langsdorf <mlangsdo@redhat.com>

Suman Tripathi Feb. 3, 2015, 4:12 p.m. UTC | #4
Ping? Is there any problem with the last posted version? Mark from
Red Hat reported this issue while using smartctl, and we found that
ATA_CMD_SMART also hits the issue of the BSY bit not being cleared.

Tejun Heo Feb. 3, 2015, 4:18 p.m. UTC | #5
On Mon, Feb 02, 2015 at 11:37:19PM +0530, Suman Tripathi wrote:
> This patch addresses the issue with ATA_CMD_SMART pio mode
> command for enumeration and device detection with ATA devices.
> The X-Gene AHCI controller has an errata in which it cannot clear
> the BSY bit after the PIO setup FIS. The dma state machine enters
> CMFatalErrorUpdate state and locks up. It is the same issue as
> in the commit 2a0bdff6b958d1b2523d2754b6cd5e0ea4053016 (ahci-xgene:

The right format is 2a0bdff6b958 ("ahci-xgene: fix the dma state
machine lockup for the IDENTIFY DEVICE PIO mode command").
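
As an illustration of the reference format shown above (the first 12 characters of the hash, then the subject in parentheses and double quotes), here is a small sketch. The helper name `format_commit_ref` is hypothetical and is not part of any kernel tooling.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes '<first 12 hex chars> ("<subject>")' to out.
 * Returns the snprintf() result, i.e. the length that was needed.
 */
static int format_commit_ref(char *out, size_t outsz,
			     const char *sha, const char *subject)
{
	/* %.12s truncates the full hash to its first 12 characters. */
	return snprintf(out, outsz, "%.12s (\"%s\")", sha, subject);
}
```

Passing the full hash from the commit message together with the subject of the earlier fix produces the string quoted in this reply.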

> fix the dma state machine lockup for the IDENTIFY DEVICE PIO mode
> command).
> 
> For example :  without this patch it results in READ DMA command failure
> as shown below :
> 
>  [  126.700072] ata2.00: exception Emask 0x0 SAct 0x0
> 		SErr 0x0 action 0x6 frozen
>  [  126.707089] ata2.00: failed command: READ DMA
>  [  126.711426] ata2.00: cmd c8/00:08:00:55:57/00:00:00:00:00/e1 tag 1
>                 dma 4096 in
>  [  126.711426]  res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask
> 		0x4 (timeout)
>  [  126.725956] ata2.00: status: { DRDY }
> 
> Signed-off-by: Suman Tripathi <stripathi@apm.com>
> Reported-by:   Mark Langsdorf <mlangsdo@redhat.com>

Applied to libata/for-3.19-fixes.

Thanks.

Patch

diff --git a/drivers/ata/ahci_xgene.c b/drivers/ata/ahci_xgene.c
index 7f68875..506cf5f 100644
--- a/drivers/ata/ahci_xgene.c
+++ b/drivers/ata/ahci_xgene.c
@@ -211,7 +211,8 @@  static unsigned int xgene_ahci_qc_issue(struct ata_queued_cmd *qc)
 	}

 	if (unlikely((ctx->last_cmd[ap->port_no] == ATA_CMD_ID_ATA) ||
-	    (ctx->last_cmd[ap->port_no] == ATA_CMD_PACKET)))
+	    (ctx->last_cmd[ap->port_no] == ATA_CMD_PACKET) ||
+	    (ctx->last_cmd[ap->port_no] == ATA_CMD_SMART)))
 		xgene_ahci_restart_engine(ap);

 	rc = ahci_qc_issue(qc);
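
For readers who want to see the patched condition in isolation, the following is a standalone sketch of the check, written outside the kernel. The opcode values match include/linux/ata.h; the helper name `model_needs_engine_restart` is hypothetical and does not exist in the driver, which evaluates the condition inline in xgene_ahci_qc_issue().

```c
#include <stdbool.h>
#include <stdint.h>

/* ATA opcodes, with the values used in include/linux/ata.h. */
#define ATA_CMD_ID_ATA 0xEC /* IDENTIFY DEVICE */
#define ATA_CMD_PACKET 0xA0 /* PACKET */
#define ATA_CMD_SMART  0xB0 /* SMART (added by this patch) */
#define ATA_CMD_READ   0xC8 /* READ DMA, the command that fails in the log */

/*
 * Hypothetical helper mirroring the patched condition: returns true when
 * the previously issued command is one of those after which the engine
 * is restarted before the next command is issued.
 */
static bool model_needs_engine_restart(uint8_t last_cmd)
{
	switch (last_cmd) {
	case ATA_CMD_ID_ATA:
	case ATA_CMD_PACKET:
	case ATA_CMD_SMART:
		return true;
	default:
		return false;
	}
}
```

Before the patch, the SMART case was missing from the list, so a READ DMA issued after a SMART command went out without a restart.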