diff mbox

[08/16] ahci: clear error register before NCQ cmd

Message ID 1434765047-29333-9-git-send-email-jsnow@redhat.com
State New
Headers show

Commit Message

John Snow June 20, 2015, 1:50 a.m. UTC
The legacy ide command execution layer will clear any errors outstanding
before execution, but the NCQ layer doesn't.
Even on success, this register will remain clogged.

Clear it out before each NCQ command so the guest can tell if the
error code produced after completion is meaningful or not.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 hw/ide/ahci.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Stefan Hajnoczi June 22, 2015, 2:38 p.m. UTC | #1
On Fri, Jun 19, 2015 at 09:50:39PM -0400, John Snow wrote:
> The legacy ide command execution layer will clear any errors outstanding
> before execution, but the NCQ layer doesn't.
> Even on success, this register will remain clogged.
> 
> Clear it out before each NCQ command so the guest can tell if the
> error code produced after completion is meaningful or not.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  hw/ide/ahci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 6bded67..e63ba9b 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -1048,6 +1048,8 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis,
>              ncq_tfs->lba, ncq_tfs->lba + ncq_tfs->sector_count - 1,
>              ide_state->nb_sectors - 1);
>  
> +    ide_state->error = 0;

I'm not sure it makes sense use ide_state at all in NCQ.

ide_state is per-port and NCQ can issue multiple asynchronous commands
per port.  If process_ncq_command() modifies ide_state, it may do that
while other commands are still pending or about to be processed.

This will clobber ide_state->error.
John Snow June 22, 2015, 2:43 p.m. UTC | #2
On 06/22/2015 10:38 AM, Stefan Hajnoczi wrote:
> On Fri, Jun 19, 2015 at 09:50:39PM -0400, John Snow wrote:
>> The legacy ide command execution layer will clear any errors
>> outstanding before execution, but the NCQ layer doesn't. Even on
>> success, this register will remain clogged.
>> 
>> Clear it out before each NCQ command so the guest can tell if
>> the error code produced after completion is meaningful or not.
>> 
>> Signed-off-by: John Snow <jsnow@redhat.com> --- hw/ide/ahci.c | 2
>> ++ 1 file changed, 2 insertions(+)
>> 
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 6bded67..e63ba9b
>> 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1048,6 +1048,8
>> @@ static void process_ncq_command(AHCIState *s, int port,
>> uint8_t *cmd_fis, ncq_tfs->lba, ncq_tfs->lba +
>> ncq_tfs->sector_count - 1, ide_state->nb_sectors - 1);
>> 
>> +    ide_state->error = 0;
> 
> I'm not sure it makes sense use ide_state at all in NCQ.
> 
> ide_state is per-port and NCQ can issue multiple asynchronous
> commands per port.  If process_ncq_command() modifies ide_state, it
> may do that while other commands are still pending or about to be
> processed.
> 
> This will clobber ide_state->error.
> 

Good point. The real problem that we need to fix then is in the core
IDE layer, which tends to set a "default error" and waits for
ide_exec_cmd to clear it before execution.

If we don't clear that bit somehow, we will get failed commands.

I'll have to do a little research to see if it's safe to remove our
startup error, since it might be part of an ATA signature handshake.

Hmm.

Maybe it's not actually a problem because any real OS should probably
have issued ATA IDENTIFY (which will clear ERR) by the time they get
to NCQ, so maybe we can scrap this, and I'll adjust the test accordingly
to issue at least IDENTIFY before attempting NCQ.

Thanks.
--js
John Snow June 22, 2015, 9:51 p.m. UTC | #3
On 06/22/2015 10:43 AM, John Snow wrote:
> 
> 
> On 06/22/2015 10:38 AM, Stefan Hajnoczi wrote:
>> On Fri, Jun 19, 2015 at 09:50:39PM -0400, John Snow wrote:
>>> The legacy ide command execution layer will clear any errors
>>> outstanding before execution, but the NCQ layer doesn't. Even on
>>> success, this register will remain clogged.
>>>
>>> Clear it out before each NCQ command so the guest can tell if
>>> the error code produced after completion is meaningful or not.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com> --- hw/ide/ahci.c | 2
>>> ++ 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 6bded67..e63ba9b
>>> 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1048,6 +1048,8
>>> @@ static void process_ncq_command(AHCIState *s, int port,
>>> uint8_t *cmd_fis, ncq_tfs->lba, ncq_tfs->lba +
>>> ncq_tfs->sector_count - 1, ide_state->nb_sectors - 1);
>>>
>>> +    ide_state->error = 0;
>>
>> I'm not sure it makes sense use ide_state at all in NCQ.
>>
>> ide_state is per-port and NCQ can issue multiple asynchronous
>> commands per port.  If process_ncq_command() modifies ide_state, it
>> may do that while other commands are still pending or about to be
>> processed.
>>
>> This will clobber ide_state->error.
>>
> 
> Good point. The real problem that we need to fix then is in the core
> IDE layer, which tends to set a "default error" and waits for
> ide_exec_cmd to clear it before execution.
> 
> If we don't clear that bit somehow, we will get failed commands.
> 
> I'll have to do a little research to see if it's safe to remove our
> startup error, since it might be part of an ATA signature handshake.
> 

It is indeed part of a startup signature and should not be removed.

> Hmm.
> 
> Maybe it's not actually a problem because any real OS should probably
> have issued ATA IDENTIFY (which will clear ERR) by the time they get
> to NCQ, so maybe we can scrap this, and I'll adjust the test accordingly
> to issue at least IDENTIFY before attempting NCQ.

It's not exquisitely clear what the NCQ state machine for SATA should
look like -- I need to figure out what a real device might do if you
attempt to send it an NCQ command before a "hello."

Or I'll just ignore this whole train of thought under the premise that
every OS alive will always send an IDENTIFY packet first, and be on my way.
Stefan Hajnoczi June 23, 2015, 1:49 p.m. UTC | #4
On Mon, Jun 22, 2015 at 05:51:10PM -0400, John Snow wrote:
> Or I'll just ignore this whole train of thought under the premise that
> every OS alive will always send an IDENTIFY packet first, and be on my way.

Yes, it's not worth worrying about at this stage when guest OSes seem to
be happy with QEMU's AHCI emulation.

Stefan
diff mbox

Patch

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 6bded67..e63ba9b 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1048,6 +1048,8 @@  static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis,
             ncq_tfs->lba, ncq_tfs->lba + ncq_tfs->sector_count - 1,
             ide_state->nb_sectors - 1);
 
+    ide_state->error = 0;
+
     switch(ncq_fis->command) {
         case READ_FPDMA_QUEUED:
             DPRINTF(port, "NCQ reading %d sectors from LBA %"PRId64", "