Patchwork [v2,3/4] ide: Set BSY bit during FLUSH

Submitter Kevin Wolf
Date June 5, 2013, 1:17 p.m.
Message ID <1370438278-1703-4-git-send-email-kwolf@redhat.com>
Permalink /patch/249047/
State New

Comments

Kevin Wolf - June 5, 2013, 1:17 p.m.
From: Andreas Färber <afaerber@suse.de>

The implementation of the ATA FLUSH command invokes a flush at the block
layer, which may on raw files on POSIX entail a synchronous fdatasync().
This may in some cases take so long that the SLES 11 SP1 guest driver
reports I/O errors and filesystems get corrupted or remounted read-only.

Avoid this by setting BUSY_STAT, so that the guest is made aware we are
in the middle of an operation and no ATA commands are attempted to be
processed concurrently.

Addresses BNC#637297.

Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/ide/core.c | 1 +
 1 file changed, 1 insertion(+)
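
A minimal sketch of the idea described in the commit message, not the QEMU code itself. ToyDrive and the toy_* functions are hypothetical stand-ins for IDEState and the flush path; the bit values follow the ATA status register layout (BSY = 0x80, DRDY = 0x40). It only shows that a guest polling the status register sees the device as busy for the whole duration of the flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUSY_STAT  0x80   /* ATA status: device busy */
#define READY_STAT 0x40   /* ATA status: device ready */

/* Hypothetical, heavily simplified stand-in for IDEState. */
typedef struct {
    uint8_t status;          /* guest-visible status register */
    bool    flush_in_flight; /* an asynchronous flush has been started */
} ToyDrive;

/* Start of FLUSH: raise BSY before handing the flush to the block layer. */
void toy_flush_start(ToyDrive *d)
{
    d->status |= BUSY_STAT;
    d->flush_in_flight = true;
}

/* Completion callback: the flush is done, so drop BSY and report ready. */
void toy_flush_complete(ToyDrive *d)
{
    assert(d->flush_in_flight);
    d->flush_in_flight = false;
    d->status = READY_STAT;
}

/* What a well-behaved guest driver checks before issuing a command. */
bool toy_guest_may_issue(const ToyDrive *d)
{
    return !(d->status & BUSY_STAT);
}
```

Between toy_flush_start() and toy_flush_complete(), toy_guest_may_issue() returns false, which is the behaviour the patch aims for on the guest side.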
Alex Williamson - July 3, 2013, 8:02 p.m.
On Wed, 2013-06-05 at 15:17 +0200, Kevin Wolf wrote:
> From: Andreas Färber <afaerber@suse.de>
> 
> The implementation of the ATA FLUSH command invokes a flush at the block
> layer, which may on raw files on POSIX entail a synchronous fdatasync().
> This may in some cases take so long that the SLES 11 SP1 guest driver
> reports I/O errors and filesystems get corrupted or remounted read-only.
> 
> Avoid this by setting BUSY_STAT, so that the guest is made aware we are
> in the middle of an operation and no ATA commands are attempted to be
> processed concurrently.
> 
> Addresses BNC#637297.
> 
> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
> Signed-off-by: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  hw/ide/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index c7a8041..9926d92 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s)
>          return;
>      }
>  
> +    s->status |= BUSY_STAT;
>      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
>      bdrv_aio_flush(s->bs, ide_flush_cb, s);
>  }


I can no longer boot win7 x64 on q35 with IDE using a qcow2 image.  git
bisect determined this patch is the culprit.

-M q35 -nodefconfig -readconfig docs/q35-chipset.cfg -drive
file=image.qcow2,if=none,id=mydisk -device
ide-drive,drive=mydisk,bus=ide.0

Thanks,
Alex
Kevin Wolf - July 4, 2013, 7:55 a.m.
On 03.07.2013 at 22:02, Alex Williamson wrote:
> On Wed, 2013-06-05 at 15:17 +0200, Kevin Wolf wrote:
> > From: Andreas Färber <afaerber@suse.de>
> > 
> > The implementation of the ATA FLUSH command invokes a flush at the block
> > layer, which may on raw files on POSIX entail a synchronous fdatasync().
> > This may in some cases take so long that the SLES 11 SP1 guest driver
> > reports I/O errors and filesystems get corrupted or remounted read-only.
> > 
> > Avoid this by setting BUSY_STAT, so that the guest is made aware we are
> > in the middle of an operation and no ATA commands are attempted to be
> > processed concurrently.
> > 
> > Addresses BNC#637297.
> > 
> > Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
> > Signed-off-by: Andreas Färber <afaerber@suse.de>
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  hw/ide/core.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/hw/ide/core.c b/hw/ide/core.c
> > index c7a8041..9926d92 100644
> > --- a/hw/ide/core.c
> > +++ b/hw/ide/core.c
> > @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s)
> >          return;
> >      }
> >  
> > +    s->status |= BUSY_STAT;
> >      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
> >      bdrv_aio_flush(s->bs, ide_flush_cb, s);
> >  }
> 
> 
> I can no longer boot win7 x64 on q35 with IDE using a qcow2 image.  git
> bisect determined this patch is the culprit.
> 
> -M q35 -nodefconfig -readconfig docs/q35-chipset.cfg -drive
> file=image.qcow2,if=none,id=mydisk -device
> ide-drive,drive=mydisk,bus=ide.0

This means you're using AHCI, right?

handle_cmd() in ahci.c checks the flags and does indeed behave
differently now:

    if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
        /* async command, complete later */
        s->dev[port].busy_slot = slot;
        return -1;
    }

    /* done handling the command */
    return 0;

The caller of this code updates pr->cmd_issue to clear the bit for the
respective command slot. That update is now skipped, and the later
completion mentioned in the comment doesn't happen for flushes, because
the IDE core never calls back into the AHCI core for the completion.

The correct fix might be to call ide_set_inactive() in the flush
callback, though I haven't checked in detail yet whether there's
anything specific to DMA read/write in ide_set_inactive().

Kevin
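
The failure mode described above can be sketched with a toy model. This is an illustration under assumptions, not the ahci.c code: ToyPort, toy_issue_flush() and toy_flush_done() are hypothetical names, and the notify_ahci flag merely models whether the IDE core calls back into the AHCI side on completion. The status bit values match the ATA status register (BSY = 0x80, DRDY = 0x40, DRQ = 0x08).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUSY_STAT  0x80
#define READY_STAT 0x40
#define DRQ_STAT   0x08

/* Hypothetical stand-in for one AHCI port and its attached IDE device. */
typedef struct {
    uint8_t  status;     /* IDE status register */
    uint32_t cmd_issue;  /* one bit per outstanding command slot */
    int      busy_slot;  /* slot waiting for async completion, or -1 */
} ToyPort;

/*
 * Issue a FLUSH in the given slot. flush_sets_busy models the patch:
 * with it, the device looks busy when the command handler returns, so
 * the slot is parked for "later completion" instead of being cleared.
 */
void toy_issue_flush(ToyPort *p, unsigned slot, bool flush_sets_busy)
{
    assert(slot < 32);
    p->cmd_issue |= UINT32_C(1) << slot;
    if (flush_sets_busy) {
        p->status |= BUSY_STAT;
    }

    if (p->status & (BUSY_STAT | DRQ_STAT)) {
        p->busy_slot = (int)slot;                /* async, complete later */
        return;
    }
    p->cmd_issue &= ~(UINT32_C(1) << slot);      /* done handling the command */
}

/* Flush completion; without notify_ahci the parked slot is never cleared. */
void toy_flush_done(ToyPort *p, bool notify_ahci)
{
    p->status = READY_STAT;
    if (notify_ahci && p->busy_slot >= 0) {
        p->cmd_issue &= ~(UINT32_C(1) << p->busy_slot);
        p->busy_slot = -1;
    }
}
```

In this model, issuing a flush with flush_sets_busy set and then completing it with notify_ahci unset leaves the slot bit in cmd_issue set forever, which corresponds to the hang described above. Completing it with notify_ahci set clears the bit.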
Michael S. Tsirkin - July 10, 2013, 6:27 a.m.
On Thu, Jul 04, 2013 at 09:55:42AM +0200, Kevin Wolf wrote:
> On 03.07.2013 at 22:02, Alex Williamson wrote:
> > On Wed, 2013-06-05 at 15:17 +0200, Kevin Wolf wrote:
> > > From: Andreas Färber <afaerber@suse.de>
> > > 
> > > The implementation of the ATA FLUSH command invokes a flush at the block
> > > layer, which may on raw files on POSIX entail a synchronous fdatasync().
> > > This may in some cases take so long that the SLES 11 SP1 guest driver
> > > reports I/O errors and filesystems get corrupted or remounted read-only.
> > > 
> > > Avoid this by setting BUSY_STAT, so that the guest is made aware we are
> > > in the middle of an operation and no ATA commands are attempted to be
> > > processed concurrently.
> > > 
> > > Addresses BNC#637297.
> > > 
> > > Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
> > > Signed-off-by: Andreas Färber <afaerber@suse.de>
> > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > ---
> > >  hw/ide/core.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/hw/ide/core.c b/hw/ide/core.c
> > > index c7a8041..9926d92 100644
> > > --- a/hw/ide/core.c
> > > +++ b/hw/ide/core.c
> > > @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s)
> > >          return;
> > >      }
> > >  
> > > +    s->status |= BUSY_STAT;
> > >      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
> > >      bdrv_aio_flush(s->bs, ide_flush_cb, s);
> > >  }
> > 
> > 
> > I can no longer boot win7 x64 on q35 with IDE using a qcow2 image.  git
> > bisect determined this patch is the culprit.
> > 
> > -M q35 -nodefconfig -readconfig docs/q35-chipset.cfg -drive
> > file=image.qcow2,if=none,id=mydisk -device
> > ide-drive,drive=mydisk,bus=ide.0
> 
> This means you're using AHCI, right?
> 
> handle_cmd() in ahci.c checks the flags and does indeed behave
> differently now:
> 
>     if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
>         /* async command, complete later */
>         s->dev[port].busy_slot = slot;
>         return -1;
>     }
> 
>     /* done handling the command */
>     return 0;
> 
> The caller of this code updates pr->cmd_issue to clear the bit for the
> respective command slot. That update is now skipped, and the later
> completion mentioned in the comment doesn't happen for flushes, because
> the IDE core never calls back into the AHCI core for the completion.
> 
> The correct fix might be to call ide_set_inactive() in the flush
> callback, though I haven't checked in detail yet whether there's
> anything specific to DMA read/write in ide_set_inactive().
> 
> Kevin

Any resolution yet? This blocks testing for me.

Patch

diff --git a/hw/ide/core.c b/hw/ide/core.c
index c7a8041..9926d92 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -814,6 +814,7 @@  void ide_flush_cache(IDEState *s)
         return;
     }
 
+    s->status |= BUSY_STAT;
     bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
     bdrv_aio_flush(s->bs, ide_flush_cb, s);
 }