Message ID | 1371073298-14519-10-git-send-email-mdroth@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Wed, 2013-06-12 at 16:41 -0500, Michael Roth wrote: > From: Andreas Färber <afaerber@suse.de> > > The implementation of the ATA FLUSH command invokes a flush at the block > layer, which may on raw files on POSIX entail a synchronous fdatasync(). > This may in some cases take so long that the SLES 11 SP1 guest driver > reports I/O errors and filesystems get corrupted or remounted read-only. > > Avoid this by setting BUSY_STAT, so that the guest is made aware we are > in the middle of an operation and no ATA commands are attempted to be > processed concurrently. > > Addresses BNC#637297. > > Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com> > Signed-off-by: Andreas Färber <afaerber@suse.de> > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > (cherry picked from commit f68ec8379e88502b4841a110c070e9b118d3151c) > > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> > --- > hw/ide/core.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/ide/core.c b/hw/ide/core.c > index c7a8041..9926d92 100644 > --- a/hw/ide/core.c > +++ b/hw/ide/core.c > @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) > return; > } > > + s->status |= BUSY_STAT; > bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); > bdrv_aio_flush(s->bs, ide_flush_cb, s); > } This caused a regression for me from 1.5.0->1.5.1. Windows 7 x64 no longer boots on q35 IDE with this change. Thanks, Alex
On Wed, Jul 3, 2013 at 3:10 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Wed, 2013-06-12 at 16:41 -0500, Michael Roth wrote: >> From: Andreas Färber <afaerber@suse.de> >> >> The implementation of the ATA FLUSH command invokes a flush at the block >> layer, which may on raw files on POSIX entail a synchronous fdatasync(). >> This may in some cases take so long that the SLES 11 SP1 guest driver >> reports I/O errors and filesystems get corrupted or remounted read-only. >> >> Avoid this by setting BUSY_STAT, so that the guest is made aware we are >> in the middle of an operation and no ATA commands are attempted to be >> processed concurrently. >> >> Addresses BNC#637297. >> >> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com> >> Signed-off-by: Andreas Färber <afaerber@suse.de> >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> >> (cherry picked from commit f68ec8379e88502b4841a110c070e9b118d3151c) >> >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> >> --- >> hw/ide/core.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/hw/ide/core.c b/hw/ide/core.c >> index c7a8041..9926d92 100644 >> --- a/hw/ide/core.c >> +++ b/hw/ide/core.c >> @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) >> return; >> } >> >> + s->status |= BUSY_STAT; >> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); >> bdrv_aio_flush(s->bs, ide_flush_cb, s); >> } > > This caused a regression for me from 1.5.0->1.5.1. Windows 7 x64 no > longer boots on q35 IDE with this change. Thanks, Are you seeing the issue for upstream builds as well? > > Alex >
On Wed, 2013-07-03 at 15:16 -0500, Michael Roth wrote: > On Wed, Jul 3, 2013 at 3:10 PM, Alex Williamson > <alex.williamson@redhat.com> wrote: > > On Wed, 2013-06-12 at 16:41 -0500, Michael Roth wrote: > >> From: Andreas Färber <afaerber@suse.de> > >> > >> The implementation of the ATA FLUSH command invokes a flush at the block > >> layer, which may on raw files on POSIX entail a synchronous fdatasync(). > >> This may in some cases take so long that the SLES 11 SP1 guest driver > >> reports I/O errors and filesystems get corrupted or remounted read-only. > >> > >> Avoid this by setting BUSY_STAT, so that the guest is made aware we are > >> in the middle of an operation and no ATA commands are attempted to be > >> processed concurrently. > >> > >> Addresses BNC#637297. > >> > >> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com> > >> Signed-off-by: Andreas Färber <afaerber@suse.de> > >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> > >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > >> (cherry picked from commit f68ec8379e88502b4841a110c070e9b118d3151c) > >> > >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> > >> --- > >> hw/ide/core.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/hw/ide/core.c b/hw/ide/core.c > >> index c7a8041..9926d92 100644 > >> --- a/hw/ide/core.c > >> +++ b/hw/ide/core.c > >> @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) > >> return; > >> } > >> > >> + s->status |= BUSY_STAT; > >> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); > >> bdrv_aio_flush(s->bs, ide_flush_cb, s); > >> } > > > > This caused a regression for me from 1.5.0->1.5.1. Windows 7 x64 no > > longer boots on q35 IDE with this change. Thanks, > > Are you seeing the issue for upstream builds as well? Yes, I bisected this on upstream. If i revert just this from 1.5.1, I can boot again. Upstream requires reverting this and a workaround for 9afce429. Thanks, Alex
Quoting Alex Williamson (2013-07-03 16:51:56) > On Wed, 2013-07-03 at 15:16 -0500, Michael Roth wrote: > > On Wed, Jul 3, 2013 at 3:10 PM, Alex Williamson > > <alex.williamson@redhat.com> wrote: > > > On Wed, 2013-06-12 at 16:41 -0500, Michael Roth wrote: > > >> From: Andreas Färber <afaerber@suse.de> > > >> > > >> The implementation of the ATA FLUSH command invokes a flush at the block > > >> layer, which may on raw files on POSIX entail a synchronous fdatasync(). > > >> This may in some cases take so long that the SLES 11 SP1 guest driver > > >> reports I/O errors and filesystems get corrupted or remounted read-only. > > >> > > >> Avoid this by setting BUSY_STAT, so that the guest is made aware we are > > >> in the middle of an operation and no ATA commands are attempted to be > > >> processed concurrently. > > >> > > >> Addresses BNC#637297. > > >> > > >> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com> > > >> Signed-off-by: Andreas Färber <afaerber@suse.de> > > >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> > > >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > >> (cherry picked from commit f68ec8379e88502b4841a110c070e9b118d3151c) > > >> > > >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> > > >> --- > > >> hw/ide/core.c | 1 + > > >> 1 file changed, 1 insertion(+) > > >> > > >> diff --git a/hw/ide/core.c b/hw/ide/core.c > > >> index c7a8041..9926d92 100644 > > >> --- a/hw/ide/core.c > > >> +++ b/hw/ide/core.c > > >> @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) > > >> return; > > >> } > > >> > > >> + s->status |= BUSY_STAT; > > >> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); > > >> bdrv_aio_flush(s->bs, ide_flush_cb, s); > > >> } > > > > > > This caused a regression for me from 1.5.0->1.5.1. Windows 7 x64 no > > > longer boots on q35 IDE with this change. Thanks, > > > > Are you seeing the issue for upstream builds as well? > > Yes, I bisected this on upstream. If i revert just this from 1.5.1, I > can boot again. Upstream requires reverting this and a workaround for > 9afce429. Thanks, Does the problem still exist upstream? I'd rather not revert without a fix for the issue in 9afce429 since that might cause a regression for 1.5.1 users now. Sucks either way but I think it's safer to have Win7 not boot on q35 than re-introduce potential image corruption. > > Alex
Quoting Michael Roth (2013-08-12 17:43:00) > Quoting Alex Williamson (2013-07-03 16:51:56) > > On Wed, 2013-07-03 at 15:16 -0500, Michael Roth wrote: > > > On Wed, Jul 3, 2013 at 3:10 PM, Alex Williamson > > > <alex.williamson@redhat.com> wrote: > > > > On Wed, 2013-06-12 at 16:41 -0500, Michael Roth wrote: > > > >> From: Andreas Färber <afaerber@suse.de> > > > >> > > > >> The implementation of the ATA FLUSH command invokes a flush at the block > > > >> layer, which may on raw files on POSIX entail a synchronous fdatasync(). > > > >> This may in some cases take so long that the SLES 11 SP1 guest driver > > > >> reports I/O errors and filesystems get corrupted or remounted read-only. > > > >> > > > >> Avoid this by setting BUSY_STAT, so that the guest is made aware we are > > > >> in the middle of an operation and no ATA commands are attempted to be > > > >> processed concurrently. > > > >> > > > >> Addresses BNC#637297. > > > >> > > > >> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com> > > > >> Signed-off-by: Andreas Färber <afaerber@suse.de> > > > >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> > > > >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > > >> (cherry picked from commit f68ec8379e88502b4841a110c070e9b118d3151c) > > > >> > > > >> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> > > > >> --- > > > >> hw/ide/core.c | 1 + > > > >> 1 file changed, 1 insertion(+) > > > >> > > > >> diff --git a/hw/ide/core.c b/hw/ide/core.c > > > >> index c7a8041..9926d92 100644 > > > >> --- a/hw/ide/core.c > > > >> +++ b/hw/ide/core.c > > > >> @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) > > > >> return; > > > >> } > > > >> > > > >> + s->status |= BUSY_STAT; > > > >> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); > > > >> bdrv_aio_flush(s->bs, ide_flush_cb, s); > > > >> } > > > > > > > > This caused a regression for me from 1.5.0->1.5.1. Windows 7 x64 no > > > > longer boots on q35 IDE with this change. Thanks, > > > > > > Are you seeing the issue for upstream builds as well? > > > > Yes, I bisected this on upstream. If i revert just this from 1.5.1, I > > can boot again. Upstream requires reverting this and a workaround for > > 9afce429. Thanks, > > Does the problem still exist upstream? I'd rather not revert without a > fix for the issue in 9afce429 since that might cause a regression for 1.5.1 > users now. Ok, looks like we have a fix for this that was already sent to stable: a62eaa26c1d6d48fbdc3ac1d32bd1314f5fdc8c9 Could've sworn I searched master for a reference to f68ec837... Will pull it in for 1.5.3 > > Sucks either way but I think it's safer to have Win7 not boot on q35 than > re-introduce potential image corruption. > > > > > Alex
diff --git a/hw/ide/core.c b/hw/ide/core.c index c7a8041..9926d92 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -814,6 +814,7 @@ void ide_flush_cache(IDEState *s) return; } + s->status |= BUSY_STAT; bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH); bdrv_aio_flush(s->bs, ide_flush_cb, s); }