Patchwork IDE disk FLUSH take more than 30 secs, the SUSE guest reports "lost interrupt and the file system becomes read-only"

login
register
mail settings
Submitter Andreas Färber
Date May 26, 2013, 5:08 p.m.
Message ID <51A24172.4020208@suse.de>
Download mbox | patch
Permalink /patch/246434/
State New
Headers show

Comments

Andreas Färber - May 26, 2013, 5:08 p.m.
Am 21.05.2013 09:12, schrieb Gonglei (Arei):
> Through analysis, I found that because the system call the fdatasync command in the Qemu over 30s, 
> after the Guest's kernel thread detects the io transferation is timeout, went to check IDE disk state. 
> But the IDE disk status is 0x50, rather than the BSY status, and then departed error process...
> 
> the path of kernel's action is :
> scsi_softirq_done
>  scsi_eh_scmd_add
>    scsi_error_handler
>      shost->transportt->eh_strategy_handler 
> 		ata_scsi_error 
> 			ap->ops->lost_interrupt
> 				ata_sff_lost_interrupt
> Finally, the file system becomes read-only.
> 
> Why not set the IDE disk for the BSY status When 0xe7 command is executed in the Qemu?

Have you actually tried that out with a patch such as the following?


No clue if this is spec-compliant. ;)

Note however that qemu_fdatasync() is done in the flush callback of
block/raw-posix.c, so IIUC everything calling bdrv_aio_flush() or
bdrv_flush_all() may potentially run into issues beyond just ATA:

hw/block/virtio-blk.c
hw/block/xen_disk.c
hw/ide/core.c
hw/scsi/scsi-disk.c

cpus.c:do_vm_stop()
hw/xen/xen_platform.c:platform_fixed_ioport_writew()

qemu_fdatasync() further occurs in:

hw/block/dataplane/virtio-blk.c:process_request()
hw/9pfs/virtio-9p-*.c

Quite possibly not all of them are problematic, but flush times >30 sec
are very likely not well tested by developers...

Regards,
Andreas
Stefan Hajnoczi - May 27, 2013, 11:39 a.m.
On Sun, May 26, 2013 at 07:08:02PM +0200, Andreas Färber wrote:
> Am 21.05.2013 09:12, schrieb Gonglei (Arei):
> > Through analysis, I found that because the system call the fdatasync command in the Qemu over 30s, 
> > after the Guest's kernel thread detects the io transferation is timeout, went to check IDE disk state. 
> > But the IDE disk status is 0x50, rather than the BSY status, and then departed error process...
> > 
> > the path of kernel's action is :
> > scsi_softirq_done
> >  scsi_eh_scmd_add
> >    scsi_error_handler
> >      shost->transportt->eh_strategy_handler 
> > 		ata_scsi_error 
> > 			ap->ops->lost_interrupt
> > 				ata_sff_lost_interrupt
> > Finally, the file system becomes read-only.
> > 
> > Why not set the IDE disk for the BSY status When 0xe7 command is executed in the Qemu?
> 
> Have you actually tried that out with a patch such as the following?
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index c7a8041..bf1ff18 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
>  {
>      IDEState *s = opaque;
> 
> +    s->status &= ~BUSY_STAT;
> +
>      if (ret < 0) {
>          /* XXX: What sector number to set here? */
>          if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
>          return;
>      }
> 
> +    s->status |= BUSY_STAT;
>      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
>      bdrv_aio_flush(s->bs, ide_flush_cb, s);
>  }
> 
> No clue if this is spec-compliant. ;)
> 
> Note however that qemu_fdatasync() is done in the flush callback of
> block/raw-posix.c, so IIUC everything calling bdrv_aio_flush() or
> bdrv_flush_all() may potentially run into issues beyond just ATA:

This is an IDE emulation bug.  virtio-blk, for example, doesn't have
this kind of busy status bit.  It's probably not an issue with SCSI
either.

Stefan
Paolo Bonzini - May 27, 2013, 9:05 p.m.
Il 26/05/2013 19:08, Andreas Färber ha scritto:
> Have you actually tried that out with a patch such as the following?
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index c7a8041..bf1ff18 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
>  {
>      IDEState *s = opaque;
> 
> +    s->status &= ~BUSY_STAT;
> +
>      if (ret < 0) {
>          /* XXX: What sector number to set here? */
>          if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
>          return;
>      }
> 
> +    s->status |= BUSY_STAT;
>      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
>      bdrv_aio_flush(s->bs, ide_flush_cb, s);
>  }

Yes, this patch is correct.  Can you resend with S-o-b?

Paolo

Patch

diff --git a/hw/ide/core.c b/hw/ide/core.c
index c7a8041..bf1ff18 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -795,6 +795,8 @@  static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;

+    s->status &= ~BUSY_STAT;
+
     if (ret < 0) {
         /* XXX: What sector number to set here? */
         if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
@@ -814,6 +816,7 @@  void ide_flush_cache(IDEState *s)
         return;
     }

+    s->status |= BUSY_STAT;
     bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
     bdrv_aio_flush(s->bs, ide_flush_cb, s);
 }