Message ID | 1379609334-20811-3-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
On 09/19/2013 10:48 AM, Paolo Bonzini wrote:
> The following sequence happens:

> Hence, whenever the guest is reset, the cache mode of the disk should
> be reset to whatever was specified in the "-drive" option.  With this
> change, the Linux virtio-blk driver finds that writeback caching is
> enabled, and tells the block layer to send cache flush commands
> appropriately.
>
> Reported-by: Rusty Russell <rusty@au1.ibm.com>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/block/virtio-blk.c          | 8 ++++++--
>  include/hw/virtio/virtio-blk.h | 1 +
>  2 files changed, 7 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

>
>      features = vdev->guest_features;
> -    bdrv_set_enable_write_cache(s->bs, !!(features & (1 << VIRTIO_BLK_F_WCE)));
> +    if (!(features & (1 << VIRTIO_BLK_F_CONFIG_WCE))) {
> +        bdrv_set_enable_write_cache(s->bs,
> +                                    !!(features & (1 << VIRTIO_BLK_F_WCE)));
> +    }
>  }
>
>  static void virtio_blk_save(QEMUFile *f, void *opaque)
> @@ -674,6 +678,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
>      }
>
>      blkconf_serial(&blk->conf, &blk->serial);
> +    s->original_wce = bdrv_enable_write_cache(blk->conf.bs);

At first, I was worried that this does 'bool = int', and whether that
was correct in all cases.  But looking further, bdrv_enable_write_cache
merely returns bs->enable_write_cache (also an int), and all
assignments to bs->enable_write_cache are careful to assign only 0 or
1.  A followup patch that changes the types to bool might be in order,
but that doesn't invalidate this patch.
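To illustrate the 'bool = int' concern, here is a minimal sketch, assuming that bool in the tree is the C99 type from <stdbool.h>. Under that assumption, an int assigned to a bool is converted to 0 or 1 by the language itself, so values other than 0 and 1 would still be stored correctly. The helper names toy_store_flag and toy_truncate_flag are hypothetical and exist only for this example; the second one shows the truncation that a narrow integer type would cause instead.

```c
#include <stdbool.h>

/* Hypothetical helper: stores an int in a C99 bool, as the patch does
 * with the return value of bdrv_enable_write_cache(). */
static bool toy_store_flag(int raw)
{
    bool flag = raw;    /* any nonzero value converts to true */
    return flag;
}

/* Hypothetical contrast: a narrow integer type keeps only the low bits,
 * so a value such as 256 would silently become 0 (with 8-bit chars). */
static unsigned char toy_truncate_flag(int raw)
{
    unsigned char flag = (unsigned char)raw;
    return flag;
}
```

With these helpers, toy_store_flag(256) is true while toy_truncate_flag(256) is 0, which is why a later cleanup to bool would probably be safe as well.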
On 19.09.2013 at 18:48, Paolo Bonzini wrote:
> The following sequence happens:
> - the SeaBIOS virtio-blk driver does not support the WCE feature, which
> causes QEMU to disable writeback caching
>
> - the Linux virtio-blk driver resets the device, finds WCE is available
> but writeback caching is disabled; tells block layer to not send cache
> flush commands
>
> - the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
> writeback caching to be re-enabled, but the Linux virtio-blk driver does
> not know of this side effect and cache flushes remain disabled
>
> The bug is at the third step.  If the guest does know about CONFIG_WCE,
> QEMU should ignore the WCE feature's state.  The guest will control the
> cache mode solely using configuration space.  This change makes Linux
> do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.

This sounds fishy. The solution happens to make recent Linux kernels do
the right thing, but wouldn't drivers that don't know CONFIG_WCE still
fall into the same trap? I guess making a host feature flag dynamic was
a bad idea to start with.

Perhaps we should restrict the magic to disabling WCE in case the guest
doesn't have VIRTIO_BLK_F_WCE, but never allow it to enable WCE even
though we've already advertised that the host doesn't have WCE.

Kevin
On 20/09/2013 11:54, Kevin Wolf wrote:
> On 19.09.2013 at 18:48, Paolo Bonzini wrote:
>> The following sequence happens:
>> - the SeaBIOS virtio-blk driver does not support the WCE feature, which
>> causes QEMU to disable writeback caching
>>
>> - the Linux virtio-blk driver resets the device, finds WCE is available
>> but writeback caching is disabled; tells block layer to not send cache
>> flush commands
>>
>> - the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
>> writeback caching to be re-enabled, but the Linux virtio-blk driver does
>> not know of this side effect and cache flushes remain disabled
>>
>> The bug is at the third step.  If the guest does know about CONFIG_WCE,
>> QEMU should ignore the WCE feature's state.  The guest will control the
>> cache mode solely using configuration space.  This change makes Linux
>> do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.
>
> This sounds fishy. The solution happens to make recent Linux kernels do
> the right thing, but wouldn't drivers that don't know CONFIG_WCE still
> fall into the same trap?

No, drivers that don't know CONFIG_WCE will do the following:

1) -drive cache=writethrough case, WCE supported

When the driver resets the device, QEMU disables the write cache
(virtio_blk_reset).  Thus VIRTIO_BLK_F_WCE is not advertised.
The Linux virtio-blk driver tells the block layer to not send cache
flush commands, which is correct because they are useless.
VIRTIO_BLK_F_WCE is obviously not negotiated, and virtio_blk_set_status
confirms the disk in writethrough mode.

2) -drive cache=writeback case, WCE supported

When the driver resets the device, QEMU enables the write cache
(virtio_blk_reset).  Thus VIRTIO_BLK_F_WCE is advertised by the device
and negotiated by the driver.  The Linux virtio-blk driver recognizes
that VIRTIO_BLK_F_WCE is negotiated and tells the block layer to send
cache flush commands.  virtio_blk_set_status confirms the disk in
writeback mode.

3) -drive cache=writethrough case, WCE not supported

When the driver resets the device, QEMU disables the write cache
(virtio_blk_reset).  Thus VIRTIO_BLK_F_WCE is not advertised.
The virtio-blk driver doesn't do anything.  virtio_blk_set_status
confirms the disk in writethrough mode.

4) -drive cache=writeback case, WCE not supported

When the driver resets the device, QEMU enables the write cache
(virtio_blk_reset).  Thus VIRTIO_BLK_F_WCE is advertised by the device,
but not negotiated by the driver.  The virtio-blk driver doesn't do
anything.  virtio_blk_set_status places the disk in writethrough mode.

> I guess making a host feature flag dynamic was
> a bad idea to start with.

I disagree, it is very useful.  The bug was unfortunate indeed, and
probably happened due to testing the two patches (CONFIG_WCE and
no-WCE-implies-writethrough) independently rather than together.

> Perhaps we should restrict the magic to disabling WCE in case the guest
> doesn't have VIRTIO_BLK_F_WCE, but never allow it to enable WCE even
> though we've already advertised that the host doesn't have WCE.

That's already what happens, because (thanks to the new
"bdrv_set_enable_write_cache(s->bs, s->original_wce);" at reset time)
VIRTIO_BLK_F_WCE is never exposed in writethrough mode.

Paolo
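The four cases above can be checked with a small model. This is only a sketch under stated assumptions, not QEMU code: struct toy_blk and the toy_* functions are hypothetical, the bit numbers 9 and 11 are assumed to match the virtio-blk headers, and the rule that VIRTIO_BLK_F_WCE is offered only while the write cache is enabled is taken from the description in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed feature bit numbers; treat them as illustrative. */
#define VIRTIO_BLK_F_WCE        9
#define VIRTIO_BLK_F_CONFIG_WCE 11

/* Hypothetical toy device: only the two fields the patch cares about. */
struct toy_blk {
    bool original_wce; /* cache mode requested with -drive */
    bool wce;          /* current write cache state */
};

/* Reset restores the -drive cache mode, as the patch does in
 * virtio_blk_reset. */
static void toy_reset(struct toy_blk *s)
{
    s->wce = s->original_wce;
}

/* Assumption from the thread: WCE is offered only while the cache is
 * enabled; CONFIG_WCE is always offered in this model. */
static uint32_t toy_host_features(const struct toy_blk *s)
{
    uint32_t features = UINT32_C(1) << VIRTIO_BLK_F_CONFIG_WCE;
    if (s->wce) {
        features |= UINT32_C(1) << VIRTIO_BLK_F_WCE;
    }
    return features;
}

/* Mirrors the patched virtio_blk_set_status: the WCE bit decides the
 * cache mode only when the guest did not negotiate CONFIG_WCE. */
static void toy_set_status(struct toy_blk *s, uint32_t guest_features)
{
    if (!(guest_features & (UINT32_C(1) << VIRTIO_BLK_F_CONFIG_WCE))) {
        s->wce = !!(guest_features & (UINT32_C(1) << VIRTIO_BLK_F_WCE));
    }
}

/* Reset, negotiate, set DRIVER_OK; returns the resulting cache mode. */
static bool toy_bring_up(struct toy_blk *s, uint32_t driver_features)
{
    toy_reset(s);
    uint32_t negotiated = toy_host_features(s) & driver_features;
    toy_set_status(s, negotiated);
    return s->wce;
}
```

Calling toy_bring_up with a driver mask of only VIRTIO_BLK_F_WCE corresponds to cases 1 and 2, and a mask of 0 corresponds to cases 3 and 4. In this model only case 4 ends in a cache mode different from the one given to -drive.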
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e2f55cc..6ed9666 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -460,9 +460,9 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running,
 
 static void virtio_blk_reset(VirtIODevice *vdev)
 {
-#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
+#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
     if (s->dataplane) {
         virtio_blk_data_plane_stop(s->dataplane);
     }
@@ -473,6 +473,7 @@ static void virtio_blk_reset(VirtIODevice *vdev)
      * are per-device request lists.
      */
     bdrv_drain_all();
+    bdrv_set_enable_write_cache(s->bs, s->original_wce);
 }
 
 /* coalesce internal state, copy to pci i/o region 0
@@ -564,7 +565,10 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     features = vdev->guest_features;
-    bdrv_set_enable_write_cache(s->bs, !!(features & (1 << VIRTIO_BLK_F_WCE)));
+    if (!(features & (1 << VIRTIO_BLK_F_CONFIG_WCE))) {
+        bdrv_set_enable_write_cache(s->bs,
+                                    !!(features & (1 << VIRTIO_BLK_F_WCE)));
+    }
 }
 
 static void virtio_blk_save(QEMUFile *f, void *opaque)
@@ -674,6 +678,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
     }
 
     blkconf_serial(&blk->conf, &blk->serial);
+    s->original_wce = bdrv_enable_write_cache(blk->conf.bs);
     if (blkconf_geometry(&blk->conf, NULL, 65535, 255, 255) < 0) {
         return -1;
     }
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index b87cf49..41885da 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -123,6 +123,7 @@ typedef struct VirtIOBlock {
     BlockConf *conf;
     VirtIOBlkConf blk;
     unsigned short sector_mask;
+    bool original_wce;
     VMChangeStateEntry *change;
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
     Notifier migration_state_notifier;
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching

- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands

- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled

The bug is at the third step.  If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state.  The guest will control the
cache mode solely using configuration space.  This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.

Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option.  With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.

Reported-by: Rusty Russell <rusty@au1.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/block/virtio-blk.c          | 8 ++++++--
 include/hw/virtio/virtio-blk.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)