diff mbox

[4/4] blocksize: add blkconf_blocksize call to all block devices

Message ID 1406636839-11946-5-git-send-email-tumanova@linux.vnet.ibm.com
State New
Headers show

Commit Message

Ekaterina Tumanova July 29, 2014, 12:27 p.m. UTC
This patch add the blkconf_blocksize call to all
devices, which use DEFINE_BLOCK_PROPERTIES.
If the underlying driver function fails, blkconf_blocksizes
will set blocksizes to default (512) value.

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
---
 hw/block/nvme.c          | 1 +
 hw/block/virtio-blk.c    | 1 +
 hw/ide/qdev.c            | 1 +
 hw/scsi/scsi-disk.c      | 1 +
 hw/usb/dev-storage.c     | 1 +
 include/hw/block/block.h | 4 ++--
 6 files changed, 7 insertions(+), 2 deletions(-)

Comments

Stefan Hajnoczi Sept. 3, 2014, 3:46 p.m. UTC | #1
On Tue, Jul 29, 2014 at 02:27:19PM +0200, Ekaterina Tumanova wrote:
> This patch add the blkconf_blocksize call to all
> devices, which use DEFINE_BLOCK_PROPERTIES.
> If the underlying driver function fails, blkconf_blocksizes
> will set blocksizes to default (512) value.
> 
> Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> ---
>  hw/block/nvme.c          | 1 +
>  hw/block/virtio-blk.c    | 1 +
>  hw/ide/qdev.c            | 1 +
>  hw/scsi/scsi-disk.c      | 1 +
>  hw/usb/dev-storage.c     | 1 +
>  include/hw/block/block.h | 4 ++--
>  6 files changed, 7 insertions(+), 2 deletions(-)

Wasn't this NACKed before on the grounds that it is likely to upset the
guest after live migration?  QEMU doesn't automatically query the
storage because these parameters must be preserved across migration.

The knowledge of these fields belongs in the management tool that
orchestrates migration, not QEMU.

If you want specific parameters, please put them in your guest
configuration.  QEMU and libvirt support that.

I'm concerned that this patch serious is likely to break things and
autodetection doesn't add much value since the management tool needs to
be aware of this information anyway.

Stefan
Ekaterina Tumanova Sept. 4, 2014, 10:28 a.m. UTC | #2
On 09/03/2014 07:46 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 29, 2014 at 02:27:19PM +0200, Ekaterina Tumanova wrote:
>> This patch add the blkconf_blocksize call to all
>> devices, which use DEFINE_BLOCK_PROPERTIES.
>> If the underlying driver function fails, blkconf_blocksizes
>> will set blocksizes to default (512) value.
>>
>> Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
>> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
>> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>> ---
>>   hw/block/nvme.c          | 1 +
>>   hw/block/virtio-blk.c    | 1 +
>>   hw/ide/qdev.c            | 1 +
>>   hw/scsi/scsi-disk.c      | 1 +
>>   hw/usb/dev-storage.c     | 1 +
>>   include/hw/block/block.h | 4 ++--
>>   6 files changed, 7 insertions(+), 2 deletions(-)
>
> Wasn't this NACKed before on the grounds that it is likely to upset the
> guest after live migration?  QEMU doesn't automatically query the
> storage because these parameters must be preserved across migration.
>

Sorry, haven't found this discussion in the mail list. Do you have a link?

As far as I understand, the xxxxxx_init functions of the qemu block 
devices, which contain blkconf_blocksize calls, will
be called again on the destination host before the guest is resumed. And 
since migration requests qemu to be brought on the same disk, the 
configuration will receive the same block size from the ioctl, as
before. What do I miss?

> The knowledge of these fields belongs in the management tool that
> orchestrates migration, not QEMU.
>

For case of DASDs we need QEMU to know the underlying blocksize.
And you mentioned in your review comment to the Einar's initial patch
that you request this to be implemented for all architectures:

"Detecting the underlying block size is a generally useful configuration
option.  This should not be s390-specific, so no need to rename
DEFINE_BLOCK_PROPERTIES()."

http://qemu.11.n7.nabble.com/Qemu-devel-PATCH-V3-0-2-hd-geometry-c-Integrate-HDIO-GETGEO-in-guessing-tt185124.html#none

> If you want specific parameters, please put them in your guest
> configuration.  QEMU and libvirt support that.
>
> I'm concerned that this patch serious is likely to break things and
> autodetection doesn't add much value since the management tool needs to
> be aware of this information anyway.
>

Can you please explain what do you mean by "AUTOdetection"?
Do you simply mean "detection by ioctl" or "detection performed without
guest request"?

> Stefan
>

Thanks!
Christian Borntraeger Sept. 4, 2014, 2:15 p.m. UTC | #3
On 03/09/14 17:46, Stefan Hajnoczi wrote:
> On Tue, Jul 29, 2014 at 02:27:19PM +0200, Ekaterina Tumanova wrote:
>> This patch add the blkconf_blocksize call to all
>> devices, which use DEFINE_BLOCK_PROPERTIES.
>> If the underlying driver function fails, blkconf_blocksizes
>> will set blocksizes to default (512) value.
>>
>> Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
>> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
>> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>> ---
>>  hw/block/nvme.c          | 1 +
>>  hw/block/virtio-blk.c    | 1 +
>>  hw/ide/qdev.c            | 1 +
>>  hw/scsi/scsi-disk.c      | 1 +
>>  hw/usb/dev-storage.c     | 1 +
>>  include/hw/block/block.h | 4 ++--
>>  6 files changed, 7 insertions(+), 2 deletions(-)
> 
> Wasn't this NACKed before on the grounds that it is likely to upset the

Not yet ;-) (just other other comments)

> guest after live migration?  QEMU doesn't automatically query the
> storage because these parameters must be preserved across migration.

Would it be more acceptable if this auto detection is only running on init,
but on migration the target will use the block size of the source?

> 
> The knowledge of these fields belongs in the management tool that
> orchestrates migration, not QEMU.

I think the main problem that this patch tries to address is, is not migration. 
It tries to make the whole guest work an pass-through dasd. It guess that this problem
did not happen on x86 yet as most disks are 512 and most file system will cope with
sector size changes.
Maybe this will change as soon as you run a guest on a real disk (no image file) and
the disk happens to be a 4Kn disk.

Question is: Do we want all users to specify the block size of the underlying disk,
just because its a 4Kn-type disk?

Or in other words:
shall we force everybody to change
qemu-system-x86_64 -hda /dev/sdc

into 
qemu-system-x86_64 -drive file=/dev/sdc,if=none,id=mydisk -device ide-drive,bus=ide.0,drive=mydisk,physical_block_size=4096
for a 4Kn device?

Or for the libvirt case, we need
   <blockio logical_block_size='4096' physical_block_size='4096'/>

Or is your suggestion to let libvirt detect the block size for host devices?


> If you want specific parameters, please put them in your guest
> configuration.  QEMU and libvirt support that.
> 
> I'm concerned that this patch serious is likely to break things and
> autodetection doesn't add much value since the management tool needs to
> be aware of this information anyway.

I am totally with you, that we should make this patch in a way to not break anything on other platforms.

Christian
Stefan Hajnoczi Sept. 17, 2014, 1:56 p.m. UTC | #4
On Thu, Sep 04, 2014 at 04:15:21PM +0200, Christian Borntraeger wrote:
> On 03/09/14 17:46, Stefan Hajnoczi wrote:
> > On Tue, Jul 29, 2014 at 02:27:19PM +0200, Ekaterina Tumanova wrote:
> > guest after live migration?  QEMU doesn't automatically query the
> > storage because these parameters must be preserved across migration.
> 
> Would it be more acceptable if this auto detection is only running on init,
> but on migration the target will use the block size of the source?

Detection is only useful when installing a new guest from scratch.

> > The knowledge of these fields belongs in the management tool that
> > orchestrates migration, not QEMU.
> 
> I think the main problem that this patch tries to address is, is not migration. 
> It tries to make the whole guest work an pass-through dasd. It guess that this problem
> did not happen on x86 yet as most disks are 512 and most file system will cope with
> sector size changes.
> Maybe this will change as soon as you run a guest on a real disk (no image file) and
> the disk happens to be a 4Kn disk.
> 
> Question is: Do we want all users to specify the block size of the underlying disk,
> just because its a 4Kn-type disk?
> 
> Or in other words:
> shall we force everybody to change
> qemu-system-x86_64 -hda /dev/sdc
> 
> into 
> qemu-system-x86_64 -drive file=/dev/sdc,if=none,id=mydisk -device ide-drive,bus=ide.0,drive=mydisk,physical_block_size=4096
> for a 4Kn device?
> 
> Or for the libvirt case, we need
>    <blockio logical_block_size='4096' physical_block_size='4096'/>
> 
> Or is your suggestion to let libvirt detect the block size for host devices?

No, QEMU already detects alignment requirements and uses bounce buffers.
So the guest can run under the illusion that 512 byte requests are okay
even on a 4 KB disk.

Please see block.c:bdrv_co_do_preadv()

This means existing guests that were installed on a 512 byte disk keep
running when you move them onto a 4 KB sector host block device.

If you want to pass the 4 KB requirement through, then set it explicitly
in the guest configuration.

> > If you want specific parameters, please put them in your guest
> > configuration.  QEMU and libvirt support that.
> > 
> > I'm concerned that this patch serious is likely to break things and
> > autodetection doesn't add much value since the management tool needs to
> > be aware of this information anyway.
> 
> I am totally with you, that we should make this patch in a way to not break anything on other platforms.

This is not architecture-specific.  If the current policy doesn't work
on s390 then the policy needs to be extended for all architectures.

My point is that QEMU cannot pass values from the host through to the
guest automatically since it changes what hardware configuration the
guest sees.  Whether the guest is running (live migration) or across
boot (moving disk image to another machine) doesn't really matter, the
point is that the guest hardware configuration should be constant to
avoid upsetting guests.

For these reasons, I think that I/O alignment constraints should be
explicitly configured at the QEMU level.

Stefan
Stefan Hajnoczi Sept. 17, 2014, 2 p.m. UTC | #5
On Thu, Sep 04, 2014 at 02:28:26PM +0400, Ekaterina Tumanova wrote:
> On 09/03/2014 07:46 PM, Stefan Hajnoczi wrote:
> >On Tue, Jul 29, 2014 at 02:27:19PM +0200, Ekaterina Tumanova wrote:
> >>This patch add the blkconf_blocksize call to all
> >>devices, which use DEFINE_BLOCK_PROPERTIES.
> >>If the underlying driver function fails, blkconf_blocksizes
> >>will set blocksizes to default (512) value.
> >>
> >>Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
> >>Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
> >>Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> >>---
> >>  hw/block/nvme.c          | 1 +
> >>  hw/block/virtio-blk.c    | 1 +
> >>  hw/ide/qdev.c            | 1 +
> >>  hw/scsi/scsi-disk.c      | 1 +
> >>  hw/usb/dev-storage.c     | 1 +
> >>  include/hw/block/block.h | 4 ++--
> >>  6 files changed, 7 insertions(+), 2 deletions(-)
> >
> >Wasn't this NACKed before on the grounds that it is likely to upset the
> >guest after live migration?  QEMU doesn't automatically query the
> >storage because these parameters must be preserved across migration.
> >
> 
> Sorry, haven't found this discussion in the mail list. Do you have a link?

I don't have the link but am referring to the last time these patches
were discussed - ~1 year ago?

> As far as I understand, the xxxxxx_init functions of the qemu block devices,
> which contain blkconf_blocksize calls, will
> be called again on the destination host before the guest is resumed. And
> since migration requests qemu to be brought on the same disk, the
> configuration will receive the same block size from the ioctl, as
> before. What do I miss?

Live storage migration is supported.  This means non-shared storage is
possible.

Therefore the host block device that a guest runs on can change.

> >If you want specific parameters, please put them in your guest
> >configuration.  QEMU and libvirt support that.
> >
> >I'm concerned that this patch serious is likely to break things and
> >autodetection doesn't add much value since the management tool needs to
> >be aware of this information anyway.
> >
> 
> Can you please explain what do you mean by "AUTOdetection"?
> Do you simply mean "detection by ioctl" or "detection performed without
> guest request"?

I mean passing through the host block device attributes to the guest.

Stefan
diff mbox

Patch

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 5fd8f89..50fe769 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -761,6 +761,7 @@  static int nvme_init(PCIDevice *pci_dev)
     if (!n->serial) {
         return -1;
     }
+    blkconf_blocksizes(&n->conf);
 
     pci_conf = pci_dev->config;
     pci_conf[PCI_INTERRUPT_PIN] = 1;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c241c50..b5027b1 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -743,6 +743,7 @@  static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     }
 
     blkconf_serial(&blk->conf, &blk->serial);
+    blkconf_blocksizes(&blk->conf);
     s->original_wce = bdrv_enable_write_cache(blk->conf.bs);
     if (blkconf_geometry(&blk->conf, NULL, 65535, 255, 255) < 0) {
         error_setg(errp, "Error setting geometry");
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 6e475e6..6d94d8f 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -161,6 +161,7 @@  static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
     }
 
     blkconf_serial(&dev->conf, &dev->serial);
+    blkconf_blocksizes(&dev->conf);
     if (kind != IDE_CD
         && blkconf_geometry(&dev->conf, &dev->chs_trans, 65536, 16, 255) < 0) {
         return -1;
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index d47ecd6..bfae48b 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2250,6 +2250,7 @@  static int scsi_initfn(SCSIDevice *dev)
     }
 
     blkconf_serial(&s->qdev.conf, &s->serial);
+    blkconf_blocksizes(&s->qdev.conf);
     if (dev->type == TYPE_DISK
         && blkconf_geometry(&dev->conf, NULL, 65535, 255, 255) < 0) {
         return -1;
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index ae4efcb..cf50ac1 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -603,6 +603,7 @@  static int usb_msd_initfn_storage(USBDevice *dev)
     }
 
     blkconf_serial(&s->conf, &dev->serial);
+    blkconf_blocksizes(&s->conf);
 
     /*
      * Hack alert: this pretends to be a block device, but it's really
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 7a0092e..8560bb2 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -44,9 +44,9 @@  static inline unsigned int get_physical_block_exp(BlockConf *conf)
 #define DEFINE_BLOCK_PROPERTIES(_state, _conf)                          \
     DEFINE_PROP_DRIVE("drive", _state, _conf.bs),                       \
     DEFINE_PROP_BLOCKSIZE("logical_block_size", _state,                 \
-                          _conf.logical_block_size, 512),               \
+                          _conf.logical_block_size, 0),                 \
     DEFINE_PROP_BLOCKSIZE("physical_block_size", _state,                \
-                          _conf.physical_block_size, 512),              \
+                          _conf.physical_block_size, 0),                \
     DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),  \
     DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),    \
     DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1),        \