diff mbox

[1/4] hd-geometry.c: Integrate HDIO_GETGEO in guessing for target-s390x

Message ID 1406636839-11946-2-git-send-email-tumanova@linux.vnet.ibm.com
State New
Headers show

Commit Message

Ekaterina Tumanova July 29, 2014, 12:27 p.m. UTC
This patch extends the function hd_geometry_guess. It introduces a
target specific hook. The default implementation for this target
specific hook is empty, has therefore no effect and the existing logic
works as before.

For target-s390x, the behaviour is chosen as follows:
If no geo could be guessed via guess_disk_lchs, a new function called
guess_disk_pchs is called. The latter utilizes HDIO_GET_GEO ioctl to ask
the underlying disk for geometry.
If this is not successful (e.g. image files) geometry is derived from the
size of the disk (as before).

The new HDIO_GETGEO logic is required for two use cases:
a) Support for geometries of Direct Attached Storage Disks (DASD)
on s390x configured as backing of virtio block devices.
b) Support for FCP attached SCSI disks that do not yet have a
partition table. Without this patch, fdisk -l on the host would
return different results then fdisk -l in the guest.

Based on 2013 patch from Einar Lueck <elelueck@de.ibm.com>

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
---
 hw/block/Makefile.objs |  6 +++++-
 hw/block/hd-geometry.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini Aug. 20, 2014, 8:19 a.m. UTC | #1
Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto:
> The new HDIO_GETGEO logic is required for two use cases:
> a) Support for geometries of Direct Attached Storage Disks (DASD)
> on s390x configured as backing of virtio block devices.

Is this still relevant now that QEMU can emulate 512-byte sectors on top
of host devices with 4k sectors?

> b) Support for FCP attached SCSI disks that do not yet have a
> partition table. Without this patch, fdisk -l on the host would
> return different results then fdisk -l in the guest.

Can you provide an example?

Paolo
Christian Borntraeger Aug. 20, 2014, 9:35 a.m. UTC | #2
On 20/08/14 10:19, Paolo Bonzini wrote:
> Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto:
>> The new HDIO_GETGEO logic is required for two use cases:
>> a) Support for geometries of Direct Attached Storage Disks (DASD)
>> on s390x configured as backing of virtio block devices.
> 
> Is this still relevant now that QEMU can emulate 512-byte sectors on top
> of host devices with 4k sectors?

Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures).

Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no?

> 
>> b) Support for FCP attached SCSI disks that do not yet have a
>> partition table. Without this patch, fdisk -l on the host would
>> return different results then fdisk -l in the guest.
> 
> Can you provide an example?

scsi disk in the host:
[root@host ~]#  sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593
/dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track

same scsi disk in the guest as virtio-blk:
scsi disk in the guest:
[root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20
/dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track

16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table.

command line was:
-drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native

> 
> Paolo
>
Paolo Bonzini Aug. 20, 2014, 1:10 p.m. UTC | #3
Il 20/08/2014 11:35, Christian Borntraeger ha scritto:
> On 20/08/14 10:19, Paolo Bonzini wrote:
>> Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto:
>>> The new HDIO_GETGEO logic is required for two use cases:
>>> a) Support for geometries of Direct Attached Storage Disks (DASD)
>>> on s390x configured as backing of virtio block devices.
>>
>> Is this still relevant now that QEMU can emulate 512-byte sectors on top
>> of host devices with 4k sectors?
> 
> Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures).
> 
> Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no?
> 
>>
>>> b) Support for FCP attached SCSI disks that do not yet have a
>>> partition table. Without this patch, fdisk -l on the host would
>>> return different results then fdisk -l in the guest.
>>
>> Can you provide an example?
> 
> scsi disk in the host:
> [root@host ~]#  sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593
> /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track
> 
> same scsi disk in the guest as virtio-blk:
> scsi disk in the guest:
> [root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20
> /dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track
> 
> 16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table.
> 
> command line was:
> -drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native

But this risks changing whenever you move data from a disk to another
disk, or if you move a virtual DASD disk from physical DASD to physical
SCSI.

If s390 is so sensitive to the head count and number of sectors/track
(on x86 everything is done with LBAs nowadays, even in the firmware), I
think whatever management layer you use should always specify them
explicitly.

I'm not saying this patch shouldn't be included---but it should be
treated as a handy thing for developers rather than as a definitive fix.

Paolo
Christian Borntraeger Aug. 25, 2014, 8:21 a.m. UTC | #4
On 20/08/14 15:10, Paolo Bonzini wrote:
> Il 20/08/2014 11:35, Christian Borntraeger ha scritto:
>> On 20/08/14 10:19, Paolo Bonzini wrote:
>>> Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto:
>>>> The new HDIO_GETGEO logic is required for two use cases:
>>>> a) Support for geometries of Direct Attached Storage Disks (DASD)
>>>> on s390x configured as backing of virtio block devices.
>>>
>>> Is this still relevant now that QEMU can emulate 512-byte sectors on top
>>> of host devices with 4k sectors?
>>
>> Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures).
>>
>> Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no?
>>
>>>
>>>> b) Support for FCP attached SCSI disks that do not yet have a
>>>> partition table. Without this patch, fdisk -l on the host would
>>>> return different results then fdisk -l in the guest.
>>>
>>> Can you provide an example?
>>
>> scsi disk in the host:
>> [root@host ~]#  sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593
>> /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track
>>
>> same scsi disk in the guest as virtio-blk:
>> scsi disk in the guest:
>> [root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20
>> /dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track
>>
>> 16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table.
>>
>> command line was:
>> -drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native
> 
> But this risks changing whenever you move data from a disk to another
> disk, or if you move a virtual DASD disk from physical DASD to physical
> SCSI.
> 
> If s390 is so sensitive to the head count and number of sectors/track
> (on x86 everything is done with LBAs nowadays, even in the firmware), I
> think whatever management layer you use should always specify them
> explicitly.

Only the DASD disks are so sensitive. The SCSI geometry is just a cosmetic thing, but it doesnt hurt. So for anything that comes via FCP SCSI we dont have a series issue besides the cosmetic thing. (Well, unless you have a storage server with 4k scsi disks, then it also becomes an issue on x86 - already seen on Flash Systems and other storage servers).
> 
> I'm not saying this patch shouldn't be included---but it should be
> treated as a handy thing for developers rather than as a definitive fix.

Yes. I would even suggest, that for image files you better use a SCSI disk image (which then has the normal partition layout as x86). 
Of course, there are reasons to have image files to hold DASD images, but then you have to manually specify geo/ss. In fact, libvirt always supported to specify the geometry and Viktor from our team did the sector size support in libvirt with:

commit 5cc50ad7a4e139261079a5848859c84cab3b0c7c
Author:     Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
AuthorDate: Wed Aug 29 17:48:30 2012 +0200
Commit:     Eric Blake <eblake@redhat.com>
CommitDate: Fri Aug 31 11:27:27 2012 -0700

    conf: Support for Block Device IO Limits
    
    Introducing a new iolimits element allowing to override certain
    properties of a guest block device like the physical and logical
    block size.
    This can be useful for platforms with 'non-standard' disk formats
    like S390 DASD with its 4K block size.
    
    Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>


commit 277a49bce73da908c965e466b03f5fc97f04cae1
Author:     Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
AuthorDate: Wed Aug 29 17:48:31 2012 +0200
Commit:     Eric Blake <eblake@redhat.com>
CommitDate: Fri Aug 31 11:27:47 2012 -0700

    qemu: Support for Block Device IO Limits.
    
    Implementation of iolimits for the qemu driver with
    capability probing for block size attribute and
    command line generation for block sizes.
    Including testcase for qemuxml2argvtest.
    
    Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>

some time ago.

So this detection is mostly important for DASD disk passthrough as the defaults are plainly wrong for those disks. (Under LPAR and z/VM the disks work out of the box, so a proper detection really helps).
As far as I can see the patches only affect disks, that are "raw" or "host_device". So it would be good, if Kevin and Stefan could take a look at the overall design. If they/you come up with a totally different approach, that is also fine.
Kate was away last week, so hopefully she can continue to drive the discussion with you folks, when she is back.

Christian
diff mbox

Patch

diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index bf46f03..e4f6205 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -1,4 +1,4 @@ 
-common-obj-y += block.o cdrom.o hd-geometry.o
+common-obj-y += block.o cdrom.o
 common-obj-$(CONFIG_FDC) += fdc.o
 common-obj-$(CONFIG_SSI_M25P80) += m25p80.o
 common-obj-$(CONFIG_NAND) += nand.o
@@ -13,3 +13,7 @@  obj-$(CONFIG_SH4) += tc58128.o
 
 obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
+
+# geometry is target/architecture dependent and therefore needs to be built
+# for every target platform
+obj-y += hd-geometry.o
diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c
index 6feb4f8..7988264 100644
--- a/hw/block/hd-geometry.c
+++ b/hw/block/hd-geometry.c
@@ -33,6 +33,10 @@ 
 #include "block/block.h"
 #include "hw/block/block.h"
 #include "trace.h"
+#ifdef __linux__
+#include <linux/fs.h>
+#include <linux/hdreg.h>
+#endif
 
 struct partition {
         uint8_t boot_ind;           /* 0x80 - active */
@@ -47,6 +51,37 @@  struct partition {
         uint32_t nr_sects;          /* nr of sectors in partition */
 } QEMU_PACKED;
 
+static void guess_chs_for_size(BlockDriverState *bs, uint32_t *pcyls,
+                               uint32_t *pheads, uint32_t *psecs);
+
+/* try to get geometry from disk via HDIO_GETGEO ioctl
+   Return 0 if OK, -1 if ioctl does not work (e.g. image file) */
+static inline int guess_disk_pchs(BlockDriverState *bs, uint32_t *pcylinders,
+                                  uint32_t *pheads, uint32_t *psectors)
+{
+#ifdef __linux__
+    struct hd_geometry geo;
+
+    if (bdrv_ioctl(bs, HDIO_GETGEO, &geo)) {
+        return -1;
+    }
+
+    /* HDIO_GETGEO may return success even though geo contains zeros
+       (e.g. certain multipath setups) */
+    if (!geo.heads || !geo.sectors || !geo.cylinders) {
+        return -1;
+    }
+
+    *pheads = geo.heads;
+    *psectors = geo.sectors;
+    *pcylinders = geo.cylinders;
+    return 0;
+#else
+    return -1;
+#endif
+}
+
+
 /* try to guess the disk logical geometry from the MSDOS partition table.
    Return 0 if OK, -1 if could not guess */
 static int guess_disk_lchs(BlockDriverState *bs,
@@ -116,6 +151,26 @@  static void guess_chs_for_size(BlockDriverState *bs,
     *psecs = 63;
 }
 
+#ifdef TARGET_S390X
+void hd_geometry_guess(BlockDriverState *bs, uint32_t *pcyls, uint32_t *pheads,
+                       uint32_t *psecs, int *ptrans)
+{
+    if (guess_disk_lchs(bs, (int *)pcyls, (int *)pheads, (int *)psecs) == 0) {
+        goto done;
+    }
+    if (guess_disk_pchs(bs, pcyls, pheads, psecs) == 0) {
+        goto done;
+    }
+    /* still no geometry - try to guess from disk size */
+    guess_chs_for_size(bs, pcyls, pheads, psecs);
+done:
+    if (ptrans) {
+        *ptrans = BIOS_ATA_TRANSLATION_NONE;
+    }
+    trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs,
+                            BIOS_ATA_TRANSLATION_NONE);
+}
+#else
 void hd_geometry_guess(BlockDriverState *bs,
                        uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs,
                        int *ptrans)
@@ -148,6 +203,7 @@  void hd_geometry_guess(BlockDriverState *bs,
     }
     trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation);
 }
+#endif
 
 int hd_bios_chs_auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs)
 {