Message ID | 1406636839-11946-2-git-send-email-tumanova@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto: > The new HDIO_GETGEO logic is required for two use cases: > a) Support for geometries of Direct Attached Storage Disks (DASD) > on s390x configured as backing of virtio block devices. Is this still relevant now that QEMU can emulate 512-byte sectors on top of host devices with 4k sectors? > b) Support for FCP attached SCSI disks that do not yet have a > partition table. Without this patch, fdisk -l on the host would > return different results then fdisk -l in the guest. Can you provide an example? Paolo
On 20/08/14 10:19, Paolo Bonzini wrote: > Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto: >> The new HDIO_GETGEO logic is required for two use cases: >> a) Support for geometries of Direct Attached Storage Disks (DASD) >> on s390x configured as backing of virtio block devices. > > Is this still relevant now that QEMU can emulate 512-byte sectors on top > of host devices with 4k sectors? Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures). Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no? > >> b) Support for FCP attached SCSI disks that do not yet have a >> partition table. Without this patch, fdisk -l on the host would >> return different results then fdisk -l in the guest. > > Can you provide an example? scsi disk in the host: [root@host ~]# sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593 /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track same scsi disk in the guest as virtio-blk: scsi disk in the guest: [root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20 /dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track 16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table. command line was: -drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native > > Paolo >
Il 20/08/2014 11:35, Christian Borntraeger ha scritto: > On 20/08/14 10:19, Paolo Bonzini wrote: >> Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto: >>> The new HDIO_GETGEO logic is required for two use cases: >>> a) Support for geometries of Direct Attached Storage Disks (DASD) >>> on s390x configured as backing of virtio block devices. >> >> Is this still relevant now that QEMU can emulate 512-byte sectors on top >> of host devices with 4k sectors? > > Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures). > > Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no? > >> >>> b) Support for FCP attached SCSI disks that do not yet have a >>> partition table. Without this patch, fdisk -l on the host would >>> return different results then fdisk -l in the guest. >> >> Can you provide an example? > > scsi disk in the host: > [root@host ~]# sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593 > /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track > > same scsi disk in the guest as virtio-blk: > scsi disk in the guest: > [root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20 > /dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track > > 16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table. > > command line was: > -drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native But this risks changing whenever you move data from a disk to another disk, or if you move a virtual DASD disk from physical DASD to physical SCSI. If s390 is so sensitive to the head count and number of sectors/track (on x86 everything is done with LBAs nowadays, even in the firmware), I think whatever management layer you use should always specify them explicitly. I'm not saying this patch shouldn't be included---but it should be treated as a handy thing for developers rather than as a definitive fix. Paolo
On 20/08/14 15:10, Paolo Bonzini wrote: > Il 20/08/2014 11:35, Christian Borntraeger ha scritto: >> On 20/08/14 10:19, Paolo Bonzini wrote: >>> Il 29/07/2014 14:27, Ekaterina Tumanova ha scritto: >>>> The new HDIO_GETGEO logic is required for two use cases: >>>> a) Support for geometries of Direct Attached Storage Disks (DASD) >>>> on s390x configured as backing of virtio block devices. >>> >>> Is this still relevant now that QEMU can emulate 512-byte sectors on top >>> of host devices with 4k sectors? >> >> Yes, the geometry and the block size define the layout of the DASD disk format. This defines how the ibm disk layout partition table looks like. So partition detection of the IBM layout only works if values are correct. (see linux block/partitions/ibm.c. it needs these values to correctly locate the data structures). >> >> Furthermore, if you do an mkfs in the host and the guest sees a different block size all kind of strange things will happen when mounting, no? >> >>> >>>> b) Support for FCP attached SCSI disks that do not yet have a >>>> partition table. Without this patch, fdisk -l on the host would >>>> return different results then fdisk -l in the guest. >>> >>> Can you provide an example? >> >> scsi disk in the host: >> [root@host ~]# sfdisk -g /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593 >> /dev/disk/by-id/wwn-0x6005076305ffc1ae0000000000002593: 1011 cylinders, 34 heads, 61 sectors/track >> >> same scsi disk in the guest as virtio-blk: >> scsi disk in the guest: >> [root@guest ~]# sfdisk -g /dev/disk/by-id/virtio-d20 >> /dev/disk/by-id/virtio-d20: 2080 cylinders, 16 heads, 63 sectors/track >> >> 16/63 is currently hardcoded by QEMU, no matter what the host thinks. This gets overridden as soon as there is a partition table. >> >> command line was: >> -drive file=/dev/disk/by-id/scsi-36005076305ffc1ae0000000000002593,if=none,id=drive-virtio-disk20,format=raw,serial=d20,cache=none,aio=native > > But this risks changing whenever you move data from a disk to another > disk, or if you move a virtual DASD disk from physical DASD to physical > SCSI. > > If s390 is so sensitive to the head count and number of sectors/track > (on x86 everything is done with LBAs nowadays, even in the firmware), I > think whatever management layer you use should always specify them > explicitly. Only the DASD disks are so sensitive. The SCSI geometry is just a cosmetic thing, but it doesnt hurt. So for anything that comes via FCP SCSI we dont have a series issue besides the cosmetic thing. (Well, unless you have a storage server with 4k scsi disks, then it also becomes an issue on x86 - already seen on Flash Systems and other storage servers). > > I'm not saying this patch shouldn't be included---but it should be > treated as a handy thing for developers rather than as a definitive fix. Yes. I would even suggest, that for image files you better use a SCSI disk image (which then has the normal partition layout as x86). Of course, there are reasons to have image files to hold DASD images, but then you have to manually specify geo/ss. In fact, libvirt always supported to specify the geometry and Viktor from our team did the sector size support in libvirt with: commit 5cc50ad7a4e139261079a5848859c84cab3b0c7c Author: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> AuthorDate: Wed Aug 29 17:48:30 2012 +0200 Commit: Eric Blake <eblake@redhat.com> CommitDate: Fri Aug 31 11:27:27 2012 -0700 conf: Support for Block Device IO Limits Introducing a new iolimits element allowing to override certain properties of a guest block device like the physical and logical block size. This can be useful for platforms with 'non-standard' disk formats like S390 DASD with its 4K block size. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> commit 277a49bce73da908c965e466b03f5fc97f04cae1 Author: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> AuthorDate: Wed Aug 29 17:48:31 2012 +0200 Commit: Eric Blake <eblake@redhat.com> CommitDate: Fri Aug 31 11:27:47 2012 -0700 qemu: Support for Block Device IO Limits. Implementation of iolimits for the qemu driver with capability probing for block size attribute and command line generation for block sizes. Including testcase for qemuxml2argvtest. Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> some time ago. So this detection is mostly important for DASD disk passthrough as the defaults are plainly wrong for those disks. (Under LPAR and z/VM the disks work out of the box, so a proper detection really helps). As far as I can see the patches only affect disks, that are "raw" or "host_device". So it would be good, if Kevin and Stefan could take a look at the overall design. If they/you come up with a totally different approach, that is also fine. Kate was away last week, so hopefully she can continue to drive the discussion with you folks, when she is back. Christian
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs index bf46f03..e4f6205 100644 --- a/hw/block/Makefile.objs +++ b/hw/block/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += block.o cdrom.o hd-geometry.o +common-obj-y += block.o cdrom.o common-obj-$(CONFIG_FDC) += fdc.o common-obj-$(CONFIG_SSI_M25P80) += m25p80.o common-obj-$(CONFIG_NAND) += nand.o @@ -13,3 +13,7 @@ obj-$(CONFIG_SH4) += tc58128.o obj-$(CONFIG_VIRTIO) += virtio-blk.o obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/ + +# geometry is target/architecture dependent and therefore needs to be built +# for every target platform +obj-y += hd-geometry.o diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c index 6feb4f8..7988264 100644 --- a/hw/block/hd-geometry.c +++ b/hw/block/hd-geometry.c @@ -33,6 +33,10 @@ #include "block/block.h" #include "hw/block/block.h" #include "trace.h" +#ifdef __linux__ +#include <linux/fs.h> +#include <linux/hdreg.h> +#endif struct partition { uint8_t boot_ind; /* 0x80 - active */ @@ -47,6 +51,37 @@ struct partition { uint32_t nr_sects; /* nr of sectors in partition */ } QEMU_PACKED; +static void guess_chs_for_size(BlockDriverState *bs, uint32_t *pcyls, + uint32_t *pheads, uint32_t *psecs); + +/* try to get geometry from disk via HDIO_GETGEO ioctl + Return 0 if OK, -1 if ioctl does not work (e.g. image file) */ +static inline int guess_disk_pchs(BlockDriverState *bs, uint32_t *pcylinders, + uint32_t *pheads, uint32_t *psectors) +{ +#ifdef __linux__ + struct hd_geometry geo; + + if (bdrv_ioctl(bs, HDIO_GETGEO, &geo)) { + return -1; + } + + /* HDIO_GETGEO may return success even though geo contains zeros + (e.g. certain multipath setups) */ + if (!geo.heads || !geo.sectors || !geo.cylinders) { + return -1; + } + + *pheads = geo.heads; + *psectors = geo.sectors; + *pcylinders = geo.cylinders; + return 0; +#else + return -1; +#endif +} + + /* try to guess the disk logical geometry from the MSDOS partition table. Return 0 if OK, -1 if could not guess */ static int guess_disk_lchs(BlockDriverState *bs, @@ -116,6 +151,26 @@ static void guess_chs_for_size(BlockDriverState *bs, *psecs = 63; } +#ifdef TARGET_S390X +void hd_geometry_guess(BlockDriverState *bs, uint32_t *pcyls, uint32_t *pheads, + uint32_t *psecs, int *ptrans) +{ + if (guess_disk_lchs(bs, (int *)pcyls, (int *)pheads, (int *)psecs) == 0) { + goto done; + } + if (guess_disk_pchs(bs, pcyls, pheads, psecs) == 0) { + goto done; + } + /* still no geometry - try to guess from disk size */ + guess_chs_for_size(bs, pcyls, pheads, psecs); +done: + if (ptrans) { + *ptrans = BIOS_ATA_TRANSLATION_NONE; + } + trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, + BIOS_ATA_TRANSLATION_NONE); +} +#else void hd_geometry_guess(BlockDriverState *bs, uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs, int *ptrans) @@ -148,6 +203,7 @@ void hd_geometry_guess(BlockDriverState *bs, } trace_hd_geometry_guess(bs, *pcyls, *pheads, *psecs, translation); } +#endif int hd_bios_chs_auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs) {