block: Use bdrv functions to replace file operation in qcow.c

Message ID 1321607573-29744-1-git-send-email-zhihuili@linux.vnet.ibm.com
State New

Commit Message

Zhi Hui Li Nov. 18, 2011, 9:12 a.m. UTC
The plain file operation functions lack error detection and issue many more
I/O syscalls, so replace them with the bdrv family of functions and reduce
the number of I/O requests.

Signed-off-by: Li Zhi Hui <zhihuili@linux.vnet.ibm.com>
---
 block/qcow.c |   42 +++++++++++++++++++++++-------------------
 1 files changed, 23 insertions(+), 19 deletions(-)

Comments

Stefan Hajnoczi Nov. 18, 2011, 10:59 a.m. UTC | #1
On Fri, Nov 18, 2011 at 9:12 AM, Li Zhi Hui <zhihuili@linux.vnet.ibm.com> wrote:
> +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
> +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
> +    g_free(tmp);
> +    if (ret != sizeof(uint64_t)*l1_size) {
> +        goto exit;
>     }

qemu-img create -f qcow test.qcow 100T

>>> 100 * 1024 * 1024 * 1024 * 1024 / ((1 << 12) * (1 << 9))
52428800
>>> 52428800 * 8
419430400

That means 400 MB of RAM for the zero L1 table for a 100 TB image.
Since qcow is a legacy format this probably doesn't matter in practice
but in theory this approach can require a noticeable amount of RAM.

Looks okay to me.

Stefan
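
The estimate in the message above can be reproduced with a small helper. This is only a sketch: the function name is hypothetical, and it assumes 8-byte L1 entries and the cluster_bits = 12, l2_bits = 9 values implied by the (1 << 12) * (1 << 9) divisor in that calculation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a zero-filled L1 table (hypothetical helper).
 * Assumption: each L1 entry is a uint64_t and covers
 * (1 << cluster_bits) * (1 << l2_bits) bytes of virtual disk.
 * virtual_size is expected to be far below UINT64_MAX, so the
 * round-up addition below does not overflow.
 */
uint64_t qcow1_l1_table_bytes(uint64_t virtual_size,
                              unsigned cluster_bits,
                              unsigned l2_bits)
{
    unsigned shift = cluster_bits + l2_bits;
    uint64_t l1_entries;

    assert(shift < 64);

    /* Round up so that a partially used last L2 table still gets an entry. */
    l1_entries = (virtual_size + (UINT64_C(1) << shift) - 1) >> shift;

    return l1_entries * sizeof(uint64_t);
}
```

With these assumptions, qcow1_l1_table_bytes(UINT64_C(100) << 40, 12, 9) yields 419430400 bytes (the 400 MB figure), and a 1 TB image yields 4194304 bytes, i.e. 4 MB per TB.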
Paolo Bonzini Nov. 18, 2011, 11:10 a.m. UTC | #2
On 11/18/2011 11:59 AM, Stefan Hajnoczi wrote:
> > +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
> > +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
> > +    g_free(tmp);
> > +    if (ret != sizeof(uint64_t)*l1_size) {
> > +        goto exit;
> >     }
>
> That means 400 MB of RAM for the zero L1 table for a 100 TB image.
> Since qcow is a legacy format this probably doesn't matter in practice
> but in theory this approach can require a noticeable amount of RAM.

4 MB / TB is not a big deal (you probably would like the L1 table to be 
in memory all the time), but why write the L1 table at all?  Since the 
file was CREATed, it is already zero and you can just leave a hole in 
the file.

Paolo
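
The "leave a hole" idea from the message above can be sketched with plain POSIX calls rather than the bdrv API. The function names are hypothetical. The sketch relies on POSIX specifying that a regular file extended with ftruncate() reads back as zeros in the new range; whether that range is actually stored as a hole depends on the file system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) a regular file and extend it to `size` bytes
 * without writing any data. Returns 0 on success, -1 with errno set.
 */
int create_zero_extended_file(const char *path, off_t size)
{
    int fd, ret, saved_errno;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    ret = ftruncate(fd, size);
    saved_errno = errno;          /* close() may clobber errno */
    close(fd);
    errno = saved_errno;
    return ret;
}

/* Return 1 if `len` bytes at `offset` read back as zero, 0 if not, -1 on error. */
int range_reads_as_zero(const char *path, off_t offset, size_t len)
{
    unsigned char buf[4096];
    int result = 1;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    while (len > 0 && result == 1) {
        size_t want = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t got = pread(fd, buf, want, offset);

        if (got <= 0) {           /* read error or unexpected end of file */
            result = -1;
            break;
        }
        for (ssize_t i = 0; i < got; i++) {
            if (buf[i] != 0) {
                result = 0;
                break;
            }
        }
        offset += got;
        len -= (size_t)got;
    }
    close(fd);
    return result;
}
```

This only holds for regular files; it says nothing about block devices or other protocols, which is the concern raised later in the thread.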
Stefan Hajnoczi Nov. 18, 2011, 2:34 p.m. UTC | #3
On Fri, Nov 18, 2011 at 11:10 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 11/18/2011 11:59 AM, Stefan Hajnoczi wrote:
>>
>>> +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
>>> +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
>>> +    g_free(tmp);
>>> +    if (ret != sizeof(uint64_t)*l1_size) {
>>> +        goto exit;
>>>     }
>>
>> That means 400 MB of RAM for the zero L1 table for a 100 TB image.
>> Since qcow is a legacy format this probably doesn't matter in practice
>> but in theory this approach can require a noticeable amount of RAM.
>
> 4 MB / TB is not a big deal (you probably would like the L1 table to be in
> memory all the time), but why write the L1 table at all?  Since the file was
> CREATed, it is already zero and you can just leave a hole in the file.

I thought the same thing, then remembered that sometimes people want to
use image formats on block devices.  I think at least making image
creation not depend on has_zero_init is a good idea.

Stefan
Kevin Wolf Nov. 21, 2011, 10:44 a.m. UTC | #4
Am 18.11.2011 15:34, schrieb Stefan Hajnoczi:
> On Fri, Nov 18, 2011 at 11:10 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 11/18/2011 11:59 AM, Stefan Hajnoczi wrote:
>>>
>>>> +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
>>>> +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
>>>> +    g_free(tmp);
>>>> +    if (ret != sizeof(uint64_t)*l1_size) {
>>>> +        goto exit;
>>>>     }
>>>
>>> That means 400 MB of RAM for the zero L1 table for a 100 TB image.
>>> Since qcow is a legacy format this probably doesn't matter in practice
>>> but in theory this approach can require a noticeable amount of RAM.
>>
>> 4 MB / TB is not a big deal (you probably would like the L1 table to be in
>> memory all the time), but why write the L1 table at all?  Since the file was
>> CREATed, it is already zero and you can just leave a hole in the file.
> 
> I thought the same thing, then remembered that sometimes people want to
> use image formats on block devices.  I think at least making image
> creation not depend on has_zero_init is a good idea.

qcow1 doesn't work on block devices anyway.

Kevin
Stefan Hajnoczi Nov. 21, 2011, 10:53 a.m. UTC | #5
On Mon, Nov 21, 2011 at 10:44 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 18.11.2011 15:34, schrieb Stefan Hajnoczi:
>> On Fri, Nov 18, 2011 at 11:10 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 11/18/2011 11:59 AM, Stefan Hajnoczi wrote:
>>>>
>>>>> +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
>>>>> +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
>>>>> +    g_free(tmp);
>>>>> +    if (ret != sizeof(uint64_t)*l1_size) {
>>>>> +        goto exit;
>>>>>     }
>>>>
>>>> That means 400 MB of RAM for the zero L1 table for a 100 TB image.
>>>> Since qcow is a legacy format this probably doesn't matter in practice
>>>> but in theory this approach can require a noticeable amount of RAM.
>>>
>>> 4 MB / TB is not a big deal (you probably would like the L1 table to be in
>>> memory all the time), but why write the L1 table at all?  Since the file was
>>> CREATed, it is already zero and you can just leave a hole in the file.
>>
>> I thought the same thing, then remembered that sometimes people want to
>> use image formats on block devices.  I think at least making image
>> creation not depend on has_zero_init is a good idea.
>
> qcow1 doesn't work on block devices anyway.

Okay, both of my original points were moot; Kevin and Paolo have explained why:

The L1 RAM size issue doesn't really matter since we hold the entire
L1 in RAM during normal operation anyway.  Holding it in RAM during
creation is no worse.

The zero initialization could be optimized as Paolo suggested with
truncate since qcow1 always works on image files (which have automatic
zero initialization).

I'm happy with this patch.

Stefan
Kevin Wolf Nov. 21, 2011, 11:08 a.m. UTC | #6
Am 21.11.2011 11:53, schrieb Stefan Hajnoczi:
> On Mon, Nov 21, 2011 at 10:44 AM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 18.11.2011 15:34, schrieb Stefan Hajnoczi:
>>> On Fri, Nov 18, 2011 at 11:10 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 11/18/2011 11:59 AM, Stefan Hajnoczi wrote:
>>>>>
>>>>>> +    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
>>>>>> +    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
>>>>>> +    g_free(tmp);
>>>>>> +    if (ret != sizeof(uint64_t)*l1_size) {
>>>>>> +        goto exit;
>>>>>>     }
>>>>>
>>>>> That means 400 MB of RAM for the zero L1 table for a 100 TB image.
>>>>> Since qcow is a legacy format this probably doesn't matter in practice
>>>>> but in theory this approach can require a noticeable amount of RAM.
>>>>
>>>> 4 MB / TB is not a big deal (you probably would like the L1 table to be in
>>>> memory all the time), but why write the L1 table at all?  Since the file was
>>>> CREATed, it is already zero and you can just leave a hole in the file.
>>>
>>> I thought the same thing, then remembered that sometimes people want to
>>> use image formats on block devices.  I think at least making image
>>> creation not depend on has_zero_init is a good idea.
>>
>> qcow1 doesn't work on block devices anyway.
> 
> Okay, both of my original points were moot; Kevin and Paolo have explained why:
> 
> The L1 RAM size issue doesn't really matter since we hold the entire
> L1 in RAM during normal operation anyway.  Holding it in RAM during
> creation is no worse.
> 
> The zero initialization could be optimized as Paolo suggested with
> truncate since qcow1 always works on image files (which have automatic
> zero initialization).

I didn't say this. :-)

At least in theory, block devices may not be the only protocols with
!has_zero_init. We have only covered raw-posix with this discussion. I
would prefer an explicit write of the table to avoid breaking other
protocols (though I don't think we have one today; curl would be a
candidate, but it is read-only).

Kevin
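
An explicit write of the table, as preferred in the message above, does not have to hold the whole table in memory. The following sketch is not what the patch does (the patch allocates the full table with g_malloc0 and issues a single bdrv_pwrite). It uses POSIX pwrite() on a file descriptor as a stand-in for the block layer, and the function name and 64 KiB chunk size are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Explicitly write `len` zero bytes at `offset`, reusing one small
 * zero-filled buffer instead of allocating `len` bytes at once.
 * Returns 0 on success or a negative errno value.
 */
int write_zeros_at(int fd, off_t offset, uint64_t len)
{
    static const unsigned char zeros[64 * 1024];   /* static storage is zeroed */

    assert(offset >= 0);

    while (len > 0) {
        size_t chunk = len < sizeof(zeros) ? (size_t)len : sizeof(zeros);
        ssize_t written = pwrite(fd, zeros, chunk, offset);

        if (written < 0) {
            if (errno == EINTR) {
                continue;                          /* interrupted, retry */
            }
            return -errno;
        }
        if (written == 0) {
            return -EIO;                           /* no progress was made */
        }
        offset += written;                         /* handle short writes */
        len -= (uint64_t)written;
    }
    return 0;
}
```

The trade-off is more write calls in exchange for a bounded buffer; because the zeros are written rather than assumed, the result does not depend on the backend zero-initialising new files.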
Paolo Bonzini Nov. 21, 2011, 12:48 p.m. UTC | #7
On 11/21/2011 12:08 PM, Kevin Wolf wrote:
> I didn't say this. :-)
>
> At least in theory, block devices may not be the only protocols with
> !has_zero_init. We have only covered raw-posix with this discussion. I
> would prefer an explicit write of the table to avoid breaking other
> protocols (though I don't think we have one today; curl would be a
> candidate, but it is read-only).

Indeed, I'm also fine with the patch.

Paolo

Patch

diff --git a/block/qcow.c b/block/qcow.c
index adecee0..089e79e 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -612,13 +612,14 @@  static void qcow_close(BlockDriverState *bs)
 
 static int qcow_create(const char *filename, QEMUOptionParameter *options)
 {
-    int fd, header_size, backing_filename_len, l1_size, i, shift;
+    int header_size, backing_filename_len, l1_size, shift;
     QCowHeader header;
-    uint64_t tmp;
+    uint8_t *tmp;
     int64_t total_size = 0;
     const char *backing_file = NULL;
     int flags = 0;
     int ret;
+    BlockDriverState *qcow_bs;
 
     /* Read out options */
     while (options && options->name) {
@@ -632,9 +633,16 @@  static int qcow_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
-    if (fd < 0)
-        return -errno;
+    ret = bdrv_create_file(filename, options);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_file_open(&qcow_bs, filename, BDRV_O_RDWR);
+    if (ret < 0) {
+        return ret;
+    }
+
     memset(&header, 0, sizeof(header));
     header.magic = cpu_to_be32(QCOW_MAGIC);
     header.version = cpu_to_be32(QCOW_VERSION);
@@ -670,33 +678,29 @@  static int qcow_create(const char *filename, QEMUOptionParameter *options)
     }
 
     /* write all the data */
-    ret = qemu_write_full(fd, &header, sizeof(header));
+    ret = bdrv_pwrite(qcow_bs, 0, &header, sizeof(header));
     if (ret != sizeof(header)) {
-        ret = -errno;
         goto exit;
     }
 
     if (backing_file) {
-        ret = qemu_write_full(fd, backing_file, backing_filename_len);
+        ret = bdrv_pwrite(qcow_bs, sizeof(header),
+            backing_file, backing_filename_len);
         if (ret != backing_filename_len) {
-            ret = -errno;
             goto exit;
         }
-
     }
-    lseek(fd, header_size, SEEK_SET);
-    tmp = 0;
-    for(i = 0;i < l1_size; i++) {
-        ret = qemu_write_full(fd, &tmp, sizeof(tmp));
-        if (ret != sizeof(tmp)) {
-            ret = -errno;
-            goto exit;
-        }
+
+    tmp = g_malloc0(sizeof(uint64_t)*l1_size);
+    ret = bdrv_pwrite(qcow_bs, header_size, tmp, sizeof(uint64_t)*l1_size);
+    g_free(tmp);
+    if (ret != sizeof(uint64_t)*l1_size) {
+        goto exit;
     }
 
     ret = 0;
 exit:
-    close(fd);
+    bdrv_delete(qcow_bs);
     return ret;
 }