Patchwork Guest latency issues due to bdrv_check_byte_request

Submitter Stefan Hajnoczi
Date April 17, 2010, 7:05 p.m.
Message ID <h2gfbd9d3991004171205q6aab7757kf4ad237821ba22f@mail.gmail.com>
Permalink /patch/50386/
State New

Comments

Stefan Hajnoczi - April 17, 2010, 7:05 p.m.
$ strace -cf x86_64-softmmu/qemu-system-x86_64 test.raw

Uncached getlength:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.40    1.944174       13136       148         4 futex
  1.65    0.033259           1     56418      2507 select
  0.39    0.007817           0     81118      5561 read
  0.33    0.006556           0     78787           timer_gettime
  0.31    0.006223           0     56412           timer_settime
  0.26    0.005191           0     47723           lseek
  0.24    0.004924           0     51896           write
  0.24    0.004800           0     51844      2917 rt_sigreturn
  0.17    0.003333         833         4           shmdt
  0.01    0.000175           0       790           poll

Cached getlength:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97.25    2.266124       14715       154         4 futex
  1.03    0.023984           0     57749      3200 select
  0.37    0.008644           0     79926           timer_gettime
  0.29    0.006761           0     82390      6601 read
  0.27    0.006398           0     57900           timer_settime
  0.26    0.006038           0     52503           write
  0.26    0.005985           0     52450      3671 rt_sigreturn
  0.15    0.003418        1139         3           shmdt
  0.10    0.002398           0     23846           lseek
  0.01    0.000216           3        81         4 open

I think there are still a lot of lseeks left because
raw-posix.c:raw_pread_aligned() is implemented using lseek+read
instead of pread.  Does anyone know the reasoning there or could
pread() be used?

Here is the cached getlength hack (I'm not confident that this patch
is correct in all cases, just a quick experiment):


Stefan

Christoph Hellwig - April 17, 2010, 7:40 p.m.
On Sat, Apr 17, 2010 at 08:05:45PM +0100, Stefan Hajnoczi wrote:
> I think there are still a lot of lseeks left because
> raw-posix.c:raw_pread_aligned() is implemented using lseek+read
> instead of pread.  Does anyone know the reasoning there or could
> pread() be used?

There's no good reason for it except maybe compatibility with really
old hosts that do not have pread/pwrite.  But given that AIO support
is now mandatory and the AIO code uses pread/pwritev exclusively we
would have noticed if that were the case by now.
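
For illustration, here is a minimal sketch of the two approaches being
compared. It is not the QEMU code: read_at_lseek() and read_at_pread()
are hypothetical helper names, and error handling is reduced to passing
the syscall result through. The lseek+read variant costs two syscalls
per request and moves the file offset; pread() does the same work in one
syscall and leaves the offset alone.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: position the file offset, then read.
 * Two syscalls per request, and the file offset is modified. */
static ssize_t read_at_lseek(int fd, void *buf, size_t count, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        return -1;
    }
    return read(fd, buf, count);
}

/* Hypothetical helper: read at an explicit offset.
 * One syscall per request, and the file offset is left untouched. */
static ssize_t read_at_pread(int fd, void *buf, size_t count, off_t offset)
{
    return pread(fd, buf, count, offset);
}
```

Under this assumption, each lseek left in the strace output above that
comes from the read path would simply disappear once the call is
replaced by pread().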

Patch

diff --git a/block.c b/block.c
index 0f6be17..447327f 100644
--- a/block.c
+++ b/block.c
@@ -957,13 +957,25 @@  int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
 int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 {
     BlockDriver *drv = bs->drv;
+    int ret;
     if (!drv)
         return -ENOMEDIUM;
     if (!drv->bdrv_truncate)
         return -ENOTSUP;
     if (bs->read_only)
         return -EACCES;
-    return drv->bdrv_truncate(bs, offset);
+    ret = drv->bdrv_truncate(bs, offset);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* refresh total sectors */
+    if (drv->bdrv_getlength) {
+        bs->total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
+    } else {
+        bs->total_sectors = offset >> BDRV_SECTOR_BITS;
+    }
+    return ret;
 }

 /**
@@ -974,8 +986,12 @@  int64_t bdrv_getlength(BlockDriverState *bs)
     BlockDriver *drv = bs->drv;
     if (!drv)
         return -ENOMEDIUM;
-    if (!drv->bdrv_getlength) {
-        /* legacy mode */
+
+    /* Fixed size devices use the total_sectors value for speed instead of
+       issuing a length query (like lseek) on each call.  Also, legacy block
+       drivers don't provide a bdrv_getlength function and must use
+       total_sectors. */
+    if ((bs->total_sectors && !bs->growable) || !drv->bdrv_getlength) {
         return bs->total_sectors * BDRV_SECTOR_SIZE;
     }
     return drv->bdrv_getlength(bs);
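
The condition added in the second hunk can be tried in isolation with a
small stand-alone model. This is only a sketch under stated assumptions:
Device, device_getlength(), slow_query() and the queries counter are
hypothetical stand-ins, not QEMU types or functions, and the sector size
is assumed to be 512 bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BITS 9
#define SECTOR_SIZE (1LL << SECTOR_BITS)

/* Hypothetical stand-in for BlockDriverState, reduced to the fields
 * that the patched condition reads. */
typedef struct Device {
    int64_t total_sectors;                    /* cached size in sectors */
    bool growable;                            /* size may change at runtime */
    int64_t (*query_length)(struct Device *); /* NULL for legacy drivers */
    int queries;                              /* number of length queries issued */
} Device;

/* Hypothetical driver callback standing in for an lseek-style query. */
static int64_t slow_query(Device *d)
{
    d->queries++;
    return 8192;
}

/* Same decision as the patched bdrv_getlength(): use the cached sector
 * count for fixed-size devices and for drivers without a length
 * callback, otherwise ask the driver. */
static int64_t device_getlength(Device *d)
{
    if ((d->total_sectors && !d->growable) || !d->query_length) {
        return d->total_sectors * SECTOR_SIZE;
    }
    return d->query_length(d);
}
```

Note that in this model a fixed-size device whose cached total_sectors is
0 still falls through to the driver query on every call, because the
condition treats 0 as "not cached".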