Patchwork [v3,1/6] cutils: extract buffer_is_zero() from qemu-img.c

Submitter Stefan Hajnoczi
Date Dec. 21, 2011, 4 p.m.
Message ID <1324483240-31726-2-git-send-email-stefanha@linux.vnet.ibm.com>
Permalink /patch/132677/
State New

Comments

Stefan Hajnoczi - Dec. 21, 2011, 4 p.m.
The qemu-img.c:is_not_zero() function checks if a buffer contains all
zeroes.  This function will come in handy for zero-detection in the
block layer, so clean it up and move it to cutils.c.

Note that the function now returns true if the buffer is all zeroes.
This avoids the double-negatives (i.e. !is_not_zero()) that the old
function can cause in callers.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 cutils.c      |   34 ++++++++++++++++++++++++++++++++++
 qemu-common.h |    2 ++
 qemu-img.c    |   46 +++++++---------------------------------------
 3 files changed, 43 insertions(+), 39 deletions(-)
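
A minimal standalone sketch of the inverted return value described above, assuming a C99 compiler. buffer_is_zero() keeps the patch's length restriction, but loads words with memcpy() so that the sketch does not depend on buffer alignment (the patch itself casts the pointer directly). leading_sector_run() and SECTOR_SIZE are hypothetical names modelled on is_allocated_sectors(); they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512 /* the sector size hard-coded in the patch */

/*
 * Same contract as the patch: len must be a multiple of 4 * sizeof(long).
 * Unlike the patch, words are loaded with memcpy() so the sketch does not
 * assume that buf is long-aligned.
 */
bool buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    long d[4];
    size_t i;

    assert(len % sizeof(d) == 0);

    for (i = 0; i < len; i += sizeof(d)) {
        memcpy(d, p + i, sizeof(d));
        if (d[0] || d[1] || d[2] || d[3]) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical caller modelled on is_allocated_sectors(): counts how many
 * leading sectors share the zero/non-zero state of the first sector and
 * reports that state through *is_zero. Returns 0 when n <= 0, in which
 * case *is_zero is simply set to true.
 */
int leading_sector_run(const uint8_t *buf, int n, bool *is_zero)
{
    int i;

    if (n <= 0) {
        *is_zero = true;
        return 0;
    }
    *is_zero = buffer_is_zero(buf, SECTOR_SIZE);
    for (i = 1; i < n; i++) {
        buf += SECTOR_SIZE;
        if (*is_zero != buffer_is_zero(buf, SECTOR_SIZE)) {
            break;
        }
    }
    return i;
}
```

With the positive sense, a caller that wants the old "allocated" answer writes a single negation (!is_zero) instead of comparing against !is_not_zero().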
Eric Blake - Dec. 21, 2011, 4:43 p.m.
On 12/21/2011 09:00 AM, Stefan Hajnoczi wrote:
> The qemu-img.c:is_not_zero() function checks if a buffer contains all
> zeroes.  This function will come in handy for zero-detection in the
> block layer, so clean it up and move it to cutils.c.
> 
> Note that the function now returns true if the buffer is all zeroes.
> This avoids the double-negatives (i.e. !is_not_zero()) that the old
> function can cause in callers.

Are there plans to improve the efficiency of buffer_is_zero to take
advantage of metadata about sparseness?

That is, there are cases where we can use metadata to prove a region of
a file is sparse, without having to read every byte within that region.
 Now that this series is giving QED special metadata that marks a zero
cluster, it is faster to query if that metadata exists denoting a zero
cluster than it is to read the entire cluster and check for non-zero.
Likewise, with regular files, the kernel provides lseek(SEEK_HOLE) (or
the older, lower-level, ioctl(FS_IOC_FIEMAP)); which at least GNU
coreutils is using for efficient sparse detection in source files.
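
The lseek() approach mentioned above could look roughly like the following sketch. It assumes Linux with glibc, where SEEK_DATA is exposed under _GNU_SOURCE, and a filesystem that reports holes; file_range_is_hole() is a hypothetical helper, not a QEMU or coreutils function. A filesystem without hole support may report the whole file as data, so a result of 0 does not prove that the range contains non-zero bytes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes SEEK_DATA/SEEK_HOLE under this macro */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch.
 * Returns 1 if [start, start + len) contains no data extent,
 *         0 if a data extent overlaps the range,
 *        -1 if the answer is unknown (error, or no SEEK_DATA support).
 * A start offset at or beyond end-of-file also yields 1, because lseek()
 * reports ENXIO there and no data can be read from that range.
 */
int file_range_is_hole(const char *path, off_t start, off_t len)
{
    int fd;
    int result = -1;

    assert(start >= 0 && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
#ifdef SEEK_DATA
    {
        /* Find the first data extent at or after start. */
        off_t data = lseek(fd, start, SEEK_DATA);

        if (data == (off_t)-1) {
            /* ENXIO: no data at or after start. */
            result = (errno == ENXIO) ? 1 : -1;
        } else {
            result = (data >= start + len) ? 1 : 0;
        }
    }
#endif
    close(fd);
    return result;
}
```

A caller would treat 1 as "known zero" and fall back to reading and scanning the buffer for 0 or -1.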
Stefan Hajnoczi - Dec. 22, 2011, 7:47 a.m.
On Wed, Dec 21, 2011 at 09:43:55AM -0700, Eric Blake wrote:
> On 12/21/2011 09:00 AM, Stefan Hajnoczi wrote:
> > The qemu-img.c:is_not_zero() function checks if a buffer contains all
> > zeroes.  This function will come in handy for zero-detection in the
> > block layer, so clean it up and move it to cutils.c.
> > 
> > Note that the function now returns true if the buffer is all zeroes.
> > This avoids the double-negatives (i.e. !is_not_zero()) that the old
> > function can cause in callers.
> 
> Are there plans to improve the efficiency of buffer_is_zero to take
> advantage of metadata about sparseness?
> 
> That is, there are cases where we can use metadata to prove a region of
> a file is sparse, without having to read every byte within that region.
>  Now that this series is giving QED special metadata that marks a zero
> cluster, it is faster to query if that metadata exists denoting a zero
> cluster than it is to read the entire cluster and check for non-zero.
> Likewise, with regular files, the kernel provides lseek(SEEK_HOLE) (or
> the older, lower-level, ioctl(FS_IOC_FIEMAP)); which at least GNU
> coreutils is using for efficient sparse detection in source files.

Yes, there are ways to optimize this for specific storage backends.  But
we need a code path that supports all storage systems first.  For
example, raw files over NFS or an image file over HTTP (curl).

In the case of qcow2 or QED backing files we already don't read zeroes
today.  Instead we memset the read buffer to zero and then waste CPU
cycles doing buffer_is_zero() detection.  At least this means that file
I/O (and network I/O, if using NFS) is already optimal if your backing
file is qcow2 or QED - it's just the CPU cycles that we can optimize
away.

Stefan
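
The CPU cycles mentioned above could in principle be saved by letting a format driver say that it filled the buffer with memset(), so that the generic scan is skipped. The sketch below only illustrates that idea; ReadResult, its known_zero flag, and read_result_is_zero() are hypothetical and do not correspond to any QEMU structure or function. It uses a plain byte loop rather than the unrolled loop from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical read result; not a QEMU structure. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    bool known_zero; /* set by a driver that produced buf via memset() */
} ReadResult;

/*
 * Generic path: scan every byte, which works for any storage backend.
 * Fast path: trust the driver's metadata and skip the scan entirely.
 */
bool read_result_is_zero(const ReadResult *r)
{
    size_t i;

    assert(r != NULL);

    if (r->known_zero) {
        return true;
    }
    for (i = 0; i < r->len; i++) {
        if (r->buf[i] != 0) {
            return false;
        }
    }
    return true;
}
```

The scan remains the fallback for backends such as raw files over NFS, where no such metadata is available.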

Patch

diff --git a/cutils.c b/cutils.c
index 24b3fe3..c9560c3 100644
--- a/cutils.c
+++ b/cutils.c
@@ -301,6 +301,40 @@  void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
     }
 }
 
+/*
+ * Checks if a buffer is all zeroes
+ *
+ * Attention! The len must be a multiple of 4 * sizeof(long) due to
+ * restriction of optimizations in this function.
+ */
+bool buffer_is_zero(const void *buf, size_t len)
+{
+    /*
+     * Use long as the biggest available internal data type that fits into the
+     * CPU register and unroll the loop to smooth out the effect of memory
+     * latency.
+     */
+
+    size_t i;
+    long d0, d1, d2, d3;
+    const long * const data = buf;
+
+    len /= sizeof(long);
+
+    for (i = 0; i < len; i += 4) {
+        d0 = data[i + 0];
+        d1 = data[i + 1];
+        d2 = data[i + 2];
+        d3 = data[i + 3];
+
+        if (d0 || d1 || d2 || d3) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 #ifndef _WIN32
 /* Sets a specific flag */
 int fcntl_setfl(int fd, int flag)
diff --git a/qemu-common.h b/qemu-common.h
index b2de015..95fa2b2 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -293,6 +293,8 @@  void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count);
 void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
                             size_t skip);
 
+bool buffer_is_zero(const void *buf, size_t len);
+
 void qemu_progress_init(int enabled, float min_skip);
 void qemu_progress_end(void);
 void qemu_progress_print(float delta, int max);
diff --git a/qemu-img.c b/qemu-img.c
index 01cc0d3..c4bcf41 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -515,40 +515,6 @@  static int img_commit(int argc, char **argv)
 }
 
 /*
- * Checks whether the sector is not a zero sector.
- *
- * Attention! The len must be a multiple of 4 * sizeof(long) due to
- * restriction of optimizations in this function.
- */
-static int is_not_zero(const uint8_t *sector, int len)
-{
-    /*
-     * Use long as the biggest available internal data type that fits into the
-     * CPU register and unroll the loop to smooth out the effect of memory
-     * latency.
-     */
-
-    int i;
-    long d0, d1, d2, d3;
-    const long * const data = (const long *) sector;
-
-    len /= sizeof(long);
-
-    for(i = 0; i < len; i += 4) {
-        d0 = data[i + 0];
-        d1 = data[i + 1];
-        d2 = data[i + 2];
-        d3 = data[i + 3];
-
-        if (d0 || d1 || d2 || d3) {
-            return 1;
-        }
-    }
-
-    return 0;
-}
-
-/*
  * Returns true iff the first sector pointed to by 'buf' contains at least
  * a non-NUL byte.
  *
@@ -557,20 +523,22 @@  static int is_not_zero(const uint8_t *sector, int len)
  */
 static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
 {
-    int v, i;
+    bool is_zero;
+    int i;
 
     if (n <= 0) {
         *pnum = 0;
         return 0;
     }
-    v = is_not_zero(buf, 512);
+    is_zero = buffer_is_zero(buf, 512);
     for(i = 1; i < n; i++) {
         buf += 512;
-        if (v != is_not_zero(buf, 512))
+        if (is_zero != buffer_is_zero(buf, 512)) {
             break;
+        }
     }
     *pnum = i;
-    return v;
+    return !is_zero;
 }
 
 /*
@@ -955,7 +923,7 @@  static int img_convert(int argc, char **argv)
             if (n < cluster_sectors) {
                 memset(buf + n * 512, 0, cluster_size - n * 512);
             }
-            if (is_not_zero(buf, cluster_size)) {
+            if (!buffer_is_zero(buf, cluster_size)) {
                 ret = bdrv_write_compressed(out_bs, sector_num, buf,
                                             cluster_sectors);
                 if (ret != 0) {