Patchwork [PATCHv2,2/9] cutils: add a function to find non-zero content in a buffer

login
register
mail settings
Submitter Peter Lieven
Date March 15, 2013, 3:50 p.m.
Message ID <1363362619-3190-3-git-send-email-pl@kamp.de>
Download mbox | patch
Permalink /patch/228150/
State New
Headers show

Comments

Peter Lieven - March 15, 2013, 3:50 p.m.
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 include/qemu-common.h |    2 ++
 util/cutils.c         |   40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)
Eric Blake - March 19, 2013, 3:54 p.m.
On 03/15/2013 09:50 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  include/qemu-common.h |    2 ++
>  util/cutils.c         |   40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 42 insertions(+)
> 

>  
> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8

Good.

> +    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 
> +        * sizeof(VECTYPE)) == 0);

Good use of it.

> +    assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0);
> +
> +    if (*((const long *) buf)) {
> +        return 0;
> +    }
> +
> +    for (i = 0; i < len / sizeof(VECTYPE); i += 8) {

Magic number 8, instead of reusing BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR.

> +		VECTYPE tmp0 = p[i+0] | p[i+1];

Indentation looks off, because you used TABs.

Spaces around binary operators (two instances of '+' per line).

> +		VECTYPE tmp1 = p[i+2] | p[i+3];
> +		VECTYPE tmp2 = p[i+4] | p[i+5];
> +		VECTYPE tmp3 = p[i+6] | p[i+7];
> +		VECTYPE tmp01 = tmp0 | tmp1;
> +		VECTYPE tmp23 = tmp2 | tmp3;
> +		if (!ALL_EQ(tmp01 | tmp23, zero)) {
> +		    break;
> +		}
> +    }
> +    return i * sizeof(VECTYPE);
> +}

Algorithm looks correct, but worth a respin to fix the appearance.
Peter Lieven - March 19, 2013, 4:18 p.m.
Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>:

> On 03/15/2013 09:50 AM, Peter Lieven wrote:
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> include/qemu-common.h |    2 ++
>> util/cutils.c         |   40 ++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 42 insertions(+)
>> 
> 
>> 
>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
> 
> Good.
> 
>> +    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 
>> +        * sizeof(VECTYPE)) == 0);
> 
> Good use of it.

A question I had is if these asserts make the code slower? In case of all
the 4k pages in RAM migration this could be significant.

Peter
Eric Blake - March 19, 2013, 4:43 p.m.
On 03/19/2013 10:18 AM, Peter Lieven wrote:
> 
> Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>:
> 
>> On 03/15/2013 09:50 AM, Peter Lieven wrote:
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>> include/qemu-common.h |    2 ++
>>> util/cutils.c         |   40 ++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 42 insertions(+)
>>>
>>
>>>
>>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
>>
>> Good.
>>
>>> +    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 
>>> +        * sizeof(VECTYPE)) == 0);
>>
>> Good use of it.
> 
> A question I had is if these asserts make the code slower? In case of all
> the 4k pages in RAM migration this could be significant.

Yes, it probably does slow things down when asserts are enabled.  I'm
not sure whether it is better to keep the code forcefully robust, or to
rely on auditing all callers for properly obeying assumptions.  Maybe
you should wait for an opinion from one of the maintainers.
Peter Lieven - March 19, 2013, 7:42 p.m.
Am 19.03.2013 um 17:43 schrieb Eric Blake <eblake@redhat.com>:

> On 03/19/2013 10:18 AM, Peter Lieven wrote:
>> 
>> Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>:
>> 
>>> On 03/15/2013 09:50 AM, Peter Lieven wrote:
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>> include/qemu-common.h |    2 ++
>>>> util/cutils.c         |   40 ++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 42 insertions(+)
>>>> 
>>> 
>>>> 
>>>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
>>> 
>>> Good.
>>> 
>>>> +    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 
>>>> +        * sizeof(VECTYPE)) == 0);
>>> 
>>> Good use of it.
>> 
>> A question I had is if these asserts make the code slower? In case of all
>> the 4k pages in RAM migration this could be significant.
> 
> Yes, it probably does slow things down when asserts are enabled.  I'm
> not sure whether it is better to keep the code forcefully robust, or to
> rely on auditing all callers for properly obeying assumptions.  Maybe
> you should wait for an opinion from one of the maintainers.

If I follow your idea to add a can_use_buffer_find_nonzero_offset() function
and add it to the description of the function my feeling is that we can drop the asserts.
Actually we are always checking the same things twice.

Peter

> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

Patch

diff --git a/include/qemu-common.h b/include/qemu-common.h
index b59328f..51a7677 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -362,6 +362,8 @@  size_t qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,
 size_t qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                          int fillc, size_t bytes);
 
+#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
+size_t buffer_find_nonzero_offset(const void *buf, size_t len);
 bool buffer_is_zero(const void *buf, size_t len);
 
 void qemu_progress_init(int enabled, float min_skip);
diff --git a/util/cutils.c b/util/cutils.c
index 1439da4..857dd7d 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -143,6 +143,46 @@  int qemu_fdatasync(int fd)
 }
 
 /*
+ * Searches for an area with non-zero content in a buffer
+ *
+ * Attention! The len must be a multiple of 8 * sizeof(VECTYPE) 
+ * and addr must be a multiple of sizeof(VECTYPE) due to 
+ * restriction of optimizations in this function.
+ * 
+ * The return value is the offset of the non-zero area rounded
+ * down to 8 * sizeof(VECTYPE). If the buffer is all zero 
+ * the return value is equal to len.
+ */
+
+size_t buffer_find_nonzero_offset(const void *buf, size_t len)
+{
+    VECTYPE *p = (VECTYPE *)buf;
+    VECTYPE zero = ZERO_SPLAT;
+    size_t i;
+    
+    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 
+        * sizeof(VECTYPE)) == 0);
+    assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0);
+
+    if (*((const long *) buf)) {
+        return 0;
+    }
+
+    for (i = 0; i < len / sizeof(VECTYPE); i += 8) {
+		VECTYPE tmp0 = p[i+0] | p[i+1];
+		VECTYPE tmp1 = p[i+2] | p[i+3];
+		VECTYPE tmp2 = p[i+4] | p[i+5];
+		VECTYPE tmp3 = p[i+6] | p[i+7];
+		VECTYPE tmp01 = tmp0 | tmp1;
+		VECTYPE tmp23 = tmp2 | tmp3;
+		if (!ALL_EQ(tmp01 | tmp23, zero)) {
+		    break;
+		}
+    }
+    return i * sizeof(VECTYPE);
+}
+
+/*
  * Checks if a buffer is all zeroes
  *
  * Attention! The len must be a multiple of 4 * sizeof(long) due to