Message ID | 1363362619-3190-3-git-send-email-pl@kamp.de |
---|---|
State | New |
Headers | show |
On 03/15/2013 09:50 AM, Peter Lieven wrote: > Signed-off-by: Peter Lieven <pl@kamp.de> > --- > include/qemu-common.h | 2 ++ > util/cutils.c | 40 ++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 42 insertions(+) > > > +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 Good. > + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR > + * sizeof(VECTYPE)) == 0); Good use of it. > + assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0); > + > + if (*((const long *) buf)) { > + return 0; > + } > + > + for (i = 0; i < len / sizeof(VECTYPE); i += 8) { Magic number 8, instead of reusing BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR. > + VECTYPE tmp0 = p[i+0] | p[i+1]; Indentation looks off, because you used TABs. Spaces around binary operators (two instances of '+' per line). > + VECTYPE tmp1 = p[i+2] | p[i+3]; > + VECTYPE tmp2 = p[i+4] | p[i+5]; > + VECTYPE tmp3 = p[i+6] | p[i+7]; > + VECTYPE tmp01 = tmp0 | tmp1; > + VECTYPE tmp23 = tmp2 | tmp3; > + if (!ALL_EQ(tmp01 | tmp23, zero)) { > + break; > + } > + } > + return i * sizeof(VECTYPE); > +} Algorithm looks correct, but worth a respin to fix the appearance.
Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>: > On 03/15/2013 09:50 AM, Peter Lieven wrote: >> Signed-off-by: Peter Lieven <pl@kamp.de> >> --- >> include/qemu-common.h | 2 ++ >> util/cutils.c | 40 ++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 42 insertions(+) >> > >> >> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 > > Good. > >> + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR >> + * sizeof(VECTYPE)) == 0); > > Good use of it. A question I had is if these asserts make the code slower? In case of all the 4k pages in RAM migration this could be significant. Peter
On 03/19/2013 10:18 AM, Peter Lieven wrote: > > Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>: > >> On 03/15/2013 09:50 AM, Peter Lieven wrote: >>> Signed-off-by: Peter Lieven <pl@kamp.de> >>> --- >>> include/qemu-common.h | 2 ++ >>> util/cutils.c | 40 ++++++++++++++++++++++++++++++++++++++++ >>> 2 files changed, 42 insertions(+) >>> >> >>> >>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 >> >> Good. >> >>> + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR >>> + * sizeof(VECTYPE)) == 0); >> >> Good use of it. > > A question I had is if these asserts make the code slower? In case of all > the 4k pages in RAM migration this could be significant. Yes, it probably does slow things down when asserts are enabled. I'm not sure whether it is better to keep the code forcefully robust, or to rely on auditing all callers for properly obeying assumptions. Maybe you should wait for an opinion from one of the maintainers.
Am 19.03.2013 um 17:43 schrieb Eric Blake <eblake@redhat.com>: > On 03/19/2013 10:18 AM, Peter Lieven wrote: >> >> Am 19.03.2013 um 16:54 schrieb Eric Blake <eblake@redhat.com>: >> >>> On 03/15/2013 09:50 AM, Peter Lieven wrote: >>>> Signed-off-by: Peter Lieven <pl@kamp.de> >>>> --- >>>> include/qemu-common.h | 2 ++ >>>> util/cutils.c | 40 ++++++++++++++++++++++++++++++++++++++++ >>>> 2 files changed, 42 insertions(+) >>>> >>> >>>> >>>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 >>> >>> Good. >>> >>>> + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR >>>> + * sizeof(VECTYPE)) == 0); >>> >>> Good use of it. >> >> A question I had is if these asserts make the code slower? In case of all >> the 4k pages in RAM migration this could be significant. > > Yes, it probably does slow things down when asserts are enabled. I'm > not sure whether it is better to keep the code forcefully robust, or to > rely on auditing all callers for properly obeying assumptions. Maybe > you should wait for an opinion from one of the maintainers. If I follow your idea to add a can_use_buffer_find_nonzero_offset() function and add it to the description of the function my feeling is that we can drop the asserts. Actually we are always checking the same things twice. Peter > > -- > Eric Blake eblake redhat com +1-919-301-3266 > Libvirt virtualization library http://libvirt.org >
diff --git a/include/qemu-common.h b/include/qemu-common.h index b59328f..51a7677 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -362,6 +362,8 @@ size_t qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset, size_t qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int fillc, size_t bytes); +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8 +size_t buffer_find_nonzero_offset(const void *buf, size_t len); bool buffer_is_zero(const void *buf, size_t len); void qemu_progress_init(int enabled, float min_skip); diff --git a/util/cutils.c b/util/cutils.c index 1439da4..857dd7d 100644 --- a/util/cutils.c +++ b/util/cutils.c @@ -143,6 +143,46 @@ int qemu_fdatasync(int fd) } /* + * Searches for an area with non-zero content in a buffer + * + * Attention! The len must be a multiple of 8 * sizeof(VECTYPE) + * and addr must be a multiple of sizeof(VECTYPE) due to + * restriction of optimizations in this function. + * + * The return value is the offset of the non-zero area rounded + * down to 8 * sizeof(VECTYPE). If the buffer is all zero + * the return value is equal to len. + */ + +size_t buffer_find_nonzero_offset(const void *buf, size_t len) +{ + VECTYPE *p = (VECTYPE *)buf; + VECTYPE zero = ZERO_SPLAT; + size_t i; + + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR + * sizeof(VECTYPE)) == 0); + assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0); + + if (*((const long *) buf)) { + return 0; + } + + for (i = 0; i < len / sizeof(VECTYPE); i += 8) { + VECTYPE tmp0 = p[i+0] | p[i+1]; + VECTYPE tmp1 = p[i+2] | p[i+3]; + VECTYPE tmp2 = p[i+4] | p[i+5]; + VECTYPE tmp3 = p[i+6] | p[i+7]; + VECTYPE tmp01 = tmp0 | tmp1; + VECTYPE tmp23 = tmp2 | tmp3; + if (!ALL_EQ(tmp01 | tmp23, zero)) { + break; + } + } + return i * sizeof(VECTYPE); +} + +/* * Checks if a buffer is all zeroes * * Attention! The len must be a multiple of 4 * sizeof(long) due to
Signed-off-by: Peter Lieven <pl@kamp.de> --- include/qemu-common.h | 2 ++ util/cutils.c | 40 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+)