Message ID | 1359564804-20398-1-git-send-email-pbonzini@redhat.com |
---|---|
State | New |
Headers | show |
Am 30.01.2013 17:53, schrieb Paolo Bonzini: > Fixes the build on Mac OS X, which has ffsl. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > include/qemu/bitops.h | 40 ++++++++++++++-------------------------- > include/qemu/hbitmap.h | 2 +- > include/qemu/host-utils.h | 25 ------------------------- > util/hbitmap.c | 2 +- > 4 files changed, 16 insertions(+), 53 deletions(-) > This patch also fixes MinGW / MinGW-w64 builds which don't have a prototype declaration for ffsl. Tested-by: Stefan Weil <sw@weilnetz.de> Nevertheless I wonder why you don't use gcc's __builtin_ffsl. Wouldn't it be easier to call ffsl and rely on the compiler to provide inline code? Then adding the missing declaration to include/sysemu/os-win32.h would be sufficient (for MinGW). Stefan
On 01/30/2013 09:53 AM, Paolo Bonzini wrote: > Fixes the build on Mac OS X, which has ffsl. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > include/qemu/bitops.h | 40 ++++++++++++++-------------------------- > include/qemu/hbitmap.h | 2 +- > include/qemu/host-utils.h | 25 ------------------------- > util/hbitmap.c | 2 +- > 4 files changed, 16 insertions(+), 53 deletions(-) Reviewed-by: Eric Blake <eblake@redhat.com>
On 30 January 2013 16:53, Paolo Bonzini <pbonzini@redhat.com> wrote: > Fixes the build on Mac OS X, which has ffsl. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Confirmed that this fixes the MacOSX build (though there seems to be some other problem causing us to hang on startup which I need to track down). NB: the comment in host-utils.h reading: #include <string.h> /* ffsl */ would be out of date with this patch. Also xen-all.c and kvm-all.c still use system ffsl() (but I think they have done so for some time so we could leave them be). -- PMM
On 30 January 2013 16:53, Paolo Bonzini <pbonzini@redhat.com> wrote: > + if (!word) { > + return 0; > + } If we're specifically defining the behaviour for zero input we need to fix the API documentation comment which currently says this case is undefined and the caller should be checking. > -#if LONG_MAX > 0x7FFFFFFF > - if ((word & 0xffffffff) == 0) { > - num += 32; > - word >>= 32; > - } > +#if QEMU_GNUC_PREREQ(3, 4) > + return __builtin_ctzl(word) + 1; > +#else > + if (sizeof(long) == 4) { > + return ctz32(word) + 1; > + } else if (sizeof(long) == 8) { > + return ctz64(word) + 1; > + } else { > + abort(); > + } > #endif > - if ((word & 0xffff) == 0) { > - num += 16; > - word >>= 16; > - } > - if ((word & 0xff) == 0) { > - num += 8; > - word >>= 8; > - } > - if ((word & 0xf) == 0) { > - num += 4; > - word >>= 4; > - } > - if ((word & 0x3) == 0) { > - num += 2; > - word >>= 2; > - } > - if ((word & 0x1) == 0) { > - num += 1; > - } > - return num; > } This reimplementation appears to have an off by one error. For example, on an input of 4, the old algorithm returns 2 and this one returns 3. (this is what was causing my test case on MacOS to hang.) -- PMM
On 01/30/2013 01:41 PM, Peter Maydell wrote: >> -#if LONG_MAX > 0x7FFFFFFF >> - if ((word & 0xffffffff) == 0) { >> - num += 32; >> - word >>= 32; >> - } >> - if ((word & 0xffff) == 0) { >> - num += 16; >> - word >>= 16; >> - } >> - if ((word & 0xff) == 0) { >> - num += 8; >> - word >>= 8; >> - } >> - if ((word & 0xf) == 0) { >> - num += 4; >> - word >>= 4; >> - } >> - if ((word & 0x3) == 0) { >> - num += 2; >> - word >>= 2; >> - } >> - if ((word & 0x1) == 0) { >> - num += 1; >> - } >> - return num; >> } > > This reimplementation appears to have an off by one error. > For example, on an input of 4, the old algorithm returns 2 > and this one returns 3. Ouch - you are right that the old implementation is indeed a ctz() instead of an ffs() operation. That means we need to fix all callers of bitops_ffsl() to either use ffsl semantics, or to add a bitops_ctzl() that provides the old behavior.
On 30 January 2013 21:51, Eric Blake <eblake@redhat.com> wrote: > On 01/30/2013 01:41 PM, Peter Maydell wrote: >> This reimplementation appears to have an off by one error. >> For example, on an input of 4, the old algorithm returns 2 >> and this one returns 3. > > Ouch - you are right that the old implementation is indeed a ctz() > instead of an ffs() operation. That means we need to fix all callers of > bitops_ffsl() to either use ffsl semantics, or to add a bitops_ctzl() > that provides the old behavior. Mmm, it's all a bit confusing. My suggestion to try to clean this up in the longer term (ie not necessarily for 1.4) is: * make sure our function names and semantics line up with the gcc builtin definitions (adjusting callers as needed) * put the bit operations all in bitops.h rather than split between there and host-utils.h -- PMM
diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h index 74e14e5..a59bfdc 100644 --- a/include/qemu/bitops.h +++ b/include/qemu/bitops.h @@ -13,6 +13,7 @@ #define BITOPS_H #include "qemu-common.h" +#include "host-utils.h" #define BITS_PER_BYTE CHAR_BIT #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE) @@ -30,34 +31,21 @@ */ static unsigned long bitops_ffsl(unsigned long word) { - int num = 0; + if (!word) { + return 0; + } -#if LONG_MAX > 0x7FFFFFFF - if ((word & 0xffffffff) == 0) { - num += 32; - word >>= 32; - } +#if QEMU_GNUC_PREREQ(3, 4) + return __builtin_ctzl(word) + 1; +#else + if (sizeof(long) == 4) { + return ctz32(word) + 1; + } else if (sizeof(long) == 8) { + return ctz64(word) + 1; + } else { + abort(); + } #endif - if ((word & 0xffff) == 0) { - num += 16; - word >>= 16; - } - if ((word & 0xff) == 0) { - num += 8; - word >>= 8; - } - if ((word & 0xf) == 0) { - num += 4; - word >>= 4; - } - if ((word & 0x3) == 0) { - num += 2; - word >>= 2; - } - if ((word & 0x1) == 0) { - num += 1; - } - return num; } /** diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h index 73f5d1d..abad209 100644 --- a/include/qemu/hbitmap.h +++ b/include/qemu/hbitmap.h @@ -170,7 +170,7 @@ static inline int64_t hbitmap_iter_next(HBitmapIter *hbi) /* The next call will resume work from the next bit. */ hbi->cur[HBITMAP_LEVELS - 1] = cur & (cur - 1); - item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + ffsl(cur) - 1; + item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + bitops_ffsl(cur) - 1; return item << hbi->granularity; } diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h index 2a32be4..40457bd 100644 --- a/include/qemu/host-utils.h +++ b/include/qemu/host-utils.h @@ -238,29 +238,4 @@ static inline int ctpop64(uint64_t val) #endif } -/* glibc does not provide an inline version of ffsl, so always define - * ours. We need to give it a different name, however. - */ -#ifdef __GLIBC__ -#define ffsl qemu_ffsl -#endif -static inline int ffsl(long val) -{ - if (!val) { - return 0; - } - -#if QEMU_GNUC_PREREQ(3, 4) - return __builtin_ctzl(val) + 1; -#else - if (sizeof(long) == 4) { - return ctz32(val) + 1; - } else if (sizeof(long) == 8) { - return ctz64(val) + 1; - } else { - abort(); - } -#endif -} - #endif diff --git a/util/hbitmap.c b/util/hbitmap.c index 2aa487d..32c3d59 100644 --- a/util/hbitmap.c +++ b/util/hbitmap.c @@ -126,7 +126,7 @@ unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi) * The index of this word's least significant set bit provides * the low-order bits. */ - pos = (pos << BITS_PER_LEVEL) + ffsl(cur) - 1; + pos = (pos << BITS_PER_LEVEL) + bitops_ffsl(cur) - 1; hbi->cur[i] = cur & (cur - 1); /* Set up next level for iteration. */
Fixes the build on Mac OS X, which has ffsl. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- include/qemu/bitops.h | 40 ++++++++++++++-------------------------- include/qemu/hbitmap.h | 2 +- include/qemu/host-utils.h | 25 ------------------------- util/hbitmap.c | 2 +- 4 files changed, 16 insertions(+), 53 deletions(-)