Patchwork bitops: unify bitops_ffsl with the one in host-utils.h

login
register
mail settings
Submitter Paolo Bonzini
Date Jan. 30, 2013, 4:53 p.m.
Message ID <1359564804-20398-1-git-send-email-pbonzini@redhat.com>
Download mbox | patch
Permalink /patch/216942/
State New
Headers show

Comments

Paolo Bonzini - Jan. 30, 2013, 4:53 p.m.
Fixes the build on Mac OS X, which has ffsl.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/bitops.h     | 40 ++++++++++++++--------------------------
 include/qemu/hbitmap.h    |  2 +-
 include/qemu/host-utils.h | 25 -------------------------
 util/hbitmap.c            |  2 +-
 4 files changed, 16 insertions(+), 53 deletions(-)
Stefan Weil - Jan. 30, 2013, 5:36 p.m.
Am 30.01.2013 17:53, schrieb Paolo Bonzini:
> Fixes the build on Mac OS X, which has ffsl.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/bitops.h     | 40 ++++++++++++++--------------------------
>  include/qemu/hbitmap.h    |  2 +-
>  include/qemu/host-utils.h | 25 -------------------------
>  util/hbitmap.c            |  2 +-
>  4 files changed, 16 insertions(+), 53 deletions(-)
>

This patch also fixes MinGW / MinGW-w64 builds which
don't have a prototype declaration for ffsl.

Tested-by: Stefan Weil <sw@weilnetz.de>

Nevertheless I wonder why you don't use gcc's __builtin_ffsl.

Wouldn't it be easier to call ffsl and rely on the compiler to
provide inline code? Then adding the missing declaration to
include/sysemu/os-win32.h would be sufficient (for MinGW).

Stefan
Eric Blake - Jan. 30, 2013, 5:38 p.m.
On 01/30/2013 09:53 AM, Paolo Bonzini wrote:
> Fixes the build on Mac OS X, which has ffsl.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/bitops.h     | 40 ++++++++++++++--------------------------
>  include/qemu/hbitmap.h    |  2 +-
>  include/qemu/host-utils.h | 25 -------------------------
>  util/hbitmap.c            |  2 +-
>  4 files changed, 16 insertions(+), 53 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>
Peter Maydell - Jan. 30, 2013, 6:34 p.m.
On 30 January 2013 16:53, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Fixes the build on Mac OS X, which has ffsl.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Confirmed that this fixes the MacOSX build (though there seems
to be some other problem causing us to hang on startup which I need
to track down).

NB: the comment in host-utils.h reading:
 #include <string.h>     /* ffsl */
would be out of date with this patch.
Also xen-all.c and kvm-all.c still use system ffsl() (but I think
they have done so for some time so we could leave them be).

-- PMM
Peter Maydell - Jan. 30, 2013, 8:41 p.m.
On 30 January 2013 16:53, Paolo Bonzini <pbonzini@redhat.com> wrote:
> +    if (!word) {
> +        return 0;
> +    }

If we're specifically defining the behaviour for zero input we
need to fix the API documentation comment which currently says
this case is undefined and the caller should be checking.

> -#if LONG_MAX > 0x7FFFFFFF
> -       if ((word & 0xffffffff) == 0) {
> -               num += 32;
> -               word >>= 32;
> -       }
> +#if QEMU_GNUC_PREREQ(3, 4)
> +    return __builtin_ctzl(word) + 1;
> +#else
> +    if (sizeof(long) == 4) {
> +        return ctz32(word) + 1;
> +    } else if (sizeof(long) == 8) {
> +        return ctz64(word) + 1;
> +    } else {
> +        abort();
> +    }
>  #endif
> -       if ((word & 0xffff) == 0) {
> -               num += 16;
> -               word >>= 16;
> -       }
> -       if ((word & 0xff) == 0) {
> -               num += 8;
> -               word >>= 8;
> -       }
> -       if ((word & 0xf) == 0) {
> -               num += 4;
> -               word >>= 4;
> -       }
> -       if ((word & 0x3) == 0) {
> -               num += 2;
> -               word >>= 2;
> -       }
> -       if ((word & 0x1) == 0) {
> -               num += 1;
> -        }
> -       return num;
>  }

This reimplementation appears to have an off by one error.
For example, on an input of 4, the old algorithm returns 2
and this one returns 3.

(this is what was causing my test case on MacOS to hang.)

-- PMM
Eric Blake - Jan. 30, 2013, 9:51 p.m.
On 01/30/2013 01:41 PM, Peter Maydell wrote:

>> -#if LONG_MAX > 0x7FFFFFFF
>> -       if ((word & 0xffffffff) == 0) {
>> -               num += 32;
>> -               word >>= 32;
>> -       }

>> -       if ((word & 0xffff) == 0) {
>> -               num += 16;
>> -               word >>= 16;
>> -       }
>> -       if ((word & 0xff) == 0) {
>> -               num += 8;
>> -               word >>= 8;
>> -       }
>> -       if ((word & 0xf) == 0) {
>> -               num += 4;
>> -               word >>= 4;
>> -       }
>> -       if ((word & 0x3) == 0) {
>> -               num += 2;
>> -               word >>= 2;
>> -       }
>> -       if ((word & 0x1) == 0) {
>> -               num += 1;
>> -        }
>> -       return num;
>>  }
> 
> This reimplementation appears to have an off by one error.
> For example, on an input of 4, the old algorithm returns 2
> and this one returns 3.

Ouch - you are right that the old implementation is indeed a ctz()
instead of an ffs() operation.  That means we need to fix all callers of
bitops_ffsl() to either use ffsl semantics, or to add a bitops_ctzl()
that provides the old behavior.
Peter Maydell - Jan. 30, 2013, 9:58 p.m.
On 30 January 2013 21:51, Eric Blake <eblake@redhat.com> wrote:
> On 01/30/2013 01:41 PM, Peter Maydell wrote:
>> This reimplementation appears to have an off by one error.
>> For example, on an input of 4, the old algorithm returns 2
>> and this one returns 3.
>
> Ouch - you are right that the old implementation is indeed a ctz()
> instead of an ffs() operation.  That means we need to fix all callers of
> bitops_ffsl() to either use ffsl semantics, or to add a bitops_ctzl()
> that provides the old behavior.

Mmm, it's all a bit confusing. My suggestion to try to
clean this up in the longer term (ie not necessarily for
1.4) is:
 * make sure our function names and semantics line up with
   the gcc builtin definitions (adjusting callers as needed)
 * put the bit operations all in bitops.h rather than split
   between there and host-utils.h

-- PMM

Patch

diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index 74e14e5..a59bfdc 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -13,6 +13,7 @@ 
 #define BITOPS_H
 
 #include "qemu-common.h"
+#include "host-utils.h"
 
 #define BITS_PER_BYTE           CHAR_BIT
 #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
@@ -30,34 +31,21 @@ 
  */
 static unsigned long bitops_ffsl(unsigned long word)
 {
-	int num = 0;
+    if (!word) {
+        return 0;
+    }
 
-#if LONG_MAX > 0x7FFFFFFF
-	if ((word & 0xffffffff) == 0) {
-		num += 32;
-		word >>= 32;
-	}
+#if QEMU_GNUC_PREREQ(3, 4)
+    return __builtin_ctzl(word) + 1;
+#else
+    if (sizeof(long) == 4) {
+        return ctz32(word) + 1;
+    } else if (sizeof(long) == 8) {
+        return ctz64(word) + 1;
+    } else {
+        abort();
+    }
 #endif
-	if ((word & 0xffff) == 0) {
-		num += 16;
-		word >>= 16;
-	}
-	if ((word & 0xff) == 0) {
-		num += 8;
-		word >>= 8;
-	}
-	if ((word & 0xf) == 0) {
-		num += 4;
-		word >>= 4;
-	}
-	if ((word & 0x3) == 0) {
-		num += 2;
-		word >>= 2;
-	}
-	if ((word & 0x1) == 0) {
-		num += 1;
-        }
-	return num;
 }
 
 /**
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 73f5d1d..abad209 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -170,7 +170,7 @@  static inline int64_t hbitmap_iter_next(HBitmapIter *hbi)
 
     /* The next call will resume work from the next bit.  */
     hbi->cur[HBITMAP_LEVELS - 1] = cur & (cur - 1);
-    item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + ffsl(cur) - 1;
+    item = ((uint64_t)hbi->pos << BITS_PER_LEVEL) + bitops_ffsl(cur) - 1;
 
     return item << hbi->granularity;
 }
diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 2a32be4..40457bd 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -238,29 +238,4 @@  static inline int ctpop64(uint64_t val)
 #endif
 }
 
-/* glibc does not provide an inline version of ffsl, so always define
- * ours.  We need to give it a different name, however.
- */
-#ifdef __GLIBC__
-#define ffsl qemu_ffsl
-#endif
-static inline int ffsl(long val)
-{
-    if (!val) {
-        return 0;
-    }
-
-#if QEMU_GNUC_PREREQ(3, 4)
-    return __builtin_ctzl(val) + 1;
-#else
-    if (sizeof(long) == 4) {
-        return ctz32(val) + 1;
-    } else if (sizeof(long) == 8) {
-        return ctz64(val) + 1;
-    } else {
-        abort();
-    }
-#endif
-}
-
 #endif
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 2aa487d..32c3d59 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -126,7 +126,7 @@  unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi)
          * The index of this word's least significant set bit provides
          * the low-order bits.
          */
-        pos = (pos << BITS_PER_LEVEL) + ffsl(cur) - 1;
+        pos = (pos << BITS_PER_LEVEL) + bitops_ffsl(cur) - 1;
         hbi->cur[i] = cur & (cur - 1);
 
         /* Set up next level for iteration.  */