Patchwork [PATCHv3,4/9] bitops: use vector algorithm to optimize find_next_bit()

Submitter Peter Lieven
Date March 21, 2013, 3:57 p.m.
Message ID <1363881457-14814-5-git-send-email-pl@kamp.de>
Permalink /patch/229740/
State New

Comments

Peter Lieven - March 21, 2013, 3:57 p.m.
This patch makes find_next_bit() use buffer_find_nonzero_offset()
to skip large areas of zeroes.

Compared to the loop unrolling presented in an earlier
patch, this adds another 50% performance benefit when
skipping large areas of zeroes. Loop unrolling alone
added close to 100% speedup.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 util/bitops.c |   22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)
Eric Blake - March 21, 2013, 7:18 p.m.
On 03/21/2013 09:57 AM, Peter Lieven wrote:
> this patch adds the usage of buffer_find_nonzero_offset()
> to skip large areas of zeroes.
> 
> compared to loop unrolling presented in an earlier
> patch this adds another 50% performance benefit for
> skipping large areas of zeroes. loop unrolling alone
> added close to 100% speedup.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  util/bitops.c |   22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

Patch

diff --git a/util/bitops.c b/util/bitops.c
index e72237a..8ea79ae 100644
--- a/util/bitops.c
+++ b/util/bitops.c
@@ -42,10 +42,26 @@  unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
         size -= BITS_PER_LONG;
         result += BITS_PER_LONG;
     }
-    while (size & ~(BITS_PER_LONG-1)) {
-        if ((tmp = *(p++))) {
-            goto found_middle;
+    while (size >= BITS_PER_LONG) {
+        if ((tmp = *p)) {
+            goto found_middle;
+        }
+        if (can_use_buffer_find_nonzero_offset(p, size / BITS_PER_BYTE)) {
+            size_t tmp2 =
+                buffer_find_nonzero_offset(p, size / BITS_PER_BYTE);
+            result += tmp2 * BITS_PER_BYTE;
+            size -= tmp2 * BITS_PER_BYTE;
+            p += tmp2 / sizeof(unsigned long);
+            if (!size) {
+                return result;
+            }
+            if (tmp2) {
+                if ((tmp = *p)) {
+                    goto found_middle;
+                }
+            }
         }
+        p++;
         result += BITS_PER_LONG;
         size -= BITS_PER_LONG;
     }
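
The unit handling in the hunk above (a byte offset converted to bits for `result` and `size`, and to words for `p`) can be illustrated with a minimal, self-contained sketch. This is not the QEMU implementation: `first_nonzero_chunk()` is a hypothetical scalar stand-in for `buffer_find_nonzero_offset()`, assumed here to return the byte offset of the first chunk that contains a non-zero word, `CHUNK_WORDS` is an arbitrary chunk size chosen for illustration, and `sketch_find_next_bit()` scans bit by bit outside the skipped area purely for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_BYTE CHAR_BIT
#define BITS_PER_LONG (sizeof(unsigned long) * BITS_PER_BYTE)
#define CHUNK_WORDS 8 /* hypothetical chunk size, in unsigned longs */

/*
 * Hypothetical stand-in for buffer_find_nonzero_offset(): returns the byte
 * offset of the first chunk containing a non-zero word, or len if all
 * chunks are zero. len must be a multiple of the chunk size in bytes.
 */
static size_t first_nonzero_chunk(const unsigned long *p, size_t len)
{
    size_t words = len / sizeof(unsigned long);

    assert(len % (CHUNK_WORDS * sizeof(unsigned long)) == 0);
    for (size_t i = 0; i < words; i += CHUNK_WORDS) {
        unsigned long acc = 0;
        for (size_t j = 0; j < CHUNK_WORDS; j++) {
            acc |= p[i + j]; /* OR the whole chunk, test once */
        }
        if (acc) {
            return i * sizeof(unsigned long);
        }
    }
    return len;
}

/* Index of the first set bit at or after offset, or size if there is none. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    const size_t chunk_bits = CHUNK_WORDS * BITS_PER_LONG;
    unsigned long bit = offset;

    while (bit < size) {
        /* On a chunk boundary with whole chunks left: skip zero chunks. */
        if (bit % chunk_bits == 0 && size - bit >= chunk_bits) {
            size_t len = (size - bit) / chunk_bits * chunk_bits / BITS_PER_BYTE;
            size_t skipped =
                first_nonzero_chunk(addr + bit / BITS_PER_LONG, len);

            bit += skipped * BITS_PER_BYTE; /* bytes -> bits */
            if (bit >= size) {
                return size;
            }
        }
        if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG))) {
            return bit;
        }
        bit++;
    }
    return size;
}
```

With a 20-word bitmap whose only set bit is bit 3 of word 17, a search from offset 0 skips the first two all-zero chunks in one call and then finds the bit in the remaining tail. A vectorized helper would replace the inner OR loop; the surrounding offset arithmetic stays the same.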