diff mbox series

[RFCv1,02/72] gen_bitmaps: Fix ext2fs_compare_generic_bmap/bitmap logic

Message ID da2a28305637aef648846f9bf75d269c0f7c6e57.1667822611.git.ritesh.list@gmail.com
State Under Review
Delegated to: Theodore Ts'o
Headers show
Series e2fsprogs: Parallel fsck support | expand

Commit Message

Ritesh Harjani (IBM) Nov. 7, 2022, 12:20 p.m. UTC
Currently this function was not correctly comparing against the right
length of the bitmap. Also when we compare bitarray v/s rbtree bitmap
the value returned by ext2fs_test_generic_bmap() could be different in
these two implementations. Hence only check against boolean value.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 lib/ext2fs/gen_bitmap.c   |  9 ++++++---
 lib/ext2fs/gen_bitmap64.c | 10 +++++++---
 2 files changed, 13 insertions(+), 6 deletions(-)

Comments

Andreas Dilger Nov. 23, 2022, 5:04 a.m. UTC | #1
On Nov 7, 2022, at 5:20 AM, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> 
> Currently this function was not correctly comparing against the right
> length of the bitmap. Also when we compare bitarray v/s rbtree bitmap
> the value returned by ext2fs_test_generic_bmap() could be different in
> these two implementations. Hence only check against boolean value.
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
> lib/ext2fs/gen_bitmap.c   |  9 ++++++---
> lib/ext2fs/gen_bitmap64.c | 10 +++++++---
> 2 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/ext2fs/gen_bitmap.c b/lib/ext2fs/gen_bitmap.c
> index 1536d4b3..f7764fca 100644
> --- a/lib/ext2fs/gen_bitmap.c
> +++ b/lib/ext2fs/gen_bitmap.c
> @@ -385,10 +385,13 @@ errcode_t ext2fs_compare_generic_bitmap(errcode_t magic, errcode_t neq,

        if ((bm1->start != bm2->start) ||
            (bm1->end != bm2->end) ||
            (memcmp(bm1->bitmap, bm2->bitmap,
> 		    (size_t) (bm1->end - bm1->start)/8)))
> 		return neq;
> 
> -	for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++)

On first review it appears that this is only comparing at most the last 7 bits
before "end", which appears to be wrong.  However, if you include the context
earlier in the function (which I've added above), it is using memcmp() to compare
the majority of the bitmaps in byte-wise order, so this is only needs to compare
the remaining bits, and doing the full bitwise scan is much less efficient.

I was going to say that the "start" value needs to be added to the bitmap offset
for the comparison, but I don't think that is correct either.  Looking at the
code (and thinking about it some more), it doesn't make sense to allocate space
for bits before "start", so this must be a virtual offset, and the *actual* bits
are always starting at "bitmap", so the above code is correct, AFAICS.

In summary, I don't think this patch will introduce a visible defect, but it
makes the comparisons orders of magnitude less efficient when comparing each
bit individually instead of using memcmp() that is likely hardware accelerated.

It may well be that the calling convention of this code is such that it is
always called with end=byte-aligned value so this code is never used, but that
doesn't mean it shouldn't be coded properly, in case that isn't true in the future.

> -		if (ext2fs_fast_test_block_bitmap(gen_bm1, i) !=
> -		    ext2fs_fast_test_block_bitmap(gen_bm2, i))
> +	for (i = bm1->start; i <= bm1->end; i++) {
> +		int ret1, ret2;
> +		ret1 = !!ext2fs_fast_test_block_bitmap(gen_bm1, i);
> +		ret2 = !!ext2fs_fast_test_block_bitmap(gen_bm2, i);

Strictly speaking, the !! here is not needed, and ! is enough for comparing
the inequality of the two values.  However, this is only used for 0-7 bits
and isn't performance critical since the memcmp() does the heavy lifting.

> +		if (ret1 != ret2)
> 			return neq;
> +	}
> 
> 	return 0;
> }
> diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c
> index c860c10e..f7710afd 100644
> --- a/lib/ext2fs/gen_bitmap64.c
> +++ b/lib/ext2fs/gen_bitmap64.c
> @@ -629,10 +629,14 @@ errcode_t ext2fs_compare_generic_bmap(errcode_t neq,
> 	    (bm1->end != bm2->end))

Conversely, *this* version of the function is *not* doing the memcmp() of
the bulk of the bitmap contents, so it would appear to have a bug that the
patch fixes, but in a very slow manner.  It would be better to use memcmp().

> 		return neq;
> 
> -	for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++)
> -		if (ext2fs_test_generic_bmap(gen_bm1, i) !=
> -		    ext2fs_test_generic_bmap(gen_bm2, i))
> +	for (i = bm1->start; i < bm1->end; i++) {
> +		int ret1, ret2;
> +		ret1 = !!ext2fs_test_generic_bmap(gen_bm1, i);
> +		ret2 = !!ext2fs_test_generic_bmap(gen_bm2, i);

Cheers, Andreas
Theodore Ts'o Jan. 24, 2023, 4:59 p.m. UTC | #2
On Tue, Nov 22, 2022 at 10:04:58PM -0700, Andreas Dilger wrote:
> On Nov 7, 2022, at 5:20 AM, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> > 
> > Currently this function was not correctly comparing against the right
> > length of the bitmap. Also when we compare bitarray v/s rbtree bitmap
> > the value returned by ext2fs_test_generic_bmap() could be different in
> > these two implementations. Hence only check against boolean value.
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > ---
> > lib/ext2fs/gen_bitmap.c   |  9 ++++++---

The gen_bitmap.c file supports only the original 32-bit bitmaps, so
there is not rbtree bitmap.  That's why using memcmp is safe, and as
near as I can tell, no changes are needed for the
lib/ext2fs/gen_bitmap.c

> > diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c
> > index c860c10e..f7710afd 100644
> > --- a/lib/ext2fs/gen_bitmap64.c
> > +++ b/lib/ext2fs/gen_bitmap64.c
> > @@ -629,10 +629,14 @@ errcode_t ext2fs_compare_generic_bmap(errcode_t neq,
> > 	    (bm1->end != bm2->end))
> 
> Conversely, *this* version of the function is *not* doing the memcmp() of
> the bulk of the bitmap contents, so it would appear to have a bug that the
> patch fixes, but in a very slow manner.  It would be better to use memcmp().

This is where we have a problem, and the reason why we can't use
memcmp is because in the case where the bitmap is really encoded using
an rbtree, a memcmp isn't applicable.

We *could* extract the bitmap into a bitarray, but that would require
allocating memory, and in the case of a very large file system/bitmap,
especially when running on a 32-bit NAS box, we might not be able to
*afford* to allocate that much memory (or it might not even be
addressable :-).

So yes, we need a fix here, and yes, comparing bit by bit is very
slow.  Fortunately, from what I can tell, no one is actually calling
this function, which is why no once noticed that the 64-bit compare
bitmap fucntion was totally hosed.

So applying Ritesh's fix here makes sense, since it at least fixes the
obvious problems.  In the long term, we should probably add proper
unit regression tests so can test whether or not this function is
actually working correctly, and if we *want* a faster version of it,
we could do something faster in the case where both bitmaps are the
same type (e.g., both bitarrays or both rbtrees), and if they could do
something where we look for "runs" where all of the bits are all set
or not set, and then compare that against the other bitmap, etc.  But
given that we don't have anyusers of this function any more (we used
to, but it disappeared when we optimzied e2fsck's pass5
implementation), that's probably not an urgent fix.

						- Ted
diff mbox series

Patch

diff --git a/lib/ext2fs/gen_bitmap.c b/lib/ext2fs/gen_bitmap.c
index 1536d4b3..f7764fca 100644
--- a/lib/ext2fs/gen_bitmap.c
+++ b/lib/ext2fs/gen_bitmap.c
@@ -385,10 +385,13 @@  errcode_t ext2fs_compare_generic_bitmap(errcode_t magic, errcode_t neq,
 		    (size_t) (bm1->end - bm1->start)/8)))
 		return neq;
 
-	for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++)
-		if (ext2fs_fast_test_block_bitmap(gen_bm1, i) !=
-		    ext2fs_fast_test_block_bitmap(gen_bm2, i))
+	for (i = bm1->start; i <= bm1->end; i++) {
+		int ret1, ret2;
+		ret1 = !!ext2fs_fast_test_block_bitmap(gen_bm1, i);
+		ret2 = !!ext2fs_fast_test_block_bitmap(gen_bm2, i);
+		if (ret1 != ret2)
 			return neq;
+	}
 
 	return 0;
 }
diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c
index c860c10e..f7710afd 100644
--- a/lib/ext2fs/gen_bitmap64.c
+++ b/lib/ext2fs/gen_bitmap64.c
@@ -629,10 +629,14 @@  errcode_t ext2fs_compare_generic_bmap(errcode_t neq,
 	    (bm1->end != bm2->end))
 		return neq;
 
-	for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++)
-		if (ext2fs_test_generic_bmap(gen_bm1, i) !=
-		    ext2fs_test_generic_bmap(gen_bm2, i))
+	for (i = bm1->start; i < bm1->end; i++) {
+		int ret1, ret2;
+		ret1 = !!ext2fs_test_generic_bmap(gen_bm1, i);
+		ret2 = !!ext2fs_test_generic_bmap(gen_bm2, i);
+		if (ret1 != ret2) {
 			return neq;
+		}
+	}
 
 	return 0;
 }