diff mbox series

[U-Boot] fs: btrfs: fix false negatives in ROOT_ITEM search

Message ID 20190413215049.15305-1-delroth@gmail.com
State Accepted
Commit 633967f9818cb6a0e87ffa8cba33148a5bcc6edb
Delegated to: Tom Rini
Headers show
Series [U-Boot] fs: btrfs: fix false negatives in ROOT_ITEM search | expand

Commit Message

Pierre Bourdon April 13, 2019, 9:50 p.m. UTC
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset"
value. To perform these searches using only two items from the key, the
btrfs driver uses a special "btrfs_search_tree_key_type" function.

The algorithm used by that function to transform a 3-tuple search into a
2-tuple search was subtly broken, leading to items not being found if
they were the first in their tree node.

This commit fixes btrfs_search_tree_key_type to properly behave in these
situations.

Signed-off-by: Pierre Bourdon <delroth@gmail.com>
Cc: Marek Behun <marek.behun@nic.cz>
---
 fs/btrfs/ctree.h | 44 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 6 deletions(-)

Comments

Pierre Bourdon April 13, 2019, 10:05 p.m. UTC | #1
On Sat, Apr 13, 2019 at 11:50 PM Pierre Bourdon <delroth@gmail.com> wrote:
>
> ROOT_ITEMs in btrfs are referenced without knowing their actual "offset"
> value. To perform these searches using only two items from the key, the
> btrfs driver uses a special "btrfs_search_tree_key_type" function.
>
> The algorithm used by that function to transform a 3-tuple search into a
> 2-tuple search was subtly broken, leading to items not being found if
> they were the first in their tree node.
>
> This commit fixes btrfs_search_tree_key_type to properly behave in these
> situations.

A more practical example in case the explanation isn't clear:

root tree
node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE
fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b
chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0
        key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246
        key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246
        key (344 ROOT_ITEM 9422) block 209317888 gen 12115
        key (344 ROOT_BACKREF 5) block 226222080 gen 12126
        key (368 ROOT_ITEM 11665) block 114966528 gen 12127
        key (375 ROOT_ITEM 12127) block 122949632 gen 12246

If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the
search key will end up walking to block 114966528 (because (375
ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM
-1) will go to the right leaf, return the first item after (375
ROOT_ITEM 12127), and we can then walk back one slot to get the item
we wanted.
Pierre Bourdon April 15, 2019, 1:20 a.m. UTC | #2
On Sun, Apr 14, 2019 at 12:05 AM Pierre Bourdon <delroth@gmail.com> wrote:
> A more practical example in case the explanation isn't clear:
>
> root tree
> node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE
> fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b
> chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0
>         key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246
>         key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246
>         key (344 ROOT_ITEM 9422) block 209317888 gen 12115
>         key (344 ROOT_BACKREF 5) block 226222080 gen 12126
>         key (368 ROOT_ITEM 11665) block 114966528 gen 12127
>         key (375 ROOT_ITEM 12127) block 122949632 gen 12246
>
> If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the
> search key will end up walking to block 114966528 (because (375
> ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM
> -1) will go to the right leaf, return the first item after (375
> ROOT_ITEM 12127), and we can then walk back one slot to get the item
> we wanted.

So turns out there's a very similar bug when iterating DIR_INDEX
entries -- but that one isn't using btrfs_search_tree_key_type. It's
doing its own thing with offset = 0 to enumerate multiple items of the
same oid/type.

Looking back at this, I think there's a much better way to fix both
issues. Currently, btrfs_search_tree will return an invalid path if
it's asked to look for an item between end of leaf N and beginning of
leaf N+1. Invalid, as in, accessing the element at this path reads
past the end of an array... This seems like a very broken behavior
given that callers have no way of knowing the path is invalid without
dereferencing it.

Instead, btrfs_search_tree should be fixed so that it always returns either:
1. The first element >= searched element.
2. An error if no such element exists.

This can be done with a fairly trivial change to btrfs_search_tree: if
we're about to return something that's past the end of the leaf (slot
>= nritems), call btrfs_next_slot on the path. If btrfs_next_slot
works return that, otherwise return an error.

I'll send another patch which implements that instead. I've verified
it fixes both the root lookup issue I was originally looking for + the
DIR_INDEX issue as well. Please disregard the patch I originally sent.

--
Pierre Bourdon <delroth@gmail.com>
Software Engineer @ Zürich, Switzerland
https://delroth.net/
Tom Rini April 27, 2019, 2:46 p.m. UTC | #3
On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:

> ROOT_ITEMs in btrfs are referenced without knowing their actual "offset"
> value. To perform these searches using only two items from the key, the
> btrfs driver uses a special "btrfs_search_tree_key_type" function.
> 
> The algorithm used by that function to transform a 3-tuple search into a
> 2-tuple search was subtly broken, leading to items not being found if
> they were the first in their tree node.
> 
> This commit fixes btrfs_search_tree_key_type to properly behave in these
> situations.
> 
> Signed-off-by: Pierre Bourdon <delroth@gmail.com>
> Cc: Marek Behun <marek.behun@nic.cz>

Applied to u-boot/master, thanks!
Pierre Bourdon April 27, 2019, 2:50 p.m. UTC | #4
Hi Tom,

On Sat, Apr 27, 2019 at 4:46 PM Tom Rini <trini@konsulko.com> wrote:
> On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:
> > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset"
> > value. To perform these searches using only two items from the key, the
> > btrfs driver uses a special "btrfs_search_tree_key_type" function.
> >
> > The algorithm used by that function to transform a 3-tuple search into a
> > 2-tuple search was subtly broken, leading to items not being found if
> > they were the first in their tree node.
> >
> > This commit fixes btrfs_search_tree_key_type to properly behave in these
> > situations.
> >
> > Signed-off-by: Pierre Bourdon <delroth@gmail.com>
> > Cc: Marek Behun <marek.behun@nic.cz>
>
> Applied to u-boot/master, thanks!

Please disregard this patch as was mentioned in a followup email on the thread.

https://patchwork.ozlabs.org/patch/1085991/ implements a better way of
addressing the problem and fixes a 2nd btrfs bug that was missed in
this first revision.

Sorry for the mixup,
Best,


--
Pierre Bourdon <delroth@gmail.com>
Software Engineer @ Zürich, Switzerland
https://delroth.net/
Tom Rini April 27, 2019, 2:53 p.m. UTC | #5
On Sat, Apr 27, 2019 at 04:50:18PM +0200, Pierre Bourdon wrote:
> Hi Tom,
> 
> On Sat, Apr 27, 2019 at 4:46 PM Tom Rini <trini@konsulko.com> wrote:
> > On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:
> > > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset"
> > > value. To perform these searches using only two items from the key, the
> > > btrfs driver uses a special "btrfs_search_tree_key_type" function.
> > >
> > > The algorithm used by that function to transform a 3-tuple search into a
> > > 2-tuple search was subtly broken, leading to items not being found if
> > > they were the first in their tree node.
> > >
> > > This commit fixes btrfs_search_tree_key_type to properly behave in these
> > > situations.
> > >
> > > Signed-off-by: Pierre Bourdon <delroth@gmail.com>
> > > Cc: Marek Behun <marek.behun@nic.cz>
> >
> > Applied to u-boot/master, thanks!
> 
> Please disregard this patch as was mentioned in a followup email on the thread.
> 
> https://patchwork.ozlabs.org/patch/1085991/ implements a better way of
> addressing the problem and fixes a 2nd btrfs bug that was missed in
> this first revision.
> 
> Sorry for the mixup,

OK, I'll take care of things, thanks!
diff mbox series

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 65c152a52f..ca44a2404d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -292,20 +292,52 @@  btrfs_search_tree_key_type(const struct btrfs_root *root, u64 objectid,
 {
 	struct btrfs_key key, *res;
 
+	/*
+	 * In some cases (e.g. tree roots), we need to look for a given
+	 * objectid and type without knowing the offset value (3rd element of a
+	 * btrfs tree node key). We can rely on the fact that btrfs_search_tree
+	 * returns the first element with key >= search_key, and then perform
+	 * our own comparison between the returned element and the search key.
+	 *
+	 * It is tempting to use a search key with offset 0 to perform this
+	 * "fuzzy search". This would return the first item with the (objectid,
+	 * type) we're looking for. However, using offset 0 has the wrong
+	 * behavior when the wanted item is the first in a leaf: since our
+	 * search key will be lower than the wanted item, the recursive search
+	 * will explore the wrong branch of the tree.
+	 *
+	 * Instead, use the largest possible offset (-1). The result of this
+	 * search will either be:
+	 *   1. An element with the (objectid, type) we're looking for, if it
+	 *      has offset -1 or if it is the last element in its leaf.
+	 *   2. The first element *after* an element with the (objectid, type)
+	 */
 	key.objectid = objectid;
 	key.type = type;
-	key.offset = 0;
+	key.offset = -1;
 
 	if (btrfs_search_tree(root, &key, path))
 		return NULL;
 
-	res = btrfs_path_leaf_key(path);
-	if (btrfs_comp_keys_type(&key, res)) {
-		btrfs_free_path(path);
-		return NULL;
+	/*
+	 * Compare with the previous element first -- this is the likely case
+	 * since the result of the search is only what we want if it had offset
+	 * == -1 or if it was last in its leaf.
+	 */
+	if (path->slots[0] > 0) {
+		path->slots[0]--;
+		res = btrfs_path_leaf_key(path);
+		if (!btrfs_comp_keys_type(&key, res))
+			return res;
+		path->slots[0]++;
 	}
 
-	return res;
+	res = btrfs_path_leaf_key(path);
+	if (!btrfs_comp_keys_type(&key, res))
+		return res;
+
+	btrfs_free_path(path);
+	return NULL;
 }
 
 static inline u32 btrfs_path_item_size(struct btrfs_path *p)