Message ID | 20190413215049.15305-1-delroth@gmail.com |
---|---|
State | Accepted |
Commit | 633967f9818cb6a0e87ffa8cba33148a5bcc6edb |
Delegated to: | Tom Rini |
Headers | show |
Series | [U-Boot] fs: btrfs: fix false negatives in ROOT_ITEM search | expand |
On Sat, Apr 13, 2019 at 11:50 PM Pierre Bourdon <delroth@gmail.com> wrote: > > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" > value. To perform these searches using only two items from the key, the > btrfs driver uses a special "btrfs_search_tree_key_type" function. > > The algorithm used by that function to transform a 3-tuple search into a > 2-tuple search was subtly broken, leading to items not being found if > they were the first in their tree node. > > This commit fixes btrfs_search_tree_key_type to properly behave in these > situations. A more practical example in case the explanation isn't clear: root tree node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0 key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246 key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246 key (344 ROOT_ITEM 9422) block 209317888 gen 12115 key (344 ROOT_BACKREF 5) block 226222080 gen 12126 key (368 ROOT_ITEM 11665) block 114966528 gen 12127 key (375 ROOT_ITEM 12127) block 122949632 gen 12246 If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the search key will end up walking to block 114966528 (because (375 ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM -1) will go to the right leaf, return the first item after (375 ROOT_ITEM 12127), and we can then walk back one slot to get the item we wanted.
On Sun, Apr 14, 2019 at 12:05 AM Pierre Bourdon <delroth@gmail.com> wrote: > A more practical example in case the explanation isn't clear: > > root tree > node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE > fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b > chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0 > key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246 > key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246 > key (344 ROOT_ITEM 9422) block 209317888 gen 12115 > key (344 ROOT_BACKREF 5) block 226222080 gen 12126 > key (368 ROOT_ITEM 11665) block 114966528 gen 12127 > key (375 ROOT_ITEM 12127) block 122949632 gen 12246 > > If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the > search key will end up walking to block 114966528 (because (375 > ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM > -1) will go to the right leaf, return the first item after (375 > ROOT_ITEM 12127), and we can then walk back one slot to get the item > we wanted. So turns out there's a very similar bug when iterating DIR_INDEX entries -- but that one isn't using btrfs_search_tree_key_type. It's doing its own thing with offset = 0 to enumerate multiple items of the same oid/type. Looking back at this, I think there's a much better way to fix both issues. Currently, btrfs_search_tree will return an invalid path if it's asked to look for an item between end of leaf N and beginning of leaf N+1. Invalid, as in, accessing the element at this path reads past the end of an array... This seems like a very broken behavior given that callers have no way of knowing the path is invalid without dereferencing it. Instead, btrfs_search_tree should be fixed so that it always returns either: 1. The first element >= searched element. 2. An error if no such element exists. This can be done with a fairly trivial change to btrfs_search_tree: if we're about to return something that's past the end of the leaf (slot >= nritems), call btrfs_next_slot on the path. If btrfs_next_slot works return that, otherwise return an error. I'll send another patch which implements that instead. I've verified it fixes both the root lookup issue I was originally looking for + the DIR_INDEX issue as well. Please disregard the patch I originally sent. -- Pierre Bourdon <delroth@gmail.com> Software Engineer @ Zürich, Switzerland https://delroth.net/
On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote: > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" > value. To perform these searches using only two items from the key, the > btrfs driver uses a special "btrfs_search_tree_key_type" function. > > The algorithm used by that function to transform a 3-tuple search into a > 2-tuple search was subtly broken, leading to items not being found if > they were the first in their tree node. > > This commit fixes btrfs_search_tree_key_type to properly behave in these > situations. > > Signed-off-by: Pierre Bourdon <delroth@gmail.com> > Cc: Marek Behun <marek.behun@nic.cz> Applied to u-boot/master, thanks!
Hi Tom, On Sat, Apr 27, 2019 at 4:46 PM Tom Rini <trini@konsulko.com> wrote: > On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote: > > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" > > value. To perform these searches using only two items from the key, the > > btrfs driver uses a special "btrfs_search_tree_key_type" function. > > > > The algorithm used by that function to transform a 3-tuple search into a > > 2-tuple search was subtly broken, leading to items not being found if > > they were the first in their tree node. > > > > This commit fixes btrfs_search_tree_key_type to properly behave in these > > situations. > > > > Signed-off-by: Pierre Bourdon <delroth@gmail.com> > > Cc: Marek Behun <marek.behun@nic.cz> > > Applied to u-boot/master, thanks! Please disregard this patch as was mentioned in a followup email on the thread. https://patchwork.ozlabs.org/patch/1085991/ implements a better way of addressing the problem and fixes a 2nd btrfs bug that was missed in this first revision. Sorry for the mixup, Best, -- Pierre Bourdon <delroth@gmail.com> Software Engineer @ Zürich, Switzerland https://delroth.net/
On Sat, Apr 27, 2019 at 04:50:18PM +0200, Pierre Bourdon wrote: > Hi Tom, > > On Sat, Apr 27, 2019 at 4:46 PM Tom Rini <trini@konsulko.com> wrote: > > On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote: > > > ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" > > > value. To perform these searches using only two items from the key, the > > > btrfs driver uses a special "btrfs_search_tree_key_type" function. > > > > > > The algorithm used by that function to transform a 3-tuple search into a > > > 2-tuple search was subtly broken, leading to items not being found if > > > they were the first in their tree node. > > > > > > This commit fixes btrfs_search_tree_key_type to properly behave in these > > > situations. > > > > > > Signed-off-by: Pierre Bourdon <delroth@gmail.com> > > > Cc: Marek Behun <marek.behun@nic.cz> > > > > Applied to u-boot/master, thanks! > > Please disregard this patch as was mentioned in a followup email on the thread. > > https://patchwork.ozlabs.org/patch/1085991/ implements a better way of > addressing the problem and fixes a 2nd btrfs bug that was missed in > this first revision. > > Sorry for the mixup, OK, I'll take care of things, thanks!
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 65c152a52f..ca44a2404d 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -292,20 +292,52 @@ btrfs_search_tree_key_type(const struct btrfs_root *root, u64 objectid, { struct btrfs_key key, *res; + /* + * In some cases (e.g. tree roots), we need to look for a given + * objectid and type without knowing the offset value (3rd element of a + * btrfs tree node key). We can rely on the fact that btrfs_search_tree + * returns the first element with key >= search_key, and then perform + * our own comparison between the returned element and the search key. + * + * It is tempting to use a search key with offset 0 to perform this + * "fuzzy search". This would return the first item with the (objectid, + * type) we're looking for. However, using offset 0 has the wrong + * behavior when the wanted item is the first in a leaf: since our + * search key will be lower than the wanted item, the recursive search + * will explore the wrong branch of the tree. + * + * Instead, use the largest possible offset (-1). The result of this + * search will either be: + * 1. An element with the (objectid, type) we're looking for, if it + * has offset -1 or if it is the last element in its leaf. + * 2. The first element *after* an element with the (objectid, type) + */ key.objectid = objectid; key.type = type; - key.offset = 0; + key.offset = -1; if (btrfs_search_tree(root, &key, path)) return NULL; - res = btrfs_path_leaf_key(path); - if (btrfs_comp_keys_type(&key, res)) { - btrfs_free_path(path); - return NULL; + /* + * Compare with the previous element first -- this is the likely case + * since the result of the search is only what we want if it had offset + * == -1 or if it was last in its leaf. + */ + if (path->slots[0] > 0) { + path->slots[0]--; + res = btrfs_path_leaf_key(path); + if (!btrfs_comp_keys_type(&key, res)) + return res; + path->slots[0]++; } - return res; + res = btrfs_path_leaf_key(path); + if (!btrfs_comp_keys_type(&key, res)) + return res; + + btrfs_free_path(path); + return NULL; } static inline u32 btrfs_path_item_size(struct btrfs_path *p)
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function. The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node. This commit fixes btrfs_search_tree_key_type to properly behave in these situations. Signed-off-by: Pierre Bourdon <delroth@gmail.com> Cc: Marek Behun <marek.behun@nic.cz> --- fs/btrfs/ctree.h | 44 ++++++++++++++++++++++++++++++++++++++------ 1 file changed, 38 insertions(+), 6 deletions(-)