Message ID | 0b8fd2677b797663bfcb97f6aa108193fedf9767.1632909358.git.shreeya.patel@collabora.com |
---|---|
State | New |
Headers | show |
Series | Handle a soft hang and the inconsistent name issue | expand |
Shreeya Patel <shreeya.patel@collabora.com> writes: > There is a soft hang caused by a deadlock in d_alloc_parallel which > waits up on lookups to finish for the dentries in the parent directory's > hash_table. > In case when d_add_ci is called from the fs layer's lookup functions, > the dentry being looked up is already in the hash table (created before > the fs lookup function gets called). We should not be processing the > same dentry that is being looked up, hence, in case of case-insensitive > filesystems we are making it a case-exact match to prevent this from > happening. > > Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> > --- > fs/dcache.c | 20 ++++++++++++++++++-- > 1 file changed, 18 insertions(+), 2 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index cf871a81f4fd..2a28ab64a165 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -2565,6 +2565,15 @@ static void d_wait_lookup(struct dentry *dentry) > } > } > > +static inline bool d_same_exact_name(const struct dentry *dentry, > + const struct dentry *parent, > + const struct qstr *name) > +{ > + if (dentry->d_name.len != name->len) > + return false; > + return dentry_cmp(dentry, name->name, name->len) == 0; > +} I don't like the idea of having a flavor of a dentry comparison function that doesn't invoke d_compare. In particular because d_compare might be used for all sorts of things, and this fix is really specific to the case-insensitive case. Would it be possible to fold this change into generic_ci_d_compare? If we could flag the dentry as part of a parallel lookup under the relevant condition, generic_ci_d_compare could simply return immediately in such case. > + > struct dentry *d_alloc_parallel(struct dentry *parent, > const struct qstr *name, > wait_queue_head_t *wq) > @@ -2575,6 +2584,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, > struct dentry *new = d_alloc(parent, name); > struct dentry *dentry; > unsigned seq, r_seq, d_seq; > + int ci_dir = IS_CASEFOLDED(parent->d_inode); > > if (unlikely(!new)) > return ERR_PTR(-ENOMEM); > @@ -2626,8 +2636,14 @@ struct dentry *d_alloc_parallel(struct dentry *parent, > continue; > if (dentry->d_parent != parent) > continue; > - if (!d_same_name(dentry, parent, name)) > - continue; > + if (ci_dir) { > + if (!d_same_exact_name(dentry, parent, name)) > + continue; > + } else { As is, this is problematic because d_alloc_parallel is also part of the lookup path (see lookup_open, lookup_slow). In those cases, you want to do the CI comparison, to prevent racing two tasks creating a dentry differing only by case. > + if (!d_same_name(dentry, parent, name)) > + continue; > + } > + > hlist_bl_unlock(b); > /* now we can try to grab a reference */ > if (!lockref_get_not_dead(&dentry->d_lockref)) {
On Wed, Sep 29, 2021 at 04:23:38PM +0530, Shreeya Patel wrote: > There is a soft hang caused by a deadlock in d_alloc_parallel which > waits up on lookups to finish for the dentries in the parent directory's > hash_table. > In case when d_add_ci is called from the fs layer's lookup functions, > the dentry being looked up is already in the hash table (created before > the fs lookup function gets called). We should not be processing the > same dentry that is being looked up, hence, in case of case-insensitive > filesystems we are making it a case-exact match to prevent this from > happening. NAK. What you are doing would lead to parallel calls of ->lookup() in the same directory for names that would compare as equal. Which violates all kinds of assumptions in the analysis of dentry tree locking. d_add_ci() is used to force the "exact" spelling of the name on lookup - that's the whole point of that thing. What are you trying to achieve, and what's the point of mixing that with non-trivial ->d_compare()? If it's "force to exact spelling on lookup, avoid calling ->lookup() on aliases", d_add_ci() is simply not a good match.
On Fri, Oct 01, 2021 at 02:35:32PM -0400, Gabriel Krisman Bertazi wrote: > I don't like the idea of having a flavor of a dentry comparison function > that doesn't invoke d_compare. In particular because d_compare might be > used for all sorts of things, and this fix is really specific to the > case-insensitive case. > > Would it be possible to fold this change into generic_ci_d_compare? If > we could flag the dentry as part of a parallel lookup under the relevant > condition, generic_ci_d_compare could simply return immediately in > such case. Not really - if anything, that's a property of d_alloc_parallel() call done by d_add_ci(), not that of any of the dentries involved...
On 03/10/21 7:08 pm, Al Viro wrote: > On Wed, Sep 29, 2021 at 04:23:38PM +0530, Shreeya Patel wrote: >> There is a soft hang caused by a deadlock in d_alloc_parallel which >> waits up on lookups to finish for the dentries in the parent directory's >> hash_table. >> In case when d_add_ci is called from the fs layer's lookup functions, >> the dentry being looked up is already in the hash table (created before >> the fs lookup function gets called). We should not be processing the >> same dentry that is being looked up, hence, in case of case-insensitive >> filesystems we are making it a case-exact match to prevent this from >> happening. > NAK. What you are doing would lead to parallel calls of ->lookup() in the > same directory for names that would compare as equal. Which violates > all kinds of assumptions in the analysis of dentry tree locking. > > d_add_ci() is used to force the "exact" spelling of the name on lookup - > that's the whole point of that thing. What are you trying to achieve, > and what's the point of mixing that with non-trivial ->d_compare()? > Sending again as plain text... Hi Al Viro, This patch was added to resolve some of the issues faced in patch 02/02 of the series. Originally, the 'native', per-directory case-insensitive implementation merged in ext4/f2fs stores the case of the first lookup on the dcache, regardless of the disk exact file name case. This gets reflected in symlink returned by /proc/self/cwd. To solve this we are calling d_add_ci from the fs lookup function to store the disk exact name in the dcache even if an inexact-match string is used on the FIRST lookup. But this caused a soft hang since there was a deadlock in d_wait_lookup called from d_alloc_parallel. The reason for the hang is that d_same_name uses d_compare which does a case-insensitive match and is able to find the dentry name in the secondary hash table leading it to d_wait_lookup which would wait for the lookup to finish on that dentry causing a deadlock. To avoid the hang, we are doing a case-sensitive match using dentry_cmp here. Thanks > If it's "force to exact spelling on lookup, avoid calling ->lookup() on > aliases", d_add_ci() is simply not a good match.
diff --git a/fs/dcache.c b/fs/dcache.c index cf871a81f4fd..2a28ab64a165 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2565,6 +2565,15 @@ static void d_wait_lookup(struct dentry *dentry) } } +static inline bool d_same_exact_name(const struct dentry *dentry, + const struct dentry *parent, + const struct qstr *name) +{ + if (dentry->d_name.len != name->len) + return false; + return dentry_cmp(dentry, name->name, name->len) == 0; +} + struct dentry *d_alloc_parallel(struct dentry *parent, const struct qstr *name, wait_queue_head_t *wq) @@ -2575,6 +2584,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, struct dentry *new = d_alloc(parent, name); struct dentry *dentry; unsigned seq, r_seq, d_seq; + int ci_dir = IS_CASEFOLDED(parent->d_inode); if (unlikely(!new)) return ERR_PTR(-ENOMEM); @@ -2626,8 +2636,14 @@ struct dentry *d_alloc_parallel(struct dentry *parent, continue; if (dentry->d_parent != parent) continue; - if (!d_same_name(dentry, parent, name)) - continue; + if (ci_dir) { + if (!d_same_exact_name(dentry, parent, name)) + continue; + } else { + if (!d_same_name(dentry, parent, name)) + continue; + } + hlist_bl_unlock(b); /* now we can try to grab a reference */ if (!lockref_get_not_dead(&dentry->d_lockref)) {
There is a soft hang caused by a deadlock in d_alloc_parallel which waits up on lookups to finish for the dentries in the parent directory's hash_table. In case when d_add_ci is called from the fs layer's lookup functions, the dentry being looked up is already in the hash table (created before the fs lookup function gets called). We should not be processing the same dentry that is being looked up, hence, in case of case-insensitive filesystems we are making it a case-exact match to prevent this from happening. Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> --- fs/dcache.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)