Message ID | 20200213065352.6310-1-xiyou.wangcong@gmail.com |
---|---|
State | Accepted |
Delegated to: | Pablo Neira |
Headers | show |
Series | [nf] netfilter: xt_hashlimit: unregister proc file before releasing mutex | expand |
On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > Before releasing the global mutex, we only unlink the hashtable > from the hash list, its proc file is still not unregistered at > this point. So syzbot could trigger a race condition where a > parallel htable_create() could register the same file immediately > after the mutex is released. > > Move htable_remove_proc_entry() back to mutex protection to > fix this. And, fold htable_destroy() into htable_put() to make > the code slightly easier to understand. Probably revert previous one?
On Tue, Feb 18, 2020 at 1:35 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > > Before releasing the global mutex, we only unlink the hashtable > > from the hash list, its proc file is still not unregistered at > > this point. So syzbot could trigger a race condition where a > > parallel htable_create() could register the same file immediately > > after the mutex is released. > > > > Move htable_remove_proc_entry() back to mutex protection to > > fix this. And, fold htable_destroy() into htable_put() to make > > the code slightly easier to understand. > > Probably revert previous one? The hung task could appear again if we move the cleanup back under mutex.
On Tue, Feb 18, 2020 at 01:40:26PM -0800, Cong Wang wrote: > On Tue, Feb 18, 2020 at 1:35 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > > > Before releasing the global mutex, we only unlink the hashtable > > > from the hash list, its proc file is still not unregistered at > > > this point. So syzbot could trigger a race condition where a > > > parallel htable_create() could register the same file immediately > > > after the mutex is released. > > > > > > Move htable_remove_proc_entry() back to mutex protection to > > > fix this. And, fold htable_destroy() into htable_put() to make > > > the code slightly easier to understand. > > > > Probably revert previous one? > > The hung task could appear again if we move the cleanup > back under mutex. How could the hung task appear again by reverting c4a3922d2d20c710f827? Please elaborate. Thanks.
On Tue, Feb 18, 2020 at 2:05 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > On Tue, Feb 18, 2020 at 01:40:26PM -0800, Cong Wang wrote: > > On Tue, Feb 18, 2020 at 1:35 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > > On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > > > > Before releasing the global mutex, we only unlink the hashtable > > > > from the hash list, its proc file is still not unregistered at > > > > this point. So syzbot could trigger a race condition where a > > > > parallel htable_create() could register the same file immediately > > > > after the mutex is released. > > > > > > > > Move htable_remove_proc_entry() back to mutex protection to > > > > fix this. And, fold htable_destroy() into htable_put() to make > > > > the code slightly easier to understand. > > > > > > Probably revert previous one? > > > > The hung task could appear again if we move the cleanup > > back under mutex. > > How could the hung task appear again by reverting > c4a3922d2d20c710f827? Please elaborate. Because the cfg.max could be as large as 8*HASHLIMIT_MAX_SIZE: 311 if (hinfo->cfg.max == 0) 312 hinfo->cfg.max = 8 * hinfo->cfg.size; 313 else if (hinfo->cfg.max < hinfo->cfg.size) 314 hinfo->cfg.max = hinfo->cfg.size; Not sure whether we can finish cleaning up 8*HASHLIMIT_MAX_SIZE entries within the time a hung task tolerates. This largely depends on how much contention the spinlock has, at least I don't want to bet on it. Thanks.
On Wed, Feb 19, 2020 at 07:32:13PM -0800, Cong Wang wrote: > On Tue, Feb 18, 2020 at 2:05 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > On Tue, Feb 18, 2020 at 01:40:26PM -0800, Cong Wang wrote: > > > On Tue, Feb 18, 2020 at 1:35 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > > > > On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > > > > > Before releasing the global mutex, we only unlink the hashtable > > > > > from the hash list, its proc file is still not unregistered at > > > > > this point. So syzbot could trigger a race condition where a > > > > > parallel htable_create() could register the same file immediately > > > > > after the mutex is released. > > > > > > > > > > Move htable_remove_proc_entry() back to mutex protection to > > > > > fix this. And, fold htable_destroy() into htable_put() to make > > > > > the code slightly easier to understand. > > > > > > > > Probably revert previous one? > > > > > > The hung task could appear again if we move the cleanup > > > back under mutex. > > > > How could the hung task appear again by reverting > > c4a3922d2d20c710f827? Please elaborate. > > Because the cfg.max could be as large as 8*HASHLIMIT_MAX_SIZE: > > 311 if (hinfo->cfg.max == 0) > 312 hinfo->cfg.max = 8 * hinfo->cfg.size; > 313 else if (hinfo->cfg.max < hinfo->cfg.size) > 314 hinfo->cfg.max = hinfo->cfg.size; > > Not sure whether we can finish cleaning up 8*HASHLIMIT_MAX_SIZE > entries within the time a hung task tolerates. This largely depends on > how much contention the spinlock has, at least I don't want to bet > on it. Please, resend. Thanks.
On Wed, Feb 26, 2020 at 07:11:06PM +0100, Pablo Neira Ayuso wrote: > On Wed, Feb 19, 2020 at 07:32:13PM -0800, Cong Wang wrote: > > On Tue, Feb 18, 2020 at 2:05 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > > On Tue, Feb 18, 2020 at 01:40:26PM -0800, Cong Wang wrote: > > > > On Tue, Feb 18, 2020 at 1:35 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > > > > > > > On Wed, Feb 12, 2020 at 10:53:52PM -0800, Cong Wang wrote: > > > > > > Before releasing the global mutex, we only unlink the hashtable > > > > > > from the hash list, its proc file is still not unregistered at > > > > > > this point. So syzbot could trigger a race condition where a > > > > > > parallel htable_create() could register the same file immediately > > > > > > after the mutex is released. > > > > > > > > > > > > Move htable_remove_proc_entry() back to mutex protection to > > > > > > fix this. And, fold htable_destroy() into htable_put() to make > > > > > > the code slightly easier to understand. > > > > > > > > > > Probably revert previous one? > > > > > > > > The hung task could appear again if we move the cleanup > > > > back under mutex. > > > > > > How could the hung task appear again by reverting > > > c4a3922d2d20c710f827? Please elaborate. > > > > Because the cfg.max could be as large as 8*HASHLIMIT_MAX_SIZE: > > > > 311 if (hinfo->cfg.max == 0) > > 312 hinfo->cfg.max = 8 * hinfo->cfg.size; > > 313 else if (hinfo->cfg.max < hinfo->cfg.size) > > 314 hinfo->cfg.max = hinfo->cfg.size; > > > > Not sure whether we can finish cleaning up 8*HASHLIMIT_MAX_SIZE > > entries within the time a hung task tolerates. This largely depends on > > how much contention the spinlock has, at least I don't want to bet > > on it. > > Please, resend. Thanks. Sorry, I meant, applied, thanks.
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c index 7a2c4b8408c4..8c835ad63729 100644 --- a/net/netfilter/xt_hashlimit.c +++ b/net/netfilter/xt_hashlimit.c @@ -402,15 +402,6 @@ static void htable_remove_proc_entry(struct xt_hashlimit_htable *hinfo) remove_proc_entry(hinfo->name, parent); } -static void htable_destroy(struct xt_hashlimit_htable *hinfo) -{ - cancel_delayed_work_sync(&hinfo->gc_work); - htable_remove_proc_entry(hinfo); - htable_selective_cleanup(hinfo, true); - kfree(hinfo->name); - vfree(hinfo); -} - static struct xt_hashlimit_htable *htable_find_get(struct net *net, const char *name, u_int8_t family) @@ -432,8 +423,13 @@ static void htable_put(struct xt_hashlimit_htable *hinfo) { if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) { hlist_del(&hinfo->node); + htable_remove_proc_entry(hinfo); mutex_unlock(&hashlimit_mutex); - htable_destroy(hinfo); + + cancel_delayed_work_sync(&hinfo->gc_work); + htable_selective_cleanup(hinfo, true); + kfree(hinfo->name); + vfree(hinfo); } }
Before releasing the global mutex, we only unlink the hashtable from the hash list, its proc file is still not unregistered at this point. So syzbot could trigger a race condition where a parallel htable_create() could register the same file immediately after the mutex is released. Move htable_remove_proc_entry() back to mutex protection to fix this. And, fold htable_destroy() into htable_put() to make the code slightly easier to understand. Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com Fixes: c4a3922d2d20 ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()") Cc: Florian Westphal <fw@strlen.de> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> --- net/netfilter/xt_hashlimit.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-)