
[4/8] rhashtable: fix race in nested_table_alloc()

Message ID 152540605432.18473.11813271279255176724.stgit@noble
State Changes Requested, archived
Delegated to: David Miller
Series Assorted rhashtable fixes and cleanups

Commit Message

NeilBrown May 4, 2018, 3:54 a.m. UTC
If two threads run nested_table_alloc() at the same time
they could both allocate a new table.
Best case is that one of them will never be freed, leaking memory.
Worst case is that entries get stored there before it leaks,
and they are lost from the table.

So use cmpxchg to detect the race and free the unused table.

Fixes: da20420f83ea ("rhashtable: Add nested tables")
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: NeilBrown <neilb@suse.com>
---
 lib/rhashtable.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Herbert Xu May 5, 2018, 9:29 a.m. UTC | #1
On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> If two threads run nested_table_alloc() at the same time
> they could both allocate a new table.
> Best case is that one of them will never be freed, leaking memory.
> Worst case is that entries get stored there before it leaks,
> and they are lost from the table.
> 
> So use cmpxchg to detect the race and free the unused table.
> 
> Fixes: da20420f83ea ("rhashtable: Add nested tables")
> Cc: stable@vger.kernel.org # 4.11+
> Signed-off-by: NeilBrown <neilb@suse.com>

What about the spinlock that's meant to be held around this
operation?

Cheers,
NeilBrown May 5, 2018, 9:48 p.m. UTC | #2
On Sat, May 05 2018, Herbert Xu wrote:

> On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
>> If two threads run nested_table_alloc() at the same time
>> they could both allocate a new table.
>> Best case is that one of them will never be freed, leaking memory.
>> Worst case is that entries get stored there before it leaks,
>> and they are lost from the table.
>> 
>> So use cmpxchg to detect the race and free the unused table.
>> 
>> Fixes: da20420f83ea ("rhashtable: Add nested tables")
>> Cc: stable@vger.kernel.org # 4.11+
>> Signed-off-by: NeilBrown <neilb@suse.com>
>
> What about the spinlock that's meant to be held around this
> operation?

The spinlock protects 2 or more buckets.  The nested table contains at
least 512 buckets, maybe more.
It is quite possible for two insertions into 2 different buckets to both
get their spinlock and both try to instantiate the same nested table.

Thanks,
NeilBrown
Herbert Xu May 6, 2018, 5:18 a.m. UTC | #3
On Sun, May 06, 2018 at 07:48:20AM +1000, NeilBrown wrote:
>
> The spinlock protects 2 or more buckets.  The nested table contains at
> least 512 buckets, maybe more.
> It is quite possible for two insertions into 2 different buckets to both
> get their spinlock and both try to instantiate the same nested table.

I think you missed the fact that when we use nested tables the spin
lock table is limited to just a single page and hence corresponds
to the first level in the nested table.  Therefore it's always safe.

Cheers,
NeilBrown May 6, 2018, 10:02 p.m. UTC | #4
On Sun, May 06 2018, Herbert Xu wrote:

> On Sun, May 06, 2018 at 07:48:20AM +1000, NeilBrown wrote:
>>
>> The spinlock protects 2 or more buckets.  The nested table contains at
>> least 512 buckets, maybe more.
>> It is quite possible for two insertions into 2 different buckets to both
>> get their spinlock and both try to instantiate the same nested table.
>
> I think you missed the fact that when we use nested tables the spin
> lock table is limited to just a single page and hence corresponds
> to the first level in the nested table.  Therefore it's always safe.

Yes I had missed that - thanks for pointing it out.
In fact the lock table is limited to the number of nested_tables
in the second level.
And it is the same low-order bits that choose both the lock
and the set of nested tables.
So there isn't a bug here, and we don't need this patch.  (I still like
it though - it seems more obviously correct.)

Thanks,
NeilBrown

Patch

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index b73afe1dec7e..114e6090228a 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -119,6 +119,7 @@  static union nested_table *nested_table_alloc(struct rhashtable *ht,
 					      unsigned int nhash)
 {
 	union nested_table *ntbl;
+	union nested_table *tmp;
 	int i;
 
 	ntbl = rcu_dereference(*prev);
@@ -133,9 +134,12 @@  static union nested_table *nested_table_alloc(struct rhashtable *ht,
 					    (i << shifted) | nhash);
 	}
 
-	rcu_assign_pointer(*prev, ntbl);
-
-	return ntbl;
+	rcu_assign_pointer(tmp, ntbl);
+	if (cmpxchg(prev, NULL, tmp) == NULL)
+		return tmp;
+	/* Raced with another thread. */
+	kfree(ntbl);
+	return rcu_dereference(*prev);
 }
 
 static struct bucket_table *nested_bucket_table_alloc(struct rhashtable *ht,