diff mbox

rhashtable: Fix potential crash on destroy in rhashtable_shrink

Message ID 20150131093637.GA29106@gondor.apana.org.au
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Herbert Xu Jan. 31, 2015, 9:36 a.m. UTC
The current being_destroyed check in rhashtable_expand is not
enough since if we start a shrinking process after freeing all
elements in the table that's also going to crash.

This patch adds a being_destroyed check to the deferred worker
thread so that we bail out as soon as we take the lock.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Comments

Thomas Graf Jan. 31, 2015, 11:16 a.m. UTC | #1
On 01/31/15 at 08:36pm, Herbert Xu wrote:
> The current being_destroyed check in rhashtable_expand is not
> enough since if we start a shrinking process after freeing all
> elements in the table that's also going to crash.

(The check in expand() is just an optimization to drop out of
 work cycles if it does not make sense to continue anymore.)

> 
> This patch adds a being_destroyed check to the deferred worker
> thread so that we bail out as soon as we take the lock.

Shouldn't the cancel_work_sync() in rhashtable_destroy() block
until the deferred worker is done and cancelled?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu Jan. 31, 2015, 11:22 a.m. UTC | #2
On Sat, Jan 31, 2015 at 11:16:52AM +0000, Thomas Graf wrote:
> On 01/31/15 at 08:36pm, Herbert Xu wrote:
> > The current being_destroyed check in rhashtable_expand is not
> > enough since if we start a shrinking process after freeing all
> > elements in the table that's also going to crash.
> 
> (The check in expand() is just an optimization to drop out of
>  work cycles if it does not make sense to continue anymore.)
> 
> > 
> > This patch adds a being_destroyed check to the deferred worker
> > thread so that we bail out as soon as we take the lock.
> 
> Shouldn't the cancel_work_sync() in rhashtable_destroy() block
> until the deferred worker is done and cancelled?

That's too late.  nft_hash will have freed all the elements
before rhashtable_destroy gets called.

Cheers,
Thomas Graf Jan. 31, 2015, 12:15 p.m. UTC | #3
On 01/31/15 at 10:22pm, Herbert Xu wrote:
> That's too late.  nft_hash will have freed all the elements
> before rhashtable_destroy gets called.

I see, so this is to accomodate nft_hash which doesn't remove
the elements from the hash but just frees them.

Acked-by: Thomas Graf <tgraf@suug.ch>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ying Xue Feb. 2, 2015, 9:34 a.m. UTC | #4
On 01/31/2015 05:36 PM, Herbert Xu wrote:
> The current being_destroyed check in rhashtable_expand is not
> enough since if we start a shrinking process after freeing all
> elements in the table that's also going to crash.
> 

Sorry, I cannot understand the scenario.

When we free the table in rhashtable_destroy(), we call
cancel_work_sync() to synchronously cancel the work. So, why does your
described crash still happen?

Please give a more explanation.

Thanks,
Ying

> This patch adds a being_destroyed check to the deferred worker
> thread so that we bail out as soon as we take the lock.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 69a4eb0..4c3da1f 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -489,6 +489,9 @@ static void rht_deferred_worker(struct work_struct *work)
>  
>  	ht = container_of(work, struct rhashtable, run_work);
>  	mutex_lock(&ht->mutex);
> +	if (ht->being_destroyed)
> +		goto unlock;
> +
>  	tbl = rht_dereference(ht->tbl, ht);
>  
>  	list_for_each_entry(walker, &ht->walkers, list)
> @@ -499,6 +502,7 @@ static void rht_deferred_worker(struct work_struct *work)
>  	else if (ht->p.shrink_decision && ht->p.shrink_decision(ht, tbl->size))
>  		rhashtable_shrink(ht);
>  
> +unlock:
>  	mutex_unlock(&ht->mutex);
>  }
>  
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Graf Feb. 2, 2015, 9:48 a.m. UTC | #5
On 02/02/15 at 05:34pm, Ying Xue wrote:
> On 01/31/2015 05:36 PM, Herbert Xu wrote:
> > The current being_destroyed check in rhashtable_expand is not
> > enough since if we start a shrinking process after freeing all
> > elements in the table that's also going to crash.
> > 
> 
> Sorry, I cannot understand the scenario.
> 
> When we free the table in rhashtable_destroy(), we call
> cancel_work_sync() to synchronously cancel the work. So, why does your
> described crash still happen?

It's nft_hash specific. nft_hash frees all entries under ht->mutex without
unlinking them upon destroy and sets being_destroyed before doing. Therefore
it must be ensured that a resize is not started afterwards because the table
contains freed entries.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Feb. 3, 2015, 3:19 a.m. UTC | #6
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 31 Jan 2015 20:36:38 +1100

> The current being_destroyed check in rhashtable_expand is not
> enough since if we start a shrinking process after freeing all
> elements in the table that's also going to crash.
> 
> This patch adds a being_destroyed check to the deferred worker
> thread so that we bail out as soon as we take the lock.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied to net-next
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 69a4eb0..4c3da1f 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -489,6 +489,9 @@  static void rht_deferred_worker(struct work_struct *work)
 
 	ht = container_of(work, struct rhashtable, run_work);
 	mutex_lock(&ht->mutex);
+	if (ht->being_destroyed)
+		goto unlock;
+
 	tbl = rht_dereference(ht->tbl, ht);
 
 	list_for_each_entry(walker, &ht->walkers, list)
@@ -499,6 +502,7 @@  static void rht_deferred_worker(struct work_struct *work)
 	else if (ht->p.shrink_decision && ht->p.shrink_decision(ht, tbl->size))
 		rhashtable_shrink(ht);
 
+unlock:
 	mutex_unlock(&ht->mutex);
 }