[net,2/2] rhashtable: remove indirection for grow/shrink decision functions

Message ID cd3e70163920753c7f1a492e6df253cff9b1b080.1424877322.git.daniel@iogearbox.net
State Accepted, archived
Delegated to: David Miller

Commit Message

Daniel Borkmann Feb. 25, 2015, 3:31 p.m. UTC
Currently, all real users of rhashtable default their grow and shrink
decision functions to rht_grow_above_75() and rht_shrink_below_30(),
so there is no need to have this explicitly selectable.

It can/should be generic and private inside rhashtable until a real
use case pops up. Since we can make this private, we save an
additional indirection layer and can improve insertion/deletion time
as well.

Reference: http://patchwork.ozlabs.org/patch/443040/
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 include/linux/rhashtable.h | 13 -----------
 lib/rhashtable.c           | 56 ++++++++++++++--------------------------------
 lib/test_rhashtable.c      |  1 +
 net/netfilter/nft_hash.c   |  2 --
 net/netlink/af_netlink.c   |  2 --
 net/tipc/socket.c          |  2 --
 6 files changed, 18 insertions(+), 58 deletions(-)

Comments

David Laight Feb. 25, 2015, 3:41 p.m. UTC | #1
From: Daniel Borkmann
> Currently, all real users of rhashtable default their grow and shrink
> decision functions to rht_grow_above_75() and rht_shrink_below_30(),
> so there is no need to have this explicitly selectable.
> 
> It can/should be generic and private inside rhashtable until a real
> use case pops up. Since we can make this private, we save an
> additional indirection layer and can improve insertion/deletion time
> as well.
...
> +static bool rht_grow_above_75(const struct rhashtable *ht, size_t new_size)
>  {
>  	/* Expand table when exceeding 75% load */
>  	return atomic_read(&ht->nelems) > (new_size / 4 * 3) &&
>  	       (!ht->p.max_shift || atomic_read(&ht->shift) < ht->p.max_shift);
>  }

Why not cache the 'number of items before we need to expand' value
after each expansion, setting it to 'infinite' when expansion is disabled?
Then the above check is a simple comparison.
You probably don't even need an atomic_read() - provided something is
double-checked once the first test determines that an expansion is needed.

	David
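
A minimal sketch of this cached-threshold idea; the grow_at field and
the rht_recompute_grow_at() helper are hypothetical, not part of the
rhashtable API:

static void rht_recompute_grow_at(struct rhashtable *ht, size_t new_size)
{
	/* Recomputed only after each expand/shrink, never on the
	 * insert fast path. */
	if (ht->p.max_shift && atomic_read(&ht->shift) >= ht->p.max_shift)
		ht->grow_at = SIZE_MAX;		/* expansion disabled */
	else
		ht->grow_at = new_size / 4 * 3;	/* 75% load watermark */
}

static inline bool rht_needs_expand(const struct rhashtable *ht)
{
	/* A single comparison; a racy read is acceptable if the
	 * decision is double-checked under ht->mutex before resizing. */
	return atomic_read(&ht->nelems) > ht->grow_at;
}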

Daniel Borkmann Feb. 25, 2015, 3:46 p.m. UTC | #2
On 02/25/2015 04:41 PM, David Laight wrote:
...
> Why not cache the 'number of items before we need to expand' value
> after each expansion, setting it to 'infinite' when expansion is disabled?
> Then the above check is a simple comparison.
> You probably don't even need an atomic_read() - provided something is
> double-checked once the first test determines that an expansion is needed.

Did you read my cover letter? ;)
David Laight Feb. 25, 2015, 3:51 p.m. UTC | #3
From: Daniel Borkmann
> On 02/25/2015 04:41 PM, David Laight wrote:
> ...
> > Why not cache the 'number of items before we need to expand' value
> > after each expansion, setting it to 'infinite' when expansion is disabled?
> > Then the above check is a simple comparison.
> > You probably don't even need an atomic_read() - provided something is
> > double-checked once the first test determines that an expansion is needed.
> 
> Did you read my cover letter? ;)

I probably didn't infer the relevant references :-)

Actually, even if some code wanted other rules, provided they
are based on comparing the 'number of items' to a preset limit,
the limit could be set during expansion/contraction.

	David

Daniel Borkmann Feb. 25, 2015, 4:14 p.m. UTC | #4
On 02/25/2015 04:51 PM, David Laight wrote:
> From: Daniel Borkmann
>> On 02/25/2015 04:41 PM, David Laight wrote:
>> ...
>>> Why not cache the 'number of items before we need to expand' value
>>> after each expansion, setting it to 'infinite' when expansion is disabled?
>>> Then the above check is a simple comparison.
>>> You probably don't even need an atomic_read() - provided something is
>>> double-checked once the first test determines that an expansion is needed.
>>
>> Did you read my cover letter? ;)
>
> I probably didn't infer the relevant references :-)
>
> Actually, even if some code wanted other rules, provided they
> are based on comparing the 'number of items' to a preset limit,
> the limit could be set during expansion/contraction.

The shift itself doesn't need to be atomic: if we store it in
the table directly instead of in the rhashtable, it is set once
on allocation and can stay immutable for the table's lifetime.

The nelems count can be a percpu_counter with a batched sync
point, and we can make that batch dependent on the current table
shift, as we don't need to be overly precise; we can just read
the synced value. Currently, nelems is also being used by
rhashtable users outside of the core code to track if we are
still allowed to insert new elements, but I think we might also
want to address that at some point.
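
For illustration, a rough sketch of the above with the kernel's
percpu_counter API; turning ht->nelems into a struct percpu_counter is
hypothetical, not something any posted patch does:

#include <linux/percpu_counter.h>

static int example_nelems_init(struct rhashtable *ht)
{
	/* Pairs with percpu_counter_destroy() on teardown. */
	return percpu_counter_init(&ht->nelems, 0, GFP_KERNEL);
}

static void example_count_insert(struct rhashtable *ht)
{
	/* Cheap per-cpu increment, folded into the global count in
	 * batches; a shift-dependent batch size could instead be
	 * passed via __percpu_counter_add(&ht->nelems, 1, batch). */
	percpu_counter_inc(&ht->nelems);
}

static bool example_grow_above_75(struct rhashtable *ht, size_t new_size)
{
	/* An approximate, batched read is good enough here. */
	return percpu_counter_read_positive(&ht->nelems) > new_size / 4 * 3;
}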
Thomas Graf Feb. 25, 2015, 5:23 p.m. UTC | #5
On 02/25/15 at 04:31pm, Daniel Borkmann wrote:
> Currently, all real users of rhashtable default their grow and shrink
> decision functions to rht_grow_above_75() and rht_shrink_below_30(),
> so that there's currently no need to have this explicitly selectable.
> 
> It can/should be generic and private inside rhashtable until a real
> use case pops up. Since we can make this private, we'll save us this
> additional indirection layer and can improve insertion/deletion time
> as well.
> 
> Reference: http://patchwork.ozlabs.org/patch/443040/
> Suggested-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Nice

Acked-by: Thomas Graf <tgraf@suug.ch>
Eric Dumazet Feb. 25, 2015, 5:41 p.m. UTC | #6
On Wed, 2015-02-25 at 17:14 +0100, Daniel Borkmann wrote:

> The nelems count can be a percpu_counter with a batched sync
> point, and we can make that batch dependent on the current table
> shift, as we don't need to be overly precise; we can just read
> the synced value. Currently, nelems is also being used by
> rhashtable users outside of the core code to track if we are
> still allowed to insert new elements, but I think we might also
> want to address that at some point.

There is no need to track the number of elements, using either an
atomic_t or a percpu_counter. This adds unnecessary burden.

1) Automatic shrinking is a non-issue. This will free very little
memory compared to previous peak usage (including objects put in
the rhashtable). If the hash has grown to a certain point, it's
likely it will grow again later.

2) Growing can be triggered when any bucket has more than X elems, and
that is given for free at insert time.
X could be log2(buckets)/2 I guess (aka shift/2).

A global limit on the number of elements should be controlled by
rhashtable users - if needed - not by the rhashtable itself.
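
A sketch of that insert-time trigger; the function name is illustrative
and the chain walk reuses rhashtable's rht_for_each():

static bool example_bucket_too_long(const struct rhashtable *ht,
				    const struct bucket_table *tbl,
				    u32 hash)
{
	struct rhash_head *pos;
	unsigned int len = 0;

	/* hash is the bucket index (see rht_bucket_index()); insert
	 * already walks this chain, so the length comes for free. */
	rht_for_each(pos, tbl, hash)
		len++;

	/* X = log2(buckets) / 2, i.e. shift / 2 */
	return len > atomic_read(&ht->shift) / 2;
}

/* After linking the new element:
 *	if (example_bucket_too_long(ht, tbl, hash))
 *		schedule_work(&ht->run_work);
 */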



Eric Dumazet Feb. 25, 2015, 6:15 p.m. UTC | #7
On Wed, 2015-02-25 at 17:49 +0000, David Laight wrote:

> That won't work if someone is managing to break the hash function.
> You need some approximation to the total count as well.
> OTOH there is no point growing the table if the current list isn't 'full'.

Well, if the hash function is broken, having 1,000,000 elements in one
bucket will kill your host, even if the hash table has 1 million buckets.

Better find this out sooner than later.



Daniel Borkmann Feb. 25, 2015, 6:56 p.m. UTC | #8
On 02/25/2015 06:41 PM, Eric Dumazet wrote:
...
> There is no need to track the number of elements, using either an
> atomic_t or a percpu_counter. This adds unnecessary burden.
>
> 1) Automatic shrinking is a non-issue. This will free very little
> memory compared to previous peak usage (including objects put in
> the rhashtable). If the hash has grown to a certain point, it's
> likely it will grow again later.

So you are saying that shrinking is most likely a rather undesirable
use-case in rhashtable?

> 2) Growing can be triggered when any bucket has more than X elems, and
> that is given for free at insert time.
> X could be log2(buckets)/2 I guess (aka shift/2).

Yes, I think such a facility can be added easily.

> A global limit on the number of elements should be controlled by
> rhashtable users - if needed - not by the rhashtable itself.

Makes sense, thanks for the suggestion! I'll experiment with the above
per-bucket tracking a bit if you're okay with that, and also untangle
the rht counter internals from netlink.

Thanks,
Daniel
Eric Dumazet Feb. 25, 2015, 7:52 p.m. UTC | #9
On Wed, 2015-02-25 at 19:56 +0100, Daniel Borkmann wrote:
> On 02/25/2015 06:41 PM, Eric Dumazet wrote:
> ...
> > There is no need to track the number of elements, using either an
> > atomic_t or a percpu_counter. This adds unnecessary burden.
> >
> > 1) Automatic shrinking is a non-issue. This will free very little
> > memory compared to previous peak usage (including objects put in
> > the rhashtable). If the hash has grown to a certain point, it's
> > likely it will grow again later.
> 
> So you are saying that shrinking is most likely a rather undesirable
> use-case in rhashtable?

The main point of rhashtable is to start with a small/reasonable size,
and only big consumers need to _expand_ the table.

Like, instead of having 512000 slots in the TCP hash table, start with
a 2^10 size.

Like my laptop currently has:

TCP established hash table entries: 131072 (order: 8, 1048576 bytes)

Which is kind of ridiculous...

But if any workload had to grow the table to 2^20 slots, we'd have to
consume GBs of memory anyway to hold sockets and everything.

Trying to shrink is simply not worth it, unless you expect your host
never to reboot and you desperately need back these 8 MBytes of memory.
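
For scale, a back-of-the-envelope check of those numbers (assuming
8-byte bucket pointers on a 64-bit host):

	131072 buckets * 8 bytes = 1048576 bytes = 256 pages (order 8)
	2^20 slots * 8 bytes = 8388608 bytes, the 8 MBytes named above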


Patrick McHardy Feb. 25, 2015, 8:10 p.m. UTC | #10
On 25.02, Eric Dumazet wrote:
> On Wed, 2015-02-25 at 19:56 +0100, Daniel Borkmann wrote:
> > On 02/25/2015 06:41 PM, Eric Dumazet wrote:
> > ...
> > > There is no need to track number of elements, using either an atomic_t
> > > or percpu_counter. This adds unnecessary burden.
> > >
> > > 1) Automatic shrinking is a non issue. This will free very little
> > > memory, compared to previous peak usage (including objects put in
> > > rhashtable). If hash grown to a certain point, it's likely it will grow
> > > again later.
> > 
> > So you are saying that shrinking is most likely a rather undesirable
> > use-case in rhashtable?
> 
> The main point of rhashtable is to start with a small/reasonable size,
> and only big consumers need to _expand_ the table.
> 
> Like, instead of having 512000 slots in the TCP hash table, start with
> a 2^10 size.
> 
> Like my laptop currently has:
> 
> TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
> 
> Which is kind of ridiculous...
> 
> But if any workload had to grow the table to 2^20 slots, we'd have to
> consume GBs of memory anyway to hold sockets and everything.
> 
> Trying to shrink is simply not worth it, unless you expect your host
> never to reboot and you desperately need back these 8 MBytes of memory.

That may be true in the TCP case, but not for nftables. We might
have many sets and, especially when used to represent more complicated
classification algorithms, their size might change by a lot.
Thomas Graf Feb. 26, 2015, 10:02 a.m. UTC | #11
On 02/25/15 at 09:41am, Eric Dumazet wrote:
> There is no need to track the number of elements, using either an
> atomic_t or a percpu_counter. This adds unnecessary burden.
> 
> 1) Automatic shrinking is a non-issue. This will free very little
> memory compared to previous peak usage (including objects put in
> the rhashtable). If the hash has grown to a certain point, it's
> likely it will grow again later.
> 
> 2) Growing can be triggered when any bucket has more than X elems, and
> that is given for free at insert time.
> X could be log2(buckets)/2 I guess (aka shift/2).
> 
> A global limit on the number of elements should be controlled by
> rhashtable users - if needed - not by the rhashtable itself.

I don't think we need to kill shrink to get rid of the element
counters. It looks like all users of shrink could trigger it manually
by maintaining their own element counter; we just have to export the
shrink function.

I just want to point out here that TCP is not the only future
use case. The station table for mac80211 has recently been
converted. We need to keep all of them on the radar.
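
A sketch of the caller side, assuming rhashtable_shrink() (declared in
the header below) were exported; the wrapper struct and watermark are
made up for the example:

struct my_set {
	struct rhashtable ht;
	atomic_t nelems;	/* user-maintained element count */
};

static void my_set_remove(struct my_set *set, struct rhash_head *node)
{
	rhashtable_remove(&set->ht, node);

	/* Arbitrary per-user policy; taking ht->mutex around the
	 * resize, as rhashtable requires, is elided for brevity. */
	if (atomic_dec_return(&set->nelems) < 64)
		rhashtable_shrink(&set->ht);	/* would need exporting */
}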
David Laight Feb. 26, 2015, 2:18 p.m. UTC | #12
> 1) Automatic shrinking is a non-issue. This will free very little
> memory compared to previous peak usage (including objects put in
> the rhashtable). If the hash has grown to a certain point, it's
> likely it will grow again later.

Shrink could be done 'on demand' from the 'user' when the number of
elements is known to have been significantly reduced.

> 2) Growing can be triggered when any bucket has more than X elems, and
> that is given for free at insert time.
> X could be log2(buckets)/2 I guess (aka shift/2).

If growing is triggered when a bucket has more than X elems, then setting
X very high will stop automatic growing.
Growing would then only be done in response to user requests.

Whether X should be constant or log2(buckets)/n is another matter.
Or something like N1 + (N2 * log2(buckets))/64; in any case it
only needs recalculating after an auto-grow.

	David
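
For concreteness, a sketch of that recalculation; N1 and N2 are
invented tuning values, not from any patch:

static unsigned int example_bucket_limit(unsigned int shift)
{
	const unsigned int N1 = 4, N2 = 16;	/* made-up constants */

	/* shift == log2(buckets); only recomputed after an auto-grow */
	return N1 + (N2 * shift) / 64;
}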
David Miller Feb. 27, 2015, 10:30 p.m. UTC | #13
From: "tgraf@suug.ch" <tgraf@suug.ch>
Date: Thu, 26 Feb 2015 10:02:35 +0000

> I just want to point out here that TCP is not the only future
> use case. The station table for mac80211 has recently been
> converted. We need to keep all of them on the radar.

Just wanted to note in passing that I wonder how legal
net/core/neighbour.c:neigh_hash_grow() is.  It's growing an RCU
hashtable without using rhashtable. :-)

Patrick McHardy Feb. 28, 2015, 12:48 a.m. UTC | #14
On 27.02, David Miller wrote:
> From: "tgraf@suug.ch" <tgraf@suug.ch>
> Date: Thu, 26 Feb 2015 10:02:35 +0000
> 
> > I just want to point out here that TCP is not the only future
> > use case. The station table for mac80211 has recently been
> > converted. We need to keep all of them on the radar.
> 
> Just wanted to note in passing that I wonder how legal
> net/core/neighbour.c:neigh_hash_grow() is.  It's growing an RCU
> hashtable without using rhashtable. :-)

I would like to note that it seems premature to convert all these
users without having the fundamental problems solved, specifically
the unbounded growth of hash chains with async resizing.

It seems to me that we have enough use cases right now that problems
have become quite visible, and until we're sure they're actually
solvable for all these different use cases I'd advocate for not
moving more subsystems to use this.
David Miller Feb. 28, 2015, 1:35 a.m. UTC | #15
From: Patrick McHardy <kaber@trash.net>
Date: Sat, 28 Feb 2015 00:48:40 +0000

> It seems to me that we have enough use cases right now that problems
> have become quite visible, and until we're sure they're actually
> solvable for all these different use cases I'd advocate for not
> moving more subsystems to use this.

No objections.
Herbert Xu March 11, 2015, 6:42 a.m. UTC | #16
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2015-02-25 at 17:14 +0100, Daniel Borkmann wrote:
> 
>> The nelems count can be a percpu_counter with a batched sync
>> point, and we can make that batch dependent on the current table
>> shift, as we don't need to be overly precise; we can just read
>> the synced value. Currently, nelems is also being used by
>> rhashtable users outside of the core code to track if we are
>> still allowed to insert new elements, but I think we might also
>> want to address that at some point.
> 
> There is no need to track the number of elements, using either an
> atomic_t or a percpu_counter. This adds unnecessary burden.
> 
> 1) Automatic shrinking is a non-issue. This will free very little
> memory compared to previous peak usage (including objects put in
> the rhashtable). If the hash has grown to a certain point, it's
> likely it will grow again later.
> 
> 2) Growing can be triggered when any bucket has more than X elems, and
> that is given for free at insert time.
> X could be log2(buckets)/2 I guess (aka shift/2).
> 
> A global limit on the number of elements should be controlled by
> rhashtable users - if needed - not by the rhashtable itself.

I agree in the strongest terms :)

Cheers,
Thomas Graf March 12, 2015, 4:57 p.m. UTC | #17
On 03/11/15 at 05:42pm, Herbert Xu wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > There is no need to track the number of elements, using either an
> > atomic_t or a percpu_counter. This adds unnecessary burden.
> > 
> > 1) Automatic shrinking is a non-issue. This will free very little
> > memory compared to previous peak usage (including objects put in
> > the rhashtable). If the hash has grown to a certain point, it's
> > likely it will grow again later.
> > 
> > 2) Growing can be triggered when any bucket has more than X elems, and
> > that is given for free at insert time.
> > X could be log2(buckets)/2 I guess (aka shift/2).
> > 
> > A global limit on the number of elements should be controlled by
> > rhashtable users - if needed - not by the rhashtable itself.
> 
> I agree in the strongest terms :)

I agree that the max elements should be enforced by users as done by
Netlink right now. I'm also fine with ditching nelems from rhashtable
as long as shrinking is covered. Patrick mentioned several times by
now that nft sets want to have shrinking.
Herbert Xu March 13, 2015, 7:06 a.m. UTC | #18
On Thu, Mar 12, 2015 at 04:57:33PM +0000, Thomas Graf wrote:
>
> I agree that the max elements should be enforced by users as done by
> Netlink right now. I'm also fine with ditching nelems from rhashtable
> as long as shrinking is covered. Patrick mentioned several times by
> now that nft sets want to have shrinking.

My plan is to have expansions triggered solely through insert,
either immediate or delayed, and shrinking will only be done
when explicitly requested by the user.

Cheers,
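
A sketch of that policy; the inline-expansion predicate is hypothetical
and locking is elided:

static void example_insert_done(struct rhashtable *ht,
				struct bucket_table *tbl)
{
	if (!rht_grow_above_75(ht, tbl->size))
		return;

	if (example_can_expand_inline(ht))	/* hypothetical check */
		rhashtable_expand(ht);		/* immediate */
	else
		schedule_work(&ht->run_work);	/* delayed */
}

/* Shrinking would happen only via an explicit rhashtable_shrink()
 * call from the user. */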

Patch

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index cb2104b..d438eeb 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -79,12 +79,6 @@  struct rhashtable;
  * @locks_mul: Number of bucket locks to allocate per cpu (default: 128)
  * @hashfn: Function to hash key
  * @obj_hashfn: Function to hash object
- * @grow_decision: If defined, may return true if table should expand
- * @shrink_decision: If defined, may return true if table should shrink
- *
- * Note: when implementing the grow and shrink decision function, min/max
- * shift must be enforced, otherwise, resizing watermarks they set may be
- * useless.
  */
 struct rhashtable_params {
 	size_t			nelem_hint;
@@ -98,10 +92,6 @@  struct rhashtable_params {
 	size_t			locks_mul;
 	rht_hashfn_t		hashfn;
 	rht_obj_hashfn_t	obj_hashfn;
-	bool			(*grow_decision)(const struct rhashtable *ht,
-						 size_t new_size);
-	bool			(*shrink_decision)(const struct rhashtable *ht,
-						   size_t new_size);
 };
 
 /**
@@ -193,9 +183,6 @@  int rhashtable_init(struct rhashtable *ht, struct rhashtable_params *params);
 void rhashtable_insert(struct rhashtable *ht, struct rhash_head *node);
 bool rhashtable_remove(struct rhashtable *ht, struct rhash_head *node);
 
-bool rht_grow_above_75(const struct rhashtable *ht, size_t new_size);
-bool rht_shrink_below_30(const struct rhashtable *ht, size_t new_size);
-
 int rhashtable_expand(struct rhashtable *ht);
 int rhashtable_shrink(struct rhashtable *ht);
 
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index bcf119b..090641d 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -247,26 +247,24 @@  static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
  * @ht:		hash table
  * @new_size:	new table size
  */
-bool rht_grow_above_75(const struct rhashtable *ht, size_t new_size)
+static bool rht_grow_above_75(const struct rhashtable *ht, size_t new_size)
 {
 	/* Expand table when exceeding 75% load */
 	return atomic_read(&ht->nelems) > (new_size / 4 * 3) &&
 	       (!ht->p.max_shift || atomic_read(&ht->shift) < ht->p.max_shift);
 }
-EXPORT_SYMBOL_GPL(rht_grow_above_75);
 
 /**
  * rht_shrink_below_30 - returns true if nelems < 0.3 * table-size
  * @ht:		hash table
  * @new_size:	new table size
  */
-bool rht_shrink_below_30(const struct rhashtable *ht, size_t new_size)
+static bool rht_shrink_below_30(const struct rhashtable *ht, size_t new_size)
 {
 	/* Shrink table beneath 30% load */
 	return atomic_read(&ht->nelems) < (new_size * 3 / 10) &&
 	       (atomic_read(&ht->shift) > ht->p.min_shift);
 }
-EXPORT_SYMBOL_GPL(rht_shrink_below_30);
 
 static void lock_buckets(struct bucket_table *new_tbl,
 			 struct bucket_table *old_tbl, unsigned int hash)
@@ -528,40 +526,19 @@  static void rht_deferred_worker(struct work_struct *work)
 	list_for_each_entry(walker, &ht->walkers, list)
 		walker->resize = true;
 
-	if (ht->p.grow_decision && ht->p.grow_decision(ht, tbl->size))
+	if (rht_grow_above_75(ht, tbl->size))
 		rhashtable_expand(ht);
-	else if (ht->p.shrink_decision && ht->p.shrink_decision(ht, tbl->size))
+	else if (rht_shrink_below_30(ht, tbl->size))
 		rhashtable_shrink(ht);
-
 unlock:
 	mutex_unlock(&ht->mutex);
 }
 
-static void rhashtable_probe_expand(struct rhashtable *ht)
-{
-	const struct bucket_table *new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
-	const struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
-
-	/* Only adjust the table if no resizing is currently in progress. */
-	if (tbl == new_tbl && ht->p.grow_decision &&
-	    ht->p.grow_decision(ht, tbl->size))
-		schedule_work(&ht->run_work);
-}
-
-static void rhashtable_probe_shrink(struct rhashtable *ht)
-{
-	const struct bucket_table *new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
-	const struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
-
-	/* Only adjust the table if no resizing is currently in progress. */
-	if (tbl == new_tbl && ht->p.shrink_decision &&
-	    ht->p.shrink_decision(ht, tbl->size))
-		schedule_work(&ht->run_work);
-}
-
 static void __rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj,
-				struct bucket_table *tbl, u32 hash)
+				struct bucket_table *tbl,
+				const struct bucket_table *old_tbl, u32 hash)
 {
+	bool no_resize_running = tbl == old_tbl;
 	struct rhash_head *head;
 
 	hash = rht_bucket_index(tbl, hash);
@@ -577,8 +554,8 @@  static void __rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj,
 	rcu_assign_pointer(tbl->buckets[hash], obj);
 
 	atomic_inc(&ht->nelems);
-
-	rhashtable_probe_expand(ht);
+	if (no_resize_running && rht_grow_above_75(ht, tbl->size))
+		schedule_work(&ht->run_work);
 }
 
 /**
@@ -608,7 +585,7 @@  void rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj)
 	hash = obj_raw_hashfn(ht, rht_obj(ht, obj));
 
 	lock_buckets(tbl, old_tbl, hash);
-	__rhashtable_insert(ht, obj, tbl, hash);
+	__rhashtable_insert(ht, obj, tbl, old_tbl, hash);
 	unlock_buckets(tbl, old_tbl, hash);
 
 	rcu_read_unlock();
@@ -690,8 +667,11 @@  found:
 	unlock_buckets(new_tbl, old_tbl, new_hash);
 
 	if (ret) {
+		bool no_resize_running = new_tbl == old_tbl;
+
 		atomic_dec(&ht->nelems);
-		rhashtable_probe_shrink(ht);
+		if (no_resize_running && rht_shrink_below_30(ht, new_tbl->size))
+			schedule_work(&ht->run_work);
 	}
 
 	rcu_read_unlock();
@@ -861,7 +841,7 @@  bool rhashtable_lookup_compare_insert(struct rhashtable *ht,
 		goto exit;
 	}
 
-	__rhashtable_insert(ht, obj, new_tbl, new_hash);
+	__rhashtable_insert(ht, obj, new_tbl, old_tbl, new_hash);
 
 exit:
 	unlock_buckets(new_tbl, old_tbl, new_hash);
@@ -1123,8 +1103,7 @@  int rhashtable_init(struct rhashtable *ht, struct rhashtable_params *params)
 	if (!ht->p.hash_rnd)
 		get_random_bytes(&ht->p.hash_rnd, sizeof(ht->p.hash_rnd));
 
-	if (ht->p.grow_decision || ht->p.shrink_decision)
-		INIT_WORK(&ht->run_work, rht_deferred_worker);
+	INIT_WORK(&ht->run_work, rht_deferred_worker);
 
 	return 0;
 }
@@ -1142,8 +1121,7 @@  void rhashtable_destroy(struct rhashtable *ht)
 {
 	ht->being_destroyed = true;
 
-	if (ht->p.grow_decision || ht->p.shrink_decision)
-		cancel_work_sync(&ht->run_work);
+	cancel_work_sync(&ht->run_work);
 
 	mutex_lock(&ht->mutex);
 	bucket_table_free(rht_dereference(ht->tbl, ht));
diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index f9e9d73..67c7593 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -201,6 +201,7 @@  static int __init test_rht_init(void)
 		.key_offset = offsetof(struct test_obj, value),
 		.key_len = sizeof(int),
 		.hashfn = jhash,
+		.max_shift = 1, /* we expand/shrink manually here */
 		.nulls_base = (3U << RHT_BASE_SHIFT),
 	};
 	int err;
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 61e6c40..c82df0a 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -192,8 +192,6 @@  static int nft_hash_init(const struct nft_set *set,
 		.key_offset = offsetof(struct nft_hash_elem, key),
 		.key_len = set->klen,
 		.hashfn = jhash,
-		.grow_decision = rht_grow_above_75,
-		.shrink_decision = rht_shrink_below_30,
 	};
 
 	return rhashtable_init(priv, &params);
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 2702673..05919bf 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -3126,8 +3126,6 @@  static int __init netlink_proto_init(void)
 		.key_len = sizeof(u32), /* portid */
 		.hashfn = jhash,
 		.max_shift = 16, /* 64K */
-		.grow_decision = rht_grow_above_75,
-		.shrink_decision = rht_shrink_below_30,
 	};
 
 	if (err != 0)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index f73e975..b4d4467 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2364,8 +2364,6 @@  int tipc_sk_rht_init(struct net *net)
 		.hashfn = jhash,
 		.max_shift = 20, /* 1M */
 		.min_shift = 8,  /* 256 */
-		.grow_decision = rht_grow_above_75,
-		.shrink_decision = rht_shrink_below_30,
 	};
 
 	return rhashtable_init(&tn->sk_rht, &rht_params);