diff mbox

[net,1/2] rhashtable: unconditionally grow when max_shift is not specified

Message ID 0973a6e9b73a2ca14477c0cf98093d5f1b3d3d40.1424877322.git.daniel@iogearbox.net
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Daniel Borkmann Feb. 25, 2015, 3:31 p.m. UTC
While commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for
worker queue") rightfully moved part of the decision making of
whether we should expand or shrink from the expand/shrink functions
themselves into insert/delete functions in order to avoid unnecessary
worker wake-ups, it however introduced a regression by doing so.

Before that change, if no max_shift was specified (= 0) on rhashtable
initialization, rhashtable_expand() would just grow unconditionally
and lets the available memory be the limiting factor. After that
change, if no max_shift was specified, there would be _no_ expansion
step at all.

Given that netlink and tipc have a max_shift specified, it was not
visible there, but Josh Hunt reported that if nft that starts out
with a default element hint of 3 if not otherwise provided, would
slow i.e. inserts down trememdously as it cannot grow larger to
relax table occupancy.

Given that the test case verifies shrinks/expands manually, we also
must remove pointer to the helper functions to explicitly avoid
parallel resizing on insertions/deletions. test_bucket_stats() and
test_rht_lookup() could also be wrapped around rhashtable mutex to
explicitly synchronize a walk from resizing, but I think that defeats
the actual test case which intended to have explicit test steps,
i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
verification after each stage.

Reported-by: Josh Hunt <johunt@akamai.com>
Fixes: c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Josh Hunt <johunt@akamai.com>
---
 lib/rhashtable.c      | 2 +-
 lib/test_rhashtable.c | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

Comments

Thomas Graf Feb. 25, 2015, 4:28 p.m. UTC | #1
On 02/25/15 at 04:31pm, Daniel Borkmann wrote:
> diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
> index 58b9953..f9e9d73 100644
> --- a/lib/test_rhashtable.c
> +++ b/lib/test_rhashtable.c
> @@ -202,8 +202,6 @@ static int __init test_rht_init(void)
>  		.key_len = sizeof(int),
>  		.hashfn = jhash,
>  		.nulls_base = (3U << RHT_BASE_SHIFT),
> -		.grow_decision = rht_grow_above_75,
> -		.shrink_decision = rht_shrink_below_30,
>  	};
>  	int err;

I assume you wanted this chunk in patch 2.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Feb. 25, 2015, 4:36 p.m. UTC | #2
On 02/25/2015 05:28 PM, Thomas Graf wrote:
> On 02/25/15 at 04:31pm, Daniel Borkmann wrote:
>> diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
>> index 58b9953..f9e9d73 100644
>> --- a/lib/test_rhashtable.c
>> +++ b/lib/test_rhashtable.c
>> @@ -202,8 +202,6 @@ static int __init test_rht_init(void)
>>   		.key_len = sizeof(int),
>>   		.hashfn = jhash,
>>   		.nulls_base = (3U << RHT_BASE_SHIFT),
>> -		.grow_decision = rht_grow_above_75,
>> -		.shrink_decision = rht_shrink_below_30,
>>   	};
>>   	int err;
>
> I assume you wanted this chunk in patch 2.

No, it's in this chunk on purpose. ;)

I've tried to explain it here in the commit message:

   Given that the test case verifies shrinks/expands manually, we also
   must remove pointer to the helper functions to explicitly avoid
   parallel resizing on insertions/deletions. test_bucket_stats() and
   test_rht_lookup() could also be wrapped around rhashtable mutex to
   explicitly synchronize a walk from resizing, but I think that defeats
   the actual test case which intended to have explicit test steps,
   i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
   verification after each stage.

If we leave it as is, the test case may fail due to a resize run
competing in parallel with the walk (as it's not using Herbert's
API) - I assume the purpose of the test case was to test expands
and shrinks manually in stages and walk/verify the table after
each step.

Cheers,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Feb. 25, 2015, 4:44 p.m. UTC | #3
On 02/25/2015 05:36 PM, Daniel Borkmann wrote:
> On 02/25/2015 05:28 PM, Thomas Graf wrote:
...
>> I assume you wanted this chunk in patch 2.
>
> No, it's in this chunk on purpose. ;)
>
> I've tried to explain it here in the commit message:
>
>    Given that the test case verifies shrinks/expands manually, we also
>    must remove pointer to the helper functions to explicitly avoid
>    parallel resizing on insertions/deletions. test_bucket_stats() and
>    test_rht_lookup() could also be wrapped around rhashtable mutex to
>    explicitly synchronize a walk from resizing, but I think that defeats
>    the actual test case which intended to have explicit test steps,
>    i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
>    verification after each stage.

Note, the reason of this exercise is, because the semantics of
max_shift == 0 have changed. A resize on inserts _before_ this
change was impossible, now it becomes possible, so we have to
disallow resizes (only) for the test case module specifically.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Graf Feb. 25, 2015, 5:09 p.m. UTC | #4
On 02/25/15 at 05:44pm, Daniel Borkmann wrote:
> Note, the reason of this exercise is, because the semantics of
> max_shift == 0 have changed. A resize on inserts _before_ this
> change was impossible, now it becomes possible, so we have to
> disallow resizes (only) for the test case module specifically.

OK, I see what you are doing here. I will work on an improved
test case with parallel resizes after Herbert's rehashing.

Acked-by: Thomas Graf <tgraf@suug.ch>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index e3a04e4..bcf119b 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -251,7 +251,7 @@  bool rht_grow_above_75(const struct rhashtable *ht, size_t new_size)
 {
 	/* Expand table when exceeding 75% load */
 	return atomic_read(&ht->nelems) > (new_size / 4 * 3) &&
-	       (ht->p.max_shift && atomic_read(&ht->shift) < ht->p.max_shift);
+	       (!ht->p.max_shift || atomic_read(&ht->shift) < ht->p.max_shift);
 }
 EXPORT_SYMBOL_GPL(rht_grow_above_75);
 
diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 58b9953..f9e9d73 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -202,8 +202,6 @@  static int __init test_rht_init(void)
 		.key_len = sizeof(int),
 		.hashfn = jhash,
 		.nulls_base = (3U << RHT_BASE_SHIFT),
-		.grow_decision = rht_grow_above_75,
-		.shrink_decision = rht_shrink_below_30,
 	};
 	int err;