Message ID | 20190124030841.n4jtsqka5zji3e62@gondor.apana.org.au |
---|---|
State | RFC |
Delegated to: | David Miller |
Series | [v2] rhashtable: Still do rehash when we get EEXIST |
On Jan 23, 2019, at 7:08 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Thanks for catching this!
>
> Although I think we should fix this in a different way. The problem
> here is that the shrink cannot proceed because there was a previous
> rehash that is still incomplete. We should wait for its completion
> and then reattempt a shrink should it still be necessary.
>
> So something like this:

SGTM. I can't test this right now because our VM server's down after a
power outage this evening, but I tried a similar patch that swallowed
the -EEXIST err, and even with that oversight the hashtable dodged the
reschedule loop.

- Josh
On Jan 23, 2019, at 7:40 PM, Josh Elsasser <jelsasser@appneta.com> wrote:

> On Jan 23, 2019, at 7:08 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
>> Thanks for catching this!
>>
>> Although I think we should fix this in a different way. The problem
>> here is that the shrink cannot proceed because there was a previous
>> rehash that is still incomplete. We should wait for its completion
>> and then reattempt a shrink should it still be necessary.
>
> I can't test this right now because our VM server's down

Got one of the poor little reproducer VMs back up and running and loaded
up this patch. Works like a charm. For the v2 patch, you can add my:

Tested-by: Josh Elsasser <jelsasser@appneta.com>
On Sat, Jan 26, 2019 at 2:03 PM Josh Elsasser <jelsasser@appneta.com> wrote:
>
> On Jan 23, 2019, at 7:40 PM, Josh Elsasser <jelsasser@appneta.com> wrote:
> > On Jan 23, 2019, at 7:08 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> >> Thanks for catching this!
> >>
> >> Although I think we should fix this in a different way. The problem
> >> here is that the shrink cannot proceed because there was a previous
> >> rehash that is still incomplete. We should wait for its completion
> >> and then reattempt a shrink should it still be necessary.
> >
> > I can't test this right now because our VM server's down
>
> Got one of the poor little reproducer VMs back up and running and loaded
> up this patch. Works like a charm. For the v2 PATCH, can add my:
>
> Tested-by: Josh Elsasser <jelsasser@appneta.com>

Trying again... Gmail sent HTML mail the first time.

Herbert,

We're seeing this pretty regularly on 4.14 LTS kernels. I didn't see
your change in any of the regular trees. Are there plans to submit
this? If so, can it get queued up for 4.14 stable too?

Thanks!
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 852ffa5160f1..4edcf3310513 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -416,8 +416,12 @@ static void rht_deferred_worker(struct work_struct *work)
 	else if (tbl->nest)
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size);
 
-	if (!err)
-		err = rhashtable_rehash_table(ht);
+	if (!err || err == -EEXIST) {
+		int nerr;
+
+		nerr = rhashtable_rehash_table(ht);
+		err = err ?: nerr;
+	}
 
 	mutex_unlock(&ht->mutex);
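
For anyone following along, here is a rough userspace sketch of the control flow the patch changes. It is not kernel code: `struct toy_table` and every `toy_*` function are hypothetical stand-ins invented for illustration, and it assumes the worker gets rescheduled whenever it returns a nonzero error, which is the reschedule loop Josh describes. The GNU `err ?: nerr` shorthand is spelled out as `err ? err : nerr` so it builds with any C compiler.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the table state; not the kernel's struct rhashtable. */
struct toy_table {
	bool rehash_pending;	/* a previous rehash has not completed yet */
	int rehash_calls;	/* how many times the rehash step ran */
};

/* Models the resize attempt: refuses with -EEXIST while a rehash is pending. */
static int toy_resize(struct toy_table *t)
{
	return t->rehash_pending ? -EEXIST : 0;
}

/* Models completing the outstanding rehash (assumed to always succeed here). */
static int toy_rehash(struct toy_table *t)
{
	t->rehash_calls++;
	t->rehash_pending = false;
	return 0;
}

/* Pre-patch logic: -EEXIST skips the rehash, so the pending state never clears. */
static int toy_worker_old(struct toy_table *t)
{
	int err = toy_resize(t);

	if (!err)
		err = toy_rehash(t);
	return err;
}

/* Patched logic: still run the rehash on -EEXIST, but keep the first error. */
static int toy_worker_new(struct toy_table *t)
{
	int err = toy_resize(t);

	if (!err || err == -EEXIST) {
		int nerr = toy_rehash(t);

		err = err ? err : nerr;	/* portable form of "err ?: nerr" */
	}
	return err;
}

/*
 * Models rescheduling: rerun the worker until it returns 0.
 * Returns the number of passes needed, or -1 if max_passes was not enough.
 */
static int toy_passes_until_done(int (*worker)(struct toy_table *),
				 struct toy_table *t, int max_passes)
{
	for (int i = 1; i <= max_passes; i++)
		if (worker(t) == 0)
			return i;
	return -1;
}
```

In this model, a table that starts with `rehash_pending` set never settles under `toy_worker_old`, while `toy_worker_new` still reports -EEXIST on the first pass (so a retry is scheduled) but clears the pending rehash, letting the second pass succeed.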