[v2] rhashtable: add restart routine in rhashtable_free_and_destroy()

Message ID 20180708025551.25879-1-ap420073@gmail.com
State Accepted
Delegated to: David Miller

Commit Message

Taehee Yoo July 8, 2018, 2:55 a.m.
rhashtable_free_and_destroy() cancels the deferred rehash work and
then walks and destroys the elements. At that point some elements may
still be in future_tbl, and those elements are not destroyed.

test case:
nft_rhash_destroy() calls rhashtable_free_and_destroy() to destroy
all elements of a set before the sets and chains are destroyed.
But rhashtable_free_and_destroy() doesn't destroy the elements in
future_tbl, so the splat below occurs.

test script:
   %cat test.nft
   table ip aa {
	   map map1 {
		   type ipv4_addr : verdict;
		   elements = {
			   0 : jump a0,
			   1 : jump a0,
			   2 : jump a0,
			   3 : jump a0,
			   4 : jump a0,
			   5 : jump a0,
			   6 : jump a0,
			   7 : jump a0,
			   8 : jump a0,
			   9 : jump a0,
		   }
	   }
	   chain a0 {
	   }
   }
   flush ruleset
   table ip aa {
	   map map1 {
		   type ipv4_addr : verdict;
		   elements = {
			   0 : jump a0,
			   1 : jump a0,
			   2 : jump a0,
			   3 : jump a0,
			   4 : jump a0,
			   5 : jump a0,
			   6 : jump a0,
			   7 : jump a0,
			   8 : jump a0,
			   9 : jump a0,
		   }
	   }
	   chain a0 {
	   }
   }
   flush ruleset

   %while :; do nft -f test.nft; done

Splat looks like:
[  200.795603] kernel BUG at net/netfilter/nf_tables_api.c:1363!
[  200.806944] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  200.812253] CPU: 1 PID: 1582 Comm: nft Not tainted 4.17.0+ #24
[  200.820297] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[  200.830309] RIP: 0010:nf_tables_chain_destroy.isra.34+0x62/0x240 [nf_tables]
[  200.838317] Code: 43 50 85 c0 74 26 48 8b 45 00 48 8b 4d 08 ba 54 05 00 00 48 c7 c6 60 6d 29 c0 48 c7 c7 c0 65 29 c0 4c 8b 40 08 e8 58 e5 fd f8 <0f> 0b 48 89 da 48 b8 00 00 00 00 00 fc ff
[  200.860366] RSP: 0000:ffff880118dbf4d0 EFLAGS: 00010282
[  200.866354] RAX: 0000000000000061 RBX: ffff88010cdeaf08 RCX: 0000000000000000
[  200.874355] RDX: 0000000000000061 RSI: 0000000000000008 RDI: ffffed00231b7e90
[  200.882361] RBP: ffff880118dbf4e8 R08: ffffed002373bcfb R09: ffffed002373bcfa
[  200.890354] R10: 0000000000000000 R11: ffffed002373bcfb R12: dead000000000200
[  200.898356] R13: dead000000000100 R14: ffffffffbb62af38 R15: dffffc0000000000
[  200.906354] FS:  00007fefc31fd700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
[  200.915533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  200.922355] CR2: 0000557f1c8e9128 CR3: 0000000106880000 CR4: 00000000001006e0
[  200.930353] Call Trace:
[  200.932351]  ? nf_tables_commit+0x26f6/0x2c60 [nf_tables]
[  200.939525]  ? nf_tables_setelem_notify.constprop.49+0x1a0/0x1a0 [nf_tables]
[  200.947525]  ? nf_tables_delchain+0x6e0/0x6e0 [nf_tables]
[  200.952383]  ? nft_add_set_elem+0x1700/0x1700 [nf_tables]
[  200.959532]  ? nla_parse+0xab/0x230
[  200.963529]  ? nfnetlink_rcv_batch+0xd06/0x10d0 [nfnetlink]
[  200.968384]  ? nfnetlink_net_init+0x130/0x130 [nfnetlink]
[  200.975525]  ? debug_show_all_locks+0x290/0x290
[  200.980363]  ? debug_show_all_locks+0x290/0x290
[  200.986356]  ? sched_clock_cpu+0x132/0x170
[  200.990352]  ? find_held_lock+0x39/0x1b0
[  200.994355]  ? sched_clock_local+0x10d/0x130
[  200.999531]  ? memset+0x1f/0x40

V2:
 - free all tables, as requested by Herbert Xu

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
 lib/rhashtable.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

David Miller July 8, 2018, 4:09 a.m. | #1
From: Taehee Yoo <ap420073@gmail.com>
Date: Sun,  8 Jul 2018 11:55:51 +0900

> @@ -1143,13 +1143,14 @@ void rhashtable_free_and_destroy(struct rhashtable *ht,
>  				 void (*free_fn)(void *ptr, void *arg),
>  				 void *arg)
>  {
> -	struct bucket_table *tbl;
> +	struct bucket_table *tbl, *next_tbl;
>  	unsigned int i;
 ...
>  	tbl = rht_dereference(ht->tbl, ht);
> +restart:
>  	if (free_fn) {
 ...
> @@ -1166,7 +1167,12 @@ void rhashtable_free_and_destroy(struct rhashtable *ht,
>  		}
>  	}
>  
> +	next_tbl = rht_dereference(tbl->future_tbl, ht);
>  	bucket_table_free(tbl);
> +	if (next_tbl) {
> +		tbl = next_tbl;
> +		goto restart;
> +	}

This looks good to me. Herbert, please review.

Herbert Xu July 8, 2018, 4:11 p.m. | #2
On Sun, Jul 08, 2018 at 11:55:51AM +0900, Taehee Yoo wrote:
> rhashtable_free_and_destroy() cancels re-hash deferred work
> then walks and destroys elements. at this moment, some elements can be
> still in future_tbl. that elements are not destroyed.
> 
> test case:
> nft_rhash_destroy() calls rhashtable_free_and_destroy() to destroy
> all elements of sets before destroying sets and chains.
> But rhashtable_free_and_destroy() doesn't destroy elements of future_tbl.
> so that splat occurred.
> 
> test script:
 ...
> Splat looks like:
 ...
> V2:
>  - free all tables requested by Herbert Xu
> 
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
David Miller July 9, 2018, 11:29 p.m. | #3
From: Taehee Yoo <ap420073@gmail.com>
Date: Sun,  8 Jul 2018 11:55:51 +0900

> rhashtable_free_and_destroy() cancels re-hash deferred work
> then walks and destroys elements. at this moment, some elements can be
> still in future_tbl. that elements are not destroyed.
> 
> test case:
> nft_rhash_destroy() calls rhashtable_free_and_destroy() to destroy
> all elements of sets before destroying sets and chains.
> But rhashtable_free_and_destroy() doesn't destroy elements of future_tbl.
> so that splat occurred.
> 
> test script:
 ...
> Splat looks like:
 ...
> V2:
>  - free all tables requested by Herbert Xu
> 
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>

Applied and queued up for -stable.

Patch

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 9427b57..fa016b2 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1143,13 +1143,14 @@  void rhashtable_free_and_destroy(struct rhashtable *ht,
 				 void (*free_fn)(void *ptr, void *arg),
 				 void *arg)
 {
-	struct bucket_table *tbl;
+	struct bucket_table *tbl, *next_tbl;
 	unsigned int i;
 
 	cancel_work_sync(&ht->run_work);
 
 	mutex_lock(&ht->mutex);
 	tbl = rht_dereference(ht->tbl, ht);
+restart:
 	if (free_fn) {
 		for (i = 0; i < tbl->size; i++) {
 			struct rhash_head *pos, *next;
@@ -1166,7 +1167,12 @@  void rhashtable_free_and_destroy(struct rhashtable *ht,
 		}
 	}
 
+	next_tbl = rht_dereference(tbl->future_tbl, ht);
 	bucket_table_free(tbl);
+	if (next_tbl) {
+		tbl = next_tbl;
+		goto restart;
+	}
 	mutex_unlock(&ht->mutex);
 }
 EXPORT_SYMBOL_GPL(rhashtable_free_and_destroy);