[V2,nf,2/3] netfilter: nft_set_rbtree: fix panic when destroying set by GC

Message ID 20180710142201.18638-1-ap420073@gmail.com
State Accepted
Delegated to: Pablo Neira
Headers show
Series
  • netfilter: nf_tables: fix set destroying bugs
Related show

Commit Message

Taehee Yoo July 10, 2018, 2:22 p.m.
This patch fixes below.
1. check null pointer of rb_next.
 rb_next can return null. so null check routine should be added.
2. add rcu_barrier in destroy routine.
 GC uses call_rcu to remove elements. but all elements should be
 removed before destroying set and chains. so that rcu_barrier is added.

test script:
   %cat test.nft
   table inet aa {
	   map map1 {
		   type ipv4_addr : verdict; flags interval, timeout;
		   elements = {
			   0-1 : jump a0,
			   3-4 : jump a0,
			   6-7 : jump a0,
			   9-10 : jump a0,
			   12-13 : jump a0,
			   15-16 : jump a0,
			   18-19 : jump a0,
			   21-22 : jump a0,
			   24-25 : jump a0,
			   27-28 : jump a0,
		   }
		   timeout 1s;
	   }
	   chain a0 {
	   }
   }
   flush ruleset
   table inet aa {
	   map map1 {
		   type ipv4_addr : verdict; flags interval, timeout;
		   elements = {
			   0-1 : jump a0,
			   3-4 : jump a0,
			   6-7 : jump a0,
			   9-10 : jump a0,
			   12-13 : jump a0,
			   15-16 : jump a0,
			   18-19 : jump a0,
			   21-22 : jump a0,
			   24-25 : jump a0,
			   27-28 : jump a0,
		   }
		   timeout 1s;
	   }
	   chain a0 {
	   }
   }
   flush ruleset

splat looks like:
[ 2402.419838] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 2402.428433] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 2402.429343] CPU: 1 PID: 1350 Comm: kworker/1:1 Not tainted 4.18.0-rc2+ #1
[ 2402.429343] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 03/23/2017
[ 2402.429343] Workqueue: events_power_efficient nft_rbtree_gc [nft_set_rbtree]
[ 2402.429343] RIP: 0010:rb_next+0x1e/0x130
[ 2402.429343] Code: e9 de f2 ff ff 0f 1f 80 00 00 00 00 41 55 48 89 fa 41 54 55 53 48 c1 ea 03 48 b8 00 00 00 0
[ 2402.429343] RSP: 0018:ffff880105f77678 EFLAGS: 00010296
[ 2402.429343] RAX: dffffc0000000000 RBX: ffff8801143e3428 RCX: 1ffff1002287c69c
[ 2402.429343] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000
[ 2402.429343] RBP: 0000000000000000 R08: ffffed0016aabc24 R09: ffffed0016aabc24
[ 2402.429343] R10: 0000000000000001 R11: ffffed0016aabc23 R12: 0000000000000000
[ 2402.429343] R13: ffff8800b6933388 R14: dffffc0000000000 R15: ffff8801143e3440
[ 2402.534486] kasan: CONFIG_KASAN_INLINE enabled
[ 2402.534212] FS:  0000000000000000(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000
[ 2402.534212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2402.534212] CR2: 0000000000863008 CR3: 00000000a3c16000 CR4: 00000000001006e0
[ 2402.534212] Call Trace:
[ 2402.534212]  nft_rbtree_gc+0x2b5/0x5f0 [nft_set_rbtree]
[ 2402.534212]  process_one_work+0xc1b/0x1ee0
[ 2402.540329] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 2402.534212]  ? _raw_spin_unlock_irq+0x29/0x40
[ 2402.534212]  ? pwq_dec_nr_in_flight+0x3e0/0x3e0
[ 2402.534212]  ? set_load_weight+0x270/0x270
[ 2402.534212]  ? __schedule+0x6ea/0x1fb0
[ 2402.534212]  ? __sched_text_start+0x8/0x8
[ 2402.534212]  ? save_trace+0x320/0x320
[ 2402.534212]  ? sched_clock_local+0xe2/0x150
[ 2402.534212]  ? find_held_lock+0x39/0x1c0
[ 2402.534212]  ? worker_thread+0x35f/0x1150
[ 2402.534212]  ? lock_contended+0xe90/0xe90
[ 2402.534212]  ? __lock_acquire+0x4520/0x4520
[ 2402.534212]  ? do_raw_spin_unlock+0xb1/0x350
[ 2402.534212]  ? do_raw_spin_trylock+0x111/0x1b0
[ 2402.534212]  ? do_raw_spin_lock+0x1f0/0x1f0
[ 2402.534212]  worker_thread+0x169/0x1150

V2:
 - Do not add interval check routine in nft_set_rbtree.
  - Requested by Pablo Neira Ayuso

Fixes: 8d8540c4f5e0("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
 net/netfilter/nft_set_rbtree.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Pablo Neira Ayuso July 16, 2018, 4:09 p.m. | #1
Hi Taehee,

On Tue, Jul 10, 2018 at 11:22:01PM +0900, Taehee Yoo wrote:
> This patch fixes below.
> 1. check null pointer of rb_next.
>  rb_next can return null. so null check routine should be added.
> 2. add rcu_barrier in destroy routine.
>  GC uses call_rcu to remove elements. but all elements should be
>  removed before destroying set and chains. so that rcu_barrier is added.
[...]
> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> index 1f8f257..09e3a15 100644
> --- a/net/netfilter/nft_set_rbtree.c
> +++ b/net/netfilter/nft_set_rbtree.c
> @@ -381,7 +381,7 @@ static void nft_rbtree_gc(struct work_struct *work)
>  
>  		gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
>  		if (!gcb)
> -			goto out;
> +			break;
>  
>  		atomic_dec(&set->nelems);
>  		nft_set_gc_batch_add(gcb, rbe);
> @@ -390,10 +390,12 @@ static void nft_rbtree_gc(struct work_struct *work)
>  			rbe = rb_entry(prev, struct nft_rbtree_elem, node);
>  			atomic_dec(&set->nelems);
>  			nft_set_gc_batch_add(gcb, rbe);
> +			prev = NULL;
>  		}
>  		node = rb_next(node);
> +		if (!node)
> +			break;
>  	}
> -out:
>  	if (gcb) {
>  		for (i = 0; i < gcb->head.cnt; i++) {
>  			rbe = gcb->elems[i];
> @@ -440,9 +442,10 @@ static void nft_rbtree_destroy(const struct nft_set *set)
>  	struct rb_node *node;
>  
>  	cancel_delayed_work_sync(&priv->gc_work);
> +	rcu_barrier();
>  	while ((node = priv->root.rb_node) != NULL) {
> -		rb_erase(node, &priv->root);
>  		rbe = rb_entry(node, struct nft_rbtree_elem, node);
> +		rb_erase(node, &priv->root);

Just to clarify: Do we have to reorder these lines? I mean, place
rb_erase() after rb_entry(). Just asking because this is not described
in the patch description - I just came from long trip, I'm tired so I
may be overlooking anything. No need to resend in any case.

Let me know, thanks!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Taehee Yoo July 16, 2018, 5:34 p.m. | #2
2018-07-17 1:09 GMT+09:00 Pablo Neira Ayuso <pablo@netfilter.org>:
> Hi Taehee,
>
> On Tue, Jul 10, 2018 at 11:22:01PM +0900, Taehee Yoo wrote:
>> This patch fixes below.
>> 1. check null pointer of rb_next.
>>  rb_next can return null. so null check routine should be added.
>> 2. add rcu_barrier in destroy routine.
>>  GC uses call_rcu to remove elements. but all elements should be
>>  removed before destroying set and chains. so that rcu_barrier is added.
> [...]
>> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
>> index 1f8f257..09e3a15 100644
>> --- a/net/netfilter/nft_set_rbtree.c
>> +++ b/net/netfilter/nft_set_rbtree.c
>> @@ -381,7 +381,7 @@ static void nft_rbtree_gc(struct work_struct *work)
>>
>>               gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
>>               if (!gcb)
>> -                     goto out;
>> +                     break;
>>
>>               atomic_dec(&set->nelems);
>>               nft_set_gc_batch_add(gcb, rbe);
>> @@ -390,10 +390,12 @@ static void nft_rbtree_gc(struct work_struct *work)
>>                       rbe = rb_entry(prev, struct nft_rbtree_elem, node);
>>                       atomic_dec(&set->nelems);
>>                       nft_set_gc_batch_add(gcb, rbe);
>> +                     prev = NULL;
>>               }
>>               node = rb_next(node);
>> +             if (!node)
>> +                     break;
>>       }
>> -out:
>>       if (gcb) {
>>               for (i = 0; i < gcb->head.cnt; i++) {
>>                       rbe = gcb->elems[i];
>> @@ -440,9 +442,10 @@ static void nft_rbtree_destroy(const struct nft_set *set)
>>       struct rb_node *node;
>>
>>       cancel_delayed_work_sync(&priv->gc_work);
>> +     rcu_barrier();
>>       while ((node = priv->root.rb_node) != NULL) {
>> -             rb_erase(node, &priv->root);
>>               rbe = rb_entry(node, struct nft_rbtree_elem, node);
>> +             rb_erase(node, &priv->root);
>
> Just to clarify: Do we have to reorder these lines? I mean, place
> rb_erase() after rb_entry(). Just asking because this is not described
> in the patch description - I just came from long trip, I'm tired so I
> may be overlooking anything. No need to resend in any case.
>
> Let me know, thanks!

Hi Pablo,

It doesn't change actural logic and doesn't fix problem.
I prefer that setting variables on top of while statement.
so I added it.
I'm so sorry to make you confused.

Thanks for reviewing!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso July 17, 2018, 11:09 a.m. | #3
On Tue, Jul 17, 2018 at 02:34:13AM +0900, Taehee Yoo wrote:
> 2018-07-17 1:09 GMT+09:00 Pablo Neira Ayuso <pablo@netfilter.org>:
[...]
> > Just to clarify: Do we have to reorder these lines? I mean, place
> > rb_erase() after rb_entry(). Just asking because this is not described
> > in the patch description - I just came from long trip, I'm tired so I
> > may be overlooking anything. No need to resend in any case.
[...]
> 
> It doesn't change actural logic and doesn't fix problem.
> I prefer that setting variables on top of while statement.
> so I added it. I'm so sorry to make you confused.

OK, so it's just a cleanup, thanks for clarifying.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 1f8f257..09e3a15 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -381,7 +381,7 @@  static void nft_rbtree_gc(struct work_struct *work)
 
 		gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
 		if (!gcb)
-			goto out;
+			break;
 
 		atomic_dec(&set->nelems);
 		nft_set_gc_batch_add(gcb, rbe);
@@ -390,10 +390,12 @@  static void nft_rbtree_gc(struct work_struct *work)
 			rbe = rb_entry(prev, struct nft_rbtree_elem, node);
 			atomic_dec(&set->nelems);
 			nft_set_gc_batch_add(gcb, rbe);
+			prev = NULL;
 		}
 		node = rb_next(node);
+		if (!node)
+			break;
 	}
-out:
 	if (gcb) {
 		for (i = 0; i < gcb->head.cnt; i++) {
 			rbe = gcb->elems[i];
@@ -440,9 +442,10 @@  static void nft_rbtree_destroy(const struct nft_set *set)
 	struct rb_node *node;
 
 	cancel_delayed_work_sync(&priv->gc_work);
+	rcu_barrier();
 	while ((node = priv->root.rb_node) != NULL) {
-		rb_erase(node, &priv->root);
 		rbe = rb_entry(node, struct nft_rbtree_elem, node);
+		rb_erase(node, &priv->root);
 		nft_set_elem_destroy(set, rbe, true);
 	}
 }