[nf] Revert "netfilter: unlock xt_table earlier in __do_replace"

Message ID 1579740455-17249-1-git-send-email-stranche@codeaurora.org
State Under Review
Delegated to: Pablo Neira

Commit Message

Sean Tranchetti Jan. 23, 2020, 12:47 a.m. UTC
A recently reported crash in the x_tables framework seems to stem from
a potential race condition between adding rules to a table and having a
packet traversing the table at the same time.

In the crash, the jumpstack being used by the table traversal was freed
by the table replace code. After performing some bisection, it seems that
commit f31e5f1a891f ("netfilter: unlock xt_table earlier in __do_replace")
exposed this race condition by unlocking the table before the
get_old_counters() routine was called to perform the synchronization.

Call Stack:
	Unable to handle kernel paging request at virtual address
	006b6b6b6b6b6bc5

	pc : ipt_do_table+0x3b8/0x660
	lr : ipt_do_table+0x31c/0x660
	Call trace:
	ipt_do_table+0x3b8/0x660
	iptable_mangle_hook+0x58/0xf8
	nf_hook_slow+0x48/0xd8
	__ip_local_out+0xf4/0x138
	__ip_queue_xmit+0x348/0x3a0
	ip_queue_xmit+0x10/0x18

Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
---
 net/ipv4/netfilter/arp_tables.c | 3 +--
 net/ipv4/netfilter/ip_tables.c  | 3 +--
 net/ipv6/netfilter/ip6_tables.c | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)

Comments

Florian Westphal Jan. 23, 2020, 10:29 a.m. UTC | #1
Sean Tranchetti <stranche@codeaurora.org> wrote:

[ CC Xin Long ]

> A recently reported crash in the x_tables framework seems to stem from
> a potential race condition between adding rules to a table and having a
> packet traversing the table at the same time.
> 
> In the crash, the jumpstack being used by the table traversal was freed
> by the table replace code. After performing some bisection, it seems that
> commit f31e5f1a891f ("netfilter: unlock xt_table earlier in __do_replace")
> exposed this race condition by unlocking the table before the
> get_old_counters() routine was called to perform the synchronization.

But the packet path doesn't grab the table mutex.

> Call Stack:
> 	Unable to handle kernel paging request at virtual address
> 	006b6b6b6b6b6bc5
> 
> 	pc : ipt_do_table+0x3b8/0x660
> 	lr : ipt_do_table+0x31c/0x660
> 	Call trace:
> 	ipt_do_table+0x3b8/0x660
> 	iptable_mangle_hook+0x58/0xf8
> 	nf_hook_slow+0x48/0xd8
> 	__ip_local_out+0xf4/0x138
> 	__ip_queue_xmit+0x348/0x3a0
> 	ip_queue_xmit+0x10/0x18
> 
> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
> ---
> @@ -921,8 +921,6 @@ static int __do_replace(struct net *net, const char *name,
>  	    (newinfo->number <= oldinfo->initial_entries))
>  		module_put(t->me);
>  
> -	xt_table_unlock(t);
> -
>  	get_old_counters(oldinfo, counters);
>  
>  	/* Decrease module usage counts and free resource */
> @@ -937,6 +935,7 @@ static int __do_replace(struct net *net, const char *name,
>  		net_warn_ratelimited("arptables: counters copy to user failed while replacing table\n");
>  	}
>  	vfree(counters);
> +	xt_table_unlock(t);

I don't see how this changes anything wrt. packet path.
This disallows another instance of iptables(-restore) to come in
before the counters have been copied/freed and the destructors have run.

But as those have nothing to do with the jumpstack I don't see how this
helps.

Sean Tranchetti Jan. 23, 2020, 8:08 p.m. UTC | #2
On 2020-01-23 03:29, Florian Westphal wrote:
> 
> I don't see how this changes anything wrt. packet path.
> This disallows another instance of iptables(-restore) to come in
> before the counters have been copied/freed and the destructors have 
> run.
> 
> But as those have nothing to do with the jumpstack I don't see how this
> helps.

Based on the stack of the iptables-restore task that freed the 
jumpstack being accessed in the ipt_do_table() routine, we end up in 
__do_replace():
       0xFFFFFF9239243AE0, ->kvfree
       0xFFFFFF923A1969EC, ->xt_free_table_info+0x50
       0xFFFFFF923A2100E0, ->__do_replace+0x200

Prior to commit f31e5f1a891f, this xt_free_table_info() call ran under 
the table lock, so it seems that keeping the call under the lock 
guarantees that the new table->private entry that contains the 
jumpstack is seen across all CPUs.

> 
> But the packet path doesn't grab the table mutex.
> 

Good point. Perhaps moving the unlock helps because it prevents 
multiple writers from stepping on one another in such a way that the 
private entry is left in a bad state. Or this whole thing is a red 
herring, and the real problem is that xt_replace_table() can return 
prematurely, before all CPUs are finished with the old jumpstack, so 
the old table info is freed while it is still in use.

Xin Long Feb. 3, 2020, 10:51 a.m. UTC | #3
On Thu, Jan 23, 2020 at 6:29 PM Florian Westphal <fw@strlen.de> wrote:
>
> Sean Tranchetti <stranche@codeaurora.org> wrote:
>
> [ CC Xin Long ]
>
> > A recently reported crash in the x_tables framework seems to stem from
> > a potential race condition between adding rules to a table and having a
> > packet traversing the table at the same time.
> >
> > In the crash, the jumpstack being used by the table traversal was freed
> > by the table replace code. After performing some bisection, it seems that
> > commit f31e5f1a891f ("netfilter: unlock xt_table earlier in __do_replace")
> > exposed this race condition by unlocking the table before the
> > get_old_counters() routine was called to perform the synchronization.
>
> But the packet path doesn't grab the table mutex.
>
> > Call Stack:
> >       Unable to handle kernel paging request at virtual address
> >       006b6b6b6b6b6bc5
> >
> >       pc : ipt_do_table+0x3b8/0x660
> >       lr : ipt_do_table+0x31c/0x660
> >       Call trace:
> >       ipt_do_table+0x3b8/0x660
> >       iptable_mangle_hook+0x58/0xf8
> >       nf_hook_slow+0x48/0xd8
> >       __ip_local_out+0xf4/0x138
> >       __ip_queue_xmit+0x348/0x3a0
> >       ip_queue_xmit+0x10/0x18
I don't see how this happens either.

Hi Sean,
do you have a script to reproduce this issue?

Thanks.
> >
> > Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
> > ---
> > @@ -921,8 +921,6 @@ static int __do_replace(struct net *net, const char *name,
> >           (newinfo->number <= oldinfo->initial_entries))
> >               module_put(t->me);
> >
> > -     xt_table_unlock(t);
> > -
> >       get_old_counters(oldinfo, counters);
> >
> >       /* Decrease module usage counts and free resource */
> > @@ -937,6 +935,7 @@ static int __do_replace(struct net *net, const char *name,
> >               net_warn_ratelimited("arptables: counters copy to user failed while replacing table\n");
> >       }
> >       vfree(counters);
> > +     xt_table_unlock(t);
>
> I don't see how this changes anything wrt. packet path.
> This disallows another instance of iptables(-restore) to come in
> before the counters have been copied/freed and the destructors have run.
>
> But as those have nothing to do with the jumpstack I don't see how this
> helps.

Patch

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index f1f78a7..85cb189 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -921,8 +921,6 @@  static int __do_replace(struct net *net, const char *name,
 	    (newinfo->number <= oldinfo->initial_entries))
 		module_put(t->me);
 
-	xt_table_unlock(t);
-
 	get_old_counters(oldinfo, counters);
 
 	/* Decrease module usage counts and free resource */
@@ -937,6 +935,7 @@  static int __do_replace(struct net *net, const char *name,
 		net_warn_ratelimited("arptables: counters copy to user failed while replacing table\n");
 	}
 	vfree(counters);
+	xt_table_unlock(t);
 	return ret;
 
  put_module:
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 10b91eb..9f98bc5 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1076,8 +1076,6 @@  static int get_info(struct net *net, void __user *user,
 	    (newinfo->number <= oldinfo->initial_entries))
 		module_put(t->me);
 
-	xt_table_unlock(t);
-
 	get_old_counters(oldinfo, counters);
 
 	/* Decrease module usage counts and free resource */
@@ -1091,6 +1089,7 @@  static int get_info(struct net *net, void __user *user,
 		net_warn_ratelimited("iptables: counters copy to user failed while replacing table\n");
 	}
 	vfree(counters);
+	xt_table_unlock(t);
 	return ret;
 
  put_module:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index c973ace..f2637bfb 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1093,8 +1093,6 @@  static int get_info(struct net *net, void __user *user,
 	    (newinfo->number <= oldinfo->initial_entries))
 		module_put(t->me);
 
-	xt_table_unlock(t);
-
 	get_old_counters(oldinfo, counters);
 
 	/* Decrease module usage counts and free resource */
@@ -1108,6 +1106,7 @@  static int get_info(struct net *net, void __user *user,
 		net_warn_ratelimited("ip6tables: counters copy to user failed while replacing table\n");
 	}
 	vfree(counters);
+	xt_table_unlock(t);
 	return ret;
 
  put_module: