Patchwork Oops with latest (netfilter) nf-next tree, when unloading iptable_nat

login
register
mail settings
Submitter Patrick McHardy
Date Sept. 14, 2012, 1:15 p.m.
Message ID <Pine.GSO.4.63.1209141511020.11622@stinky-local.trash.net>
Download mbox | patch
Permalink /patch/183922/
State Not Applicable
Headers show

Comments

Patrick McHardy - Sept. 14, 2012, 1:15 p.m.
On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:

> On Wed, Sep 12, 2012 at 11:36:27PM +0200, Florian Westphal wrote:
>> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>>
>> [ CC'd Patrick ]
>>
>>> I'm hitting this general protection fault, when unloading iptables_nat.
>>> [  524.591067] Pid: 5842, comm: modprobe Not tainted 3.6.0-rc3-pablo-nf-next+ #1 Red Hat KVM
>>> [  524.591067] RIP: 0010:[<ffffffffa002c2fd>]  [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
>>> [  524.591067] RSP: 0018:ffff880073203e18  EFLAGS: 00010246
>>> [  524.591067] RAX: 0000000000000000 RBX: ffff880077dff2c8 RCX: ffff8800797fab70
>>> [  524.591067] RDX: dead000000200200 RSI: ffff880073203e88 RDI: ffffffffa002f208
>>> [  524.591067] RBP: ffff880073203e28 R08: ffff880073202000 R09: 0000000000000000
>>> [  524.591067] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
>>>  list corruption?   ^^^^^^^^^^^^^^^^      ^^^^^^^^^^^^^^^^
>>
>> Yep, looks like it.
>>
>>> [  524.591067]  [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
>>> [  524.591067]  [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
>>> [  524.591067]  [<ffffffffa002c54a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
>>> [  524.591067]  [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
>>> [  524.591067]  [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
>>
>> On module removal nf_nat_ipv4 calls nf_iterate_cleanup which invokes
>> nf_nat_proto_clean() for each conntrack.  That will then call
>> hlist_del_rcu(&nat->bysource) using eachs conntracks nat ext area.
>>
>> Problem is that nf_nat_proto_clean() is called multiple times for the same
>> conntrack:
>> a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
>> b) we call it both for l3 and for l4 protocol ids
>>
>> We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
>>
>> This was introduced with the ipv6 nat patches.
>>
>> --- a/net/netfilter/nf_nat_core.c
>> +++ b/net/netfilter/nf_nat_core.c
>> @@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>>
>>         if (clean->hash) {
>>                 spin_lock_bh(&nf_nat_lock);
>> -               hlist_del_rcu(&nat->bysource);
>> +               hlist_del_init_rcu(&nat->bysource);
>>                 spin_unlock_bh(&nf_nat_lock);
>>         } else {
>>
>> Would probably avoid it.  I guess it would be nicer to only call this
>> once for each ct.
>>
>> Patrick, any other idea?
>
> I already discussed this with Florian (I've been having problems with
> two out of three of my email accounts this week... so I couldn't reply
> to this email in the mailing list).
>
> We can add nf_nat_iterate_cleanup that can iterate over the NAT
> hashtable to replace current usage of nf_ct_iterate_cleanup.

Lets just bail out when IPS_SRC_NAT_DONE is not set, that should also fix
it. Could you try this patch please?
Jesper Dangaard Brouer - Sept. 19, 2012, 12:46 p.m.
On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
> 
[...cut...]
> >> Patrick, any other idea?
> >
[...cut...]
> > >
> > We can add nf_nat_iterate_cleanup that can iterate over the NAT
> > hashtable to replace current usage of nf_ct_iterate_cleanup.
> 
> Lets just bail out when IPS_SRC_NAT_DONE is not set, that should also fix
> it. Could you try this patch please?

On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index 29d4452..8b5d220 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i,
void *data)
>  
>         if (!nat)
>                 return 0;
> +       if (!(i->status & IPS_SRC_NAT_DONE))
> +               return 0;
>         if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
>             (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
>                 return 0;
> 

No it does not work :-(


[ 1216.310146] general protection fault: 0000 [#1] SMP 
[ 1216.311046] Modules linked in: netconsole ip_vs_lblc ip_vs_lc ip_vs_rr ip_vs libcrc32c ipt_MASQUERADE nf_nat_ipv4(-) nf_nat iptable_mangle xt_mark ip6table_mangle xt_LOG ip6table_filter ip6_tables virtio_balloon virtio_net [last unloaded: iptable_nat]
[ 1216.311046] CPU 1 
[ 1216.311046] Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
[ 1216.311046] RIP: 0010:[<ffffffffa002c303>]  [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046] RSP: 0018:ffff88007808fe18  EFLAGS: 00010246
[ 1216.311046] RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
[ 1216.311046] RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
[ 1216.311046] RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
[ 1216.311046] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
[ 1216.311046] R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
[ 1216.311046] FS:  00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
[ 1216.311046] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1216.311046] CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
[ 1216.311046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1216.311046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1216.311046] Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
[ 1216.311046] Stack:
[ 1216.311046]  ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
[ 1216.311046]  ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
[ 1216.311046]  ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
[ 1216.311046] Call Trace:
[ 1216.311046]  [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[ 1216.311046]  [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
[ 1216.311046]  [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[ 1216.311046]  [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
[ 1216.311046]  [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
[ 1216.311046]  [<ffffffff8109f4a5>] sys_delete_module+0x235/0x2b0
[ 1216.311046]  [<ffffffff810b8193>] ? __audit_syscall_entry+0x1b3/0x1f0
[ 1216.311046]  [<ffffffff810b8776>] ? __audit_syscall_exit+0x3e6/0x410
[ 1216.311046]  [<ffffffff816679e2>] system_call_fastpath+0x16/0x1b
[ 1216.311046] Code: 75 6e 0f b6 46 01 84 c0 74 05 3a 42 3e 75 61 80 7e 02 00 74 43 48 c7 c7 08 f2 02 a0 e8 37 3b 63 e1 48 8b 03 48 8b 53 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 be 00 02 20 00 00 00 ad de 48 c7 
[ 1216.311046] RIP  [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046]  RSP <ffff88007808fe18>

Patch

diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c

index 29d4452..8b5d220 100644

--- a/net/netfilter/nf_nat_core.c

+++ b/net/netfilter/nf_nat_core.c

@@ -481,6 +481,8 @@  static int nf_nat_proto_clean(struct nf_conn *i, void *data)

 
 	if (!nat)
 		return 0;
+	if (!(i->status & IPS_SRC_NAT_DONE))

+		return 0;

 	if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
 	    (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
 		return 0;