death_by_event() does not check IPS_DYING_BIT - race condition against ctnetlink_del_conntrack

Message ID 9829027.jsWF3AolUh@gentoovm
State Superseded

Commit Message

Oliver Aug. 28, 2012, 5:16 p.m. UTC
Hi Pablo,

On Tuesday 28 August 2012 12:52:35 you wrote:
> It seems we're hitting death_by_event twice, which should not happen.
> 
> Would you give a try to the following patch?
> 
> Thanks.

I tried applying the patch against 3.4.10, but it does not apply cleanly: it
rewrites nf_ct_iterate_cleanup() to take two additional arguments, and the
nf_conntrack_proto.c in your source tree diverges from anything I could find.

So I've taken a different approach: nf_ct_iterate_cleanup() is now a wrapper
around nf_ct_iterate_cleanup_new() that simply passes 0 for the last two
arguments, which avoids having to edit every module that calls it.
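
To illustrate the wrapper approach outside the kernel, here is a minimal
userspace sketch. It is not the patch itself: struct conn, iterate_cleanup(),
iterate_cleanup_new(), match_all() and the bookkeeping variables are
hypothetical stand-ins, and the only point is that the old signature keeps
working while the extended one carries pid and report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nf_conn; only what the sketch needs. */
struct conn {
	int id;
};

/* Bookkeeping so a caller can see what the extended iterator received. */
static uint32_t last_pid;
static int last_report;
static int visited;

/* Extended variant: the old iterator plus the pid/report pair. */
static void iterate_cleanup_new(struct conn *conns, size_t n,
				int (*iter)(struct conn *c, void *data),
				void *data, uint32_t pid, int report)
{
	size_t i;

	assert(iter != NULL);
	last_pid = pid;
	last_report = report;
	for (i = 0; i < n; i++) {
		/* A non-zero return from iter selects the entry. */
		if (iter(&conns[i], data))
			visited++;
	}
}

/* Old signature kept as a thin wrapper, so existing callers do not change. */
static void iterate_cleanup(struct conn *conns, size_t n,
			    int (*iter)(struct conn *c, void *data),
			    void *data)
{
	iterate_cleanup_new(conns, n, iter, data, 0, 0);
}

/* Selects every entry, in the spirit of kill_all() in the patch. */
static int match_all(struct conn *c, void *data)
{
	(void)c;
	(void)data;
	return 1;
}
```

Callers that do not care about netlink reporting keep calling
iterate_cleanup(); only the flush path would need to call
iterate_cleanup_new() with a real pid and report value.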

Please take a look at the attached patch and let me know your thoughts; I'd
ideally like to have this fix in 3.4 since that's long-term stable.

During testing I found that the kernel is indeed solid and does not panic; 
however, I managed to make conntrackd eat 100% of a CPU core on one of the 
pair and conntrack entries remained unevicted from the kernel until I killed 
the conntrackd process.

Kind Regards,
Oliver

Comments

Oliver Aug. 28, 2012, 11:10 p.m. UTC | #1
On Tuesday 28 August 2012 19:16:39 Oliver wrote:
> During testing I found that the kernel is indeed solid and does not panic;
> however, I managed to make conntrackd eat 100% of a CPU core on one of the
> pair and conntrack entries remained unevicted from the kernel until I killed
> the conntrackd process.

Running conntrackd while the conntrack table is full causes a GPF; I have
attached the dmesg output of the kernel panic that results from the general
protection fault.

The first GPF is from the kernel patched with the code provided in my previous
e-mail (the patch for v3.4.10, based on the one you provided).

The second is with only my original patch (the one-liner that checks the
dying bit in death_by_event), although that is likely not relevant here since
that function does not appear in the call trace.
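
For readers following along, the idea behind guarding teardown with the dying
bit can be sketched in userspace with C11 atomics. This is a simplified
assumption, not the kernel code: struct conn, DYING_BIT, conn_mark_dying() and
conn_delete() are hypothetical names, and atomic_fetch_or() stands in for the
kernel's test_and_set_bit(). Unlike nf_ct_delete() in the patch, the sketch
simply refuses a second teardown instead of continuing with list removal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DYING_BIT 0x1u	/* hypothetical flag value for this sketch */

/* Hypothetical stand-in for struct nf_conn. */
struct conn {
	atomic_uint status;	/* flag word, like ct->status */
	int destroy_events;	/* how many times teardown actually ran */
};

/*
 * Atomically set the dying flag. Returns true only for the one caller
 * that flipped it from clear to set.
 */
static bool conn_mark_dying(struct conn *ct)
{
	unsigned int old;

	assert(ct != NULL);
	old = atomic_fetch_or(&ct->status, DYING_BIT);
	return (old & DYING_BIT) == 0;
}

/*
 * Tear the entry down at most once. A second caller (for example a timer
 * racing with an explicit delete) sees the flag already set and backs off.
 */
static bool conn_delete(struct conn *ct)
{
	if (!conn_mark_dying(ct))
		return false;	/* someone else is already tearing it down */

	ct->destroy_events++;	/* stands in for event delivery and unlinking */
	return true;
}
```

Because the test and the set happen in one atomic step, two paths that race to
delete the same entry cannot both believe they own the teardown.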

Kind Regards,
Oliver
Aug 28 15:18:06 fw02-lab [  352.508058] testing netconsole, hello!
Aug 28 15:19:59 fw02-lab [  465.619240] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619258] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619272] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619274] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619286] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619288] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619298] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619314] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619338] nf_conntrack: table full, dropping packet.
Aug 28 15:19:59 fw02-lab [  465.619349] nf_conntrack: table full, dropping packet.
Aug 28 15:20:00 fw02-lab [  467.077056] general protection fault: 0000 [#1] 
Aug 28 15:20:00 fw02-lab SMP 
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.077205] CPU 0 
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.077254] Modules linked in:
Aug 28 15:20:00 fw02-lab xt_set
Aug 28 15:20:00 fw02-lab ip_set_hash_ipportnet
Aug 28 15:20:00 fw02-lab ip_set_hash_net
Aug 28 15:20:00 fw02-lab ip_set
Aug 28 15:20:00 fw02-lab igb
Aug 28 15:20:00 fw02-lab bnx2
Aug 28 15:20:00 fw02-lab iTCO_wdt
Aug 28 15:20:00 fw02-lab dca
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.077769] 
Aug 28 15:20:00 fw02-lab [  467.077818] Pid: 2853, comm: conntrackd Not tainted 3.4.11-zerolag+ #1
Aug 28 15:20:00 fw02-lab Dell Inc. PowerEdge R310
Aug 28 15:20:00 fw02-lab /05XKKK
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.078017] RIP: 0010:[<ffffffff8152d6c7>] 
Aug 28 15:20:00 fw02-lab [<ffffffff8152d6c7>] nf_ct_delete_from_lists+0x57/0x90
Aug 28 15:20:00 fw02-lab [  467.078184] RSP: 0018:ffff88012650f858  EFLAGS: 00010246
Aug 28 15:20:00 fw02-lab [  467.078292] RAX: ffff880132737c28 RBX: ffff880130c4ca20 RCX: ffff88013229d708
Aug 28 15:20:00 fw02-lab [  467.078404] RDX: dead000000200200 RSI: dead000000200200 RDI: ffffffff81bcf580
Aug 28 15:20:00 fw02-lab [  467.078516] RBP: ffffffff81bcd080 R08: 0000000000000170 R09: 0000000000000001
Aug 28 15:20:00 fw02-lab [  467.078630] R10: 0000000000000002 R11: 000000009b0cb868 R12: 0000000000000000
Aug 28 15:20:00 fw02-lab [  467.078746] R13: ffffffff81bcd080 R14: 0000000000000000 R15: 00000000fffffffe
Aug 28 15:20:00 fw02-lab [  467.078864] FS:  00007f8bc707f700(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
Aug 28 15:20:00 fw02-lab [  467.079037] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 28 15:20:00 fw02-lab [  467.079148] CR2: 00007f8bc72e6ac0 CR3: 0000000126bc3000 CR4: 00000000000007f0
Aug 28 15:20:00 fw02-lab [  467.079260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 28 15:20:00 fw02-lab [  467.079371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 28 15:20:00 fw02-lab [  467.079484] Process conntrackd (pid: 2853, threadinfo ffff88012650e000, task ffff8801382eb400)
Aug 28 15:20:00 fw02-lab [  467.079652] Stack:
Aug 28 15:20:00 fw02-lab [  467.079754]  0000000000000286
Aug 28 15:20:00 fw02-lab ffff880130c4ca20
Aug 28 15:20:00 fw02-lab 0000000000000000
Aug 28 15:20:00 fw02-lab ffffffff8152e188
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.080103]  0000000000000000
Aug 28 15:20:00 fw02-lab 0000000000000000
Aug 28 15:20:00 fw02-lab 0000000000000286
Aug 28 15:20:00 fw02-lab ffffffff81054901
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.080591]  ffff880130c4ca20
Aug 28 15:20:00 fw02-lab ffffffff81bcd080
Aug 28 15:20:00 fw02-lab 0000000000000000
Aug 28 15:20:00 fw02-lab ffffffff81bcd080
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.080938] Call Trace:
Aug 28 15:20:00 fw02-lab [  467.081044]  [<ffffffff8152e188>] ? nf_ct_delete+0xe8/0x280
Aug 28 15:20:00 fw02-lab [  467.081155]  [<ffffffff81054901>] ? del_timer+0x61/0x80
Aug 28 15:20:00 fw02-lab [  467.081265]  [<ffffffff8152e684>] ? early_drop+0xf4/0x160
Aug 28 15:20:00 fw02-lab [  467.081376]  [<ffffffff8152ed8f>] ? __nf_conntrack_alloc+0x24f/0x290
Aug 28 15:20:00 fw02-lab [  467.081490]  [<ffffffff815374db>] ? ctnetlink_create_conntrack+0x4b/0x440
Aug 28 15:20:00 fw02-lab [  467.081602]  [<ffffffff815353f7>] ? ctnetlink_parse_tuple+0x147/0x170
Aug 28 15:20:00 fw02-lab [  467.081712]  [<ffffffff8152dadd>] ? ____nf_conntrack_find+0x10d/0x120
Aug 28 15:20:00 fw02-lab [  467.081823]  [<ffffffff8152db39>] ? __nf_conntrack_find_get+0x49/0x170
Aug 28 15:20:00 fw02-lab [  467.081935]  [<ffffffff81538c29>] ? ctnetlink_new_conntrack+0x299/0x3e0
Aug 28 15:20:00 fw02-lab [  467.082047]  [<ffffffff812811b0>] ? nla_parse+0x80/0xd0
Aug 28 15:20:00 fw02-lab [  467.082158]  [<ffffffff8152cbee>] ? nfnetlink_rcv_msg+0x1ee/0x220
Aug 28 15:20:00 fw02-lab [  467.082270]  [<ffffffff8152ca2a>] ? nfnetlink_rcv_msg+0x2a/0x220
Aug 28 15:20:00 fw02-lab [  467.082381]  [<ffffffff8152ca00>] ? nfnl_lock+0x20/0x20
Aug 28 15:20:00 fw02-lab [  467.082491]  [<ffffffff81529639>] ? netlink_rcv_skb+0x99/0xc0
Aug 28 15:20:00 fw02-lab [  467.082602]  [<ffffffff81528fcf>] ? netlink_unicast+0x1af/0x200
Aug 28 15:20:00 fw02-lab [  467.082712]  [<ffffffff81529258>] ? netlink_sendmsg+0x238/0x350
Aug 28 15:20:00 fw02-lab [  467.082824]  [<ffffffff814e5204>] ? sock_sendmsg+0xe4/0x130
Aug 28 15:20:00 fw02-lab [  467.082933]  [<ffffffff814e506d>] ? sock_recvmsg+0xed/0x140
Aug 28 15:20:00 fw02-lab [  467.083044]  [<ffffffff8118001e>] ? ext4_file_write+0x6e/0x2a0
Aug 28 15:20:00 fw02-lab [  467.083155]  [<ffffffff8111a10c>] ? do_path_lookup+0x2c/0xd0
Aug 28 15:20:00 fw02-lab [  467.083264]  [<ffffffff811171ef>] ? getname_flags+0x6f/0x270
Aug 28 15:20:00 fw02-lab [  467.083374]  [<ffffffff8110c447>] ? do_sync_write+0xc7/0x100
Aug 28 15:20:00 fw02-lab [  467.083482]  [<ffffffff814e43f2>] ? sockfd_lookup_light+0x22/0x90
Aug 28 15:20:00 fw02-lab [  467.083592]  [<ffffffff814e710c>] ? sys_sendto+0x13c/0x1a0
Aug 28 15:20:00 fw02-lab [  467.083702]  [<ffffffff81687862>] ? system_call_fastpath+0x16/0x1b
Aug 28 15:20:00 fw02-lab [  467.083810] Code: 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 8b 
Aug 28 15:20:00 fw02-lab 43 
Aug 28 15:20:00 fw02-lab 08 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 8b 
Aug 28 15:20:00 fw02-lab 53 
Aug 28 15:20:00 fw02-lab 10 
Aug 28 15:20:00 fw02-lab a8 
Aug 28 15:20:00 fw02-lab 01 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab 02 
Aug 28 15:20:00 fw02-lab 75 
Aug 28 15:20:00 fw02-lab 04 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab 50 
Aug 28 15:20:00 fw02-lab 08 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 8b 
Aug 28 15:20:00 fw02-lab 43 
Aug 28 15:20:00 fw02-lab 40 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab be 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 02 
Aug 28 15:20:00 fw02-lab 20 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab ad 
Aug 28 15:20:00 fw02-lab de 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 8b 
Aug 28 15:20:00 fw02-lab 53 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab 73 
Aug 28 15:20:00 fw02-lab 10 
Aug 28 15:20:00 fw02-lab a8 
Aug 28 15:20:00 fw02-lab 01 
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab 02 
Aug 28 15:20:00 fw02-lab 75 
Aug 28 15:20:00 fw02-lab 04 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab 50 
Aug 28 15:20:00 fw02-lab 08 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab 89 
Aug 28 15:20:00 fw02-lab df 
Aug 28 15:20:00 fw02-lab 48 
Aug 28 15:20:00 fw02-lab b9 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 02 
Aug 28 15:20:00 fw02-lab 20 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab 00 
Aug 28 15:20:00 fw02-lab ad 
Aug 28 15:20:00 fw02-lab 
Aug 28 15:20:00 fw02-lab [  467.086984] RIP 
Aug 28 15:20:00 fw02-lab [<ffffffff8152d6c7>] nf_ct_delete_from_lists+0x57/0x90
Aug 28 15:20:00 fw02-lab [  467.087142]  RSP <ffff88012650f858>
Aug 28 15:20:00 fw02-lab [  467.087263] ---[ end trace 71ca302640a71c1f ]---
Aug 28 15:20:00 fw02-lab [  467.087525] Kernel panic - not syncing: Fatal exception in interrupt
Aug 28 15:20:00 fw02-lab [  467.087642] Rebooting in 5 seconds..


<SNIP>


Aug 28 15:50:00 fw02-lab [  220.903447] testing netconsole again, hello!
Aug 28 15:51:28 fw02-lab [  308.589992] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590036] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590061] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590082] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590131] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590139] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590201] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590239] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590352] nf_conntrack: table full, dropping packet.
Aug 28 15:51:28 fw02-lab [  308.590442] nf_conntrack: table full, dropping packet.
Aug 28 15:51:29 fw02-lab [  310.128590] general protection fault: 0000 [#1] 
Aug 28 15:51:29 fw02-lab SMP 
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.128741] CPU 0 
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.128791] Modules linked in:
Aug 28 15:51:29 fw02-lab xt_set
Aug 28 15:51:29 fw02-lab ip_set_hash_ipportnet
Aug 28 15:51:29 fw02-lab ip_set_hash_net
Aug 28 15:51:29 fw02-lab ip_set
Aug 28 15:51:29 fw02-lab igb
Aug 28 15:51:29 fw02-lab dca
Aug 28 15:51:29 fw02-lab bnx2
Aug 28 15:51:29 fw02-lab iTCO_wdt
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.129312] 
Aug 28 15:51:29 fw02-lab [  310.129361] Pid: 2859, comm: conntrackd Not tainted 3.4.10-zerolag+ #2
Aug 28 15:51:29 fw02-lab Dell Inc. PowerEdge R310
Aug 28 15:51:29 fw02-lab /05XKKK
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.129561] RIP: 0010:[<ffffffff8152d7d7>] 
Aug 28 15:51:29 fw02-lab [<ffffffff8152d7d7>] nf_ct_delete_from_lists+0x57/0x90
Aug 28 15:51:29 fw02-lab [  310.129726] RSP: 0018:ffff880132391858  EFLAGS: 00010202
Aug 28 15:51:29 fw02-lab [  310.129837] RAX: 000000000019a05b RBX: ffff880132b35b00 RCX: 0000000000245675
Aug 28 15:51:29 fw02-lab [  310.129951] RDX: dead000000200200 RSI: dead000000200200 RDI: ffffffff81bcf580
Aug 28 15:51:29 fw02-lab [  310.130064] RBP: ffffffff81bcd080 R08: 0000000000013c70 R09: ffffffff81527493
Aug 28 15:51:29 fw02-lab [  310.130180] R10: ffffffff814ed911 R11: 0000000000000004 R12: ffff880132aefa98
Aug 28 15:51:29 fw02-lab [  310.130293] R13: 0000000000000000 R14: ffff880132b35b04 R15: 00000000fffffffe
Aug 28 15:51:29 fw02-lab [  310.130408] FS:  00007fb285c74700(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
Aug 28 15:51:29 fw02-lab [  310.130579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 28 15:51:29 fw02-lab [  310.130692] CR2: 00007fb2852fe3a0 CR3: 0000000134c54000 CR4: 00000000000007f0
Aug 28 15:51:29 fw02-lab [  310.130804] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 28 15:51:29 fw02-lab [  310.130917] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 28 15:51:29 fw02-lab [  310.131029] Process conntrackd (pid: 2859, threadinfo ffff880132390000, task ffff880139b65480)
Aug 28 15:51:29 fw02-lab [  310.131195] Stack:
Aug 28 15:51:29 fw02-lab [  310.131298]  0000000000000000
Aug 28 15:51:29 fw02-lab ffff880132b35b00
Aug 28 15:51:29 fw02-lab ffff880132b35b78
Aug 28 15:51:29 fw02-lab ffffffff8152e530
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.131649]  0000000000000000
Aug 28 15:51:29 fw02-lab 0000000000000001
Aug 28 15:51:29 fw02-lab ffff880132b35b00
Aug 28 15:51:29 fw02-lab 0000000000000b2a
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.131994]  ffffffff81bcd080
Aug 28 15:51:29 fw02-lab ffff880132b35b00
Aug 28 15:51:29 fw02-lab 0000000000000000
Aug 28 15:51:29 fw02-lab ffffffff81bcd080
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.132343] Call Trace:
Aug 28 15:51:29 fw02-lab [  310.132449]  [<ffffffff8152e530>] ? death_by_timeout+0x1c0/0x1e0
Aug 28 15:51:29 fw02-lab [  310.132560]  [<ffffffff8152e7d8>] ? early_drop+0xe8/0x160
Aug 28 15:51:29 fw02-lab [  310.132670]  [<ffffffff8152eecf>] ? __nf_conntrack_alloc+0x24f/0x290
Aug 28 15:51:29 fw02-lab [  310.132783]  [<ffffffff8153762b>] ? ctnetlink_create_conntrack+0x4b/0x440
Aug 28 15:51:29 fw02-lab [  310.132894]  [<ffffffff81535547>] ? ctnetlink_parse_tuple+0x147/0x170
Aug 28 15:51:29 fw02-lab [  310.133005]  [<ffffffff8152dd2d>] ? ____nf_conntrack_find+0x10d/0x120
Aug 28 15:51:29 fw02-lab [  310.133115]  [<ffffffff8152dd89>] ? __nf_conntrack_find_get+0x49/0x170
Aug 28 15:51:29 fw02-lab [  310.133227]  [<ffffffff81538ef1>] ? ctnetlink_new_conntrack+0x2a1/0x3f0
Aug 28 15:51:29 fw02-lab [  310.133339]  [<ffffffff812811b0>] ? nla_parse+0x80/0xd0
Aug 28 15:51:29 fw02-lab [  310.133564]  [<ffffffff8152cbee>] ? nfnetlink_rcv_msg+0x1ee/0x220
Aug 28 15:51:29 fw02-lab [  310.133674]  [<ffffffff8152ca2a>] ? nfnetlink_rcv_msg+0x2a/0x220
Aug 28 15:51:29 fw02-lab [  310.133784]  [<ffffffff8152ca00>] ? nfnl_lock+0x20/0x20
Aug 28 15:51:29 fw02-lab [  310.133892]  [<ffffffff81529639>] ? netlink_rcv_skb+0x99/0xc0
Aug 28 15:51:29 fw02-lab [  310.134000]  [<ffffffff81528fcf>] ? netlink_unicast+0x1af/0x200
Aug 28 15:51:29 fw02-lab [  310.134110]  [<ffffffff81529258>] ? netlink_sendmsg+0x238/0x350
Aug 28 15:51:29 fw02-lab [  310.134221]  [<ffffffff814e5204>] ? sock_sendmsg+0xe4/0x130
Aug 28 15:51:29 fw02-lab [  310.134329]  [<ffffffff814e506d>] ? sock_recvmsg+0xed/0x140
Aug 28 15:51:29 fw02-lab [  310.134439]  [<ffffffff8118001e>] ? ext4_file_write+0x6e/0x2a0
Aug 28 15:51:29 fw02-lab [  310.134551]  [<ffffffff8111a10c>] ? do_path_lookup+0x2c/0xd0
Aug 28 15:51:29 fw02-lab [  310.134659]  [<ffffffff811171ef>] ? getname_flags+0x6f/0x270
Aug 28 15:51:29 fw02-lab [  310.134771]  [<ffffffff8110c447>] ? do_sync_write+0xc7/0x100
Aug 28 15:51:29 fw02-lab [  310.134880]  [<ffffffff814e43f2>] ? sockfd_lookup_light+0x22/0x90
Aug 28 15:51:29 fw02-lab [  310.134990]  [<ffffffff814e710c>] ? sys_sendto+0x13c/0x1a0
Aug 28 15:51:29 fw02-lab [  310.135102]  [<ffffffff81687b62>] ? system_call_fastpath+0x16/0x1b
Aug 28 15:51:29 fw02-lab [  310.135214] Code: 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 8b 
Aug 28 15:51:29 fw02-lab 43 
Aug 28 15:51:29 fw02-lab 08 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 8b 
Aug 28 15:51:29 fw02-lab 53 
Aug 28 15:51:29 fw02-lab 10 
Aug 28 15:51:29 fw02-lab a8 
Aug 28 15:51:29 fw02-lab 01 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab 02 
Aug 28 15:51:29 fw02-lab 75 
Aug 28 15:51:29 fw02-lab 04 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab 50 
Aug 28 15:51:29 fw02-lab 08 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 8b 
Aug 28 15:51:29 fw02-lab 43 
Aug 28 15:51:29 fw02-lab 40 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab be 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 02 
Aug 28 15:51:29 fw02-lab 20 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab ad 
Aug 28 15:51:29 fw02-lab de 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 8b 
Aug 28 15:51:29 fw02-lab 53 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab 73 
Aug 28 15:51:29 fw02-lab 10 
Aug 28 15:51:29 fw02-lab a8 
Aug 28 15:51:29 fw02-lab 01 
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab 02 
Aug 28 15:51:29 fw02-lab 75 
Aug 28 15:51:29 fw02-lab 04 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab 50 
Aug 28 15:51:29 fw02-lab 08 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab 89 
Aug 28 15:51:29 fw02-lab df 
Aug 28 15:51:29 fw02-lab 48 
Aug 28 15:51:29 fw02-lab b9 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 02 
Aug 28 15:51:29 fw02-lab 20 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab 00 
Aug 28 15:51:29 fw02-lab ad 
Aug 28 15:51:29 fw02-lab 
Aug 28 15:51:29 fw02-lab [  310.138394] RIP 
Aug 28 15:51:29 fw02-lab [<ffffffff8152d7d7>] nf_ct_delete_from_lists+0x57/0x90
Aug 28 15:51:29 fw02-lab [  310.138551]  RSP <ffff880132391858>
Aug 28 15:51:29 fw02-lab [  310.138671] ---[ end trace d353ce02b91230e2 ]---
Aug 28 15:51:29 fw02-lab [  310.138784] Kernel panic - not syncing: Fatal exception in interrupt
Aug 28 15:51:29 fw02-lab [  310.138904] Rebooting in 5 seconds..

Patch

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index ab86036..8f92f77 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -210,8 +210,7 @@  __nf_conntrack_find(struct net *net, u16 zone,
 		    const struct nf_conntrack_tuple *tuple);
 
 extern int nf_conntrack_hash_check_insert(struct nf_conn *ct);
-extern void nf_ct_delete_from_lists(struct nf_conn *ct);
-extern void nf_ct_insert_dying_list(struct nf_conn *ct);
+bool nf_ct_delete(struct nf_conn *ct, u32 pid, int report);
 
 extern void nf_conntrack_flush_report(struct net *net, u32 pid, int report);
 
@@ -278,7 +277,8 @@  extern void nf_ct_untracked_status_or(unsigned long bits);
 
 /* Iterate over all conntracks: if iter returns true, it's deleted. */
 extern void
-nf_ct_iterate_cleanup(struct net *net, int (*iter)(struct nf_conn *i, void *data), void *data);
+nf_ct_iterate_cleanup_new(struct net *net, int (*iter)(struct nf_conn *i, void *data), void *data, u32 pid, int report);
+extern void nf_ct_iterate_cleanup(struct net *net, int (*iter)(struct nf_conn *i, void *data), void *data);
 extern void nf_conntrack_free(struct nf_conn *ct);
 extern struct nf_conn *
 nf_conntrack_alloc(struct net *net, u16 zone,
diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h
index a88fb69..2f7b0ac 100644
--- a/include/net/netfilter/nf_conntrack_ecache.h
+++ b/include/net/netfilter/nf_conntrack_ecache.h
@@ -108,7 +108,7 @@  nf_conntrack_eventmask_report(unsigned int eventmask,
 	if (e == NULL)
 		goto out_unlock;
 
-	if (nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct)) {
+	if (nf_ct_is_confirmed(ct)) {
 		struct nf_ct_event item = {
 			.ct 	= ct,
 			.pid	= e->pid ? e->pid : pid,
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 729f157..bdc9473 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -231,7 +231,7 @@  destroy_conntrack(struct nf_conntrack *nfct)
 	nf_conntrack_free(ct);
 }
 
-void nf_ct_delete_from_lists(struct nf_conn *ct)
+static void nf_ct_delete_from_lists(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 
@@ -243,7 +243,6 @@  void nf_ct_delete_from_lists(struct nf_conn *ct)
 	clean_from_lists(ct);
 	spin_unlock_bh(&nf_conntrack_lock);
 }
-EXPORT_SYMBOL_GPL(nf_ct_delete_from_lists);
 
 static void death_by_event(unsigned long ul_conntrack)
 {
@@ -257,15 +256,13 @@  static void death_by_event(unsigned long ul_conntrack)
 		add_timer(&ct->timeout);
 		return;
 	}
-	/* we've got the event delivered, now it's dying */
-	set_bit(IPS_DYING_BIT, &ct->status);
 	spin_lock(&nf_conntrack_lock);
 	hlist_nulls_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
 	spin_unlock(&nf_conntrack_lock);
 	nf_ct_put(ct);
 }
 
-void nf_ct_insert_dying_list(struct nf_conn *ct)
+static void nf_ct_insert_dying_list(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 
@@ -280,27 +277,32 @@  void nf_ct_insert_dying_list(struct nf_conn *ct)
 		(random32() % net->ct.sysctl_events_retry_timeout);
 	add_timer(&ct->timeout);
 }
-EXPORT_SYMBOL_GPL(nf_ct_insert_dying_list);
 
-static void death_by_timeout(unsigned long ul_conntrack)
+bool nf_ct_delete(struct nf_conn *ct, u32 pid, int report)
 {
-	struct nf_conn *ct = (void *)ul_conntrack;
 	struct nf_conn_tstamp *tstamp;
 
 	tstamp = nf_conn_tstamp_find(ct);
 	if (tstamp && tstamp->stop == 0)
 		tstamp->stop = ktime_to_ns(ktime_get_real());
 
-	if (!test_bit(IPS_DYING_BIT, &ct->status) &&
-	    unlikely(nf_conntrack_event(IPCT_DESTROY, ct) < 0)) {
+	if (!test_and_set_bit(IPS_DYING_BIT, &ct->status) &&
+	    unlikely(nf_conntrack_event_report(IPCT_DESTROY, ct, pid,
+								report) < 0)) {
 		/* destroy event was not delivered */
 		nf_ct_delete_from_lists(ct);
 		nf_ct_insert_dying_list(ct);
-		return;
+		return false;
 	}
-	set_bit(IPS_DYING_BIT, &ct->status);
 	nf_ct_delete_from_lists(ct);
 	nf_ct_put(ct);
+	return true;
+}
+EXPORT_SYMBOL_GPL(nf_ct_delete);
+
+static void death_by_timeout(unsigned long ul_conntrack)
+{
+	nf_ct_delete((struct nf_conn *)ul_conntrack, 0, 0);
 }
 
 /*
@@ -634,11 +636,9 @@  static noinline int early_drop(struct net *net, unsigned int hash)
 	if (!ct)
 		return dropped;
 
-	if (del_timer(&ct->timeout)) {
-		death_by_timeout((unsigned long)ct);
-		/* Check if we indeed killed this entry. Reliable event
-		   delivery may have inserted it into the dying list. */
-		if (test_bit(IPS_DYING_BIT, &ct->status)) {
+	if (!nf_ct_is_dying(ct) && del_timer(&ct->timeout)) {
+		/* Check if we indeed killed this entry */
+		if (nf_ct_delete(ct, 0, 0)) {
 			dropped = 1;
 			NF_CT_STAT_INC_ATOMIC(net, early_drop);
 		}
@@ -1124,8 +1124,8 @@  bool __nf_ct_kill_acct(struct nf_conn *ct,
 		}
 	}
 
-	if (del_timer(&ct->timeout)) {
-		ct->timeout.function((unsigned long)ct);
+	if (!nf_ct_is_dying(ct) && del_timer(&ct->timeout)) {
+		nf_ct_delete(ct, 0, 0);
 		return true;
 	}
 	return false;
@@ -1225,8 +1225,7 @@  get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
 	}
 	hlist_nulls_for_each_entry(h, n, &net->ct.unconfirmed, hnnode) {
 		ct = nf_ct_tuplehash_to_ctrack(h);
-		if (iter(ct, data))
-			set_bit(IPS_DYING_BIT, &ct->status);
+		iter(ct, data);
 	}
 	spin_unlock_bh(&nf_conntrack_lock);
 	return NULL;
@@ -1236,50 +1235,40 @@  found:
 	return ct;
 }
 
-void nf_ct_iterate_cleanup(struct net *net,
+void nf_ct_iterate_cleanup_new(struct net *net,
 			   int (*iter)(struct nf_conn *i, void *data),
-			   void *data)
+			   void *data, u32 pid, int report)
 {
 	struct nf_conn *ct;
 	unsigned int bucket = 0;
 
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
-		if (del_timer(&ct->timeout))
-			death_by_timeout((unsigned long)ct);
+		if (!nf_ct_is_dying(ct) && del_timer(&ct->timeout))
+			nf_ct_delete(ct, pid, report);
 		/* ... else the timer will get him soon. */
 
 		nf_ct_put(ct);
 	}
 }
-EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup);
+EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup_new);
 
-struct __nf_ct_flush_report {
-	u32 pid;
-	int report;
-};
+void nf_ct_iterate_cleanup(struct net *net,
+			   int (*iter)(struct nf_conn *i, void *data),
+			   void *data)
+{
+	nf_ct_iterate_cleanup_new(net, iter, data, 0, 0);
+}
+EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup);
 
-static int kill_report(struct nf_conn *i, void *data)
+static int kill_all(struct nf_conn *i, void *data)
 {
-	struct __nf_ct_flush_report *fr = (struct __nf_ct_flush_report *)data;
 	struct nf_conn_tstamp *tstamp;
 
 	tstamp = nf_conn_tstamp_find(i);
 	if (tstamp && tstamp->stop == 0)
 		tstamp->stop = ktime_to_ns(ktime_get_real());
 
-	/* If we fail to deliver the event, death_by_timeout() will retry */
-	if (nf_conntrack_event_report(IPCT_DESTROY, i,
-				      fr->pid, fr->report) < 0)
-		return 1;
-
-	/* Avoid the delivery of the destroy event in death_by_timeout(). */
-	set_bit(IPS_DYING_BIT, &i->status);
-	return 1;
-}
-
-static int kill_all(struct nf_conn *i, void *data)
-{
 	return 1;
 }
 
@@ -1295,11 +1284,7 @@  EXPORT_SYMBOL_GPL(nf_ct_free_hashtable);
 
 void nf_conntrack_flush_report(struct net *net, u32 pid, int report)
 {
-	struct __nf_ct_flush_report fr = {
-		.pid 	= pid,
-		.report = report,
-	};
-	nf_ct_iterate_cleanup(net, kill_report, &fr);
+	nf_ct_iterate_cleanup_new(net, kill_all, NULL, pid, report);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_flush_report);
 
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index ca7e835..2b0c9c1 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -967,21 +967,9 @@  ctnetlink_del_conntrack(struct sock *ctnl, struct sk_buff *skb,
 		}
 	}
 
-	if (del_timer(&ct->timeout)) {
-		if (nf_conntrack_event_report(IPCT_DESTROY, ct,
-					      NETLINK_CB(skb).pid,
-					      nlmsg_report(nlh)) < 0) {
-			nf_ct_delete_from_lists(ct);
-			/* we failed to report the event, try later */
-			nf_ct_insert_dying_list(ct);
-			nf_ct_put(ct);
-			return 0;
-		}
-		/* death_by_timeout would report the event again */
-		set_bit(IPS_DYING_BIT, &ct->status);
-		nf_ct_delete_from_lists(ct);
-		nf_ct_put(ct);
-	}
+	if (!nf_ct_is_dying(ct) && del_timer(&ct->timeout))
+		nf_ct_delete(ct, NETLINK_CB(skb).pid, nlmsg_report(nlh));
+
 	nf_ct_put(ct);
 
 	return 0;