Message ID | b7fbf103cd589741e3938550e7cf0f3684d8951c.1443605079.git.g.nault@alphalink.fr |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Here is similar panic after patch applied (it might be different bug), got over netconsole: [126348.610996] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [126348.611656] IP: [<ffffffffa00ea129>] pppoe_release+0x56/0x142 [pppoe] [126348.612033] PGD 17d0b03067 PUD 17c721b067 PMD 0 [126348.612545] Oops: 0000 [#1] SMP [126348.612981] Modules linked in: act_skbedit sch_fq cls_fw act_police cls_u32 sch_ingress sch_sfq sch_htb pppoe pppox ppp_generic slhc netconsole configfs xt_nat ts_bm xt_string xt_connmark xt_TCPMSS xt_tcpudp xt_mark iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ip_tables x_tables 8021q garp mrp stp llc bonding [126348.617115] CPU: 0 PID: 5254 Comm: accel-pppd Not tainted 4.2.2-build-0087 #2 [126348.617632] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014 [126348.618193] task: ffff8817cfbe0000 ti: ffff8817c6350000 task.ti: ffff8817c6350000 [126348.618696] RIP: 0010:[<ffffffffa00ea129>] [<ffffffffa00ea129>] pppoe_release+0x56/0x142 [pppoe] [126348.619306] RSP: 0018:ffff8817c6353e28 EFLAGS: 00010202 [126348.619601] RAX: 0000000000000000 RBX: ffff8817a92b0400 RCX: 0000000000000000 [126348.620152] RDX: 0000000000000001 RSI: 00000000fffffe01 RDI: ffffffff8180c18a [126348.620715] RBP: ffff8817c6353e68 R08: 0000000000000000 R09: 0000000000000000 [126348.621254] R10: ffff88173c02b210 R11: 0000000000000293 R12: ffff8817b3c18000 [126348.621784] R13: ffff8817b3c18030 R14: ffff8817967f1140 R15: ffff8817d226c920 [126348.622330] FS: 00007f9444db9700(0000) GS:ffff8817dee00000(0000) knlGS:0000000000000000 [126348.622876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [126348.623202] CR2: 0000000000000428 CR3: 00000017c70b2000 CR4: 00000000001406f0 [126348.623760] Stack: [126348.624056] 0000000100200018 0000000000000000 0000000100000000 ffff8817b3c18000 [126348.624925] ffffffffa00ec280 ffff8817b3c18030 ffff8817967f1140 ffff8817d226c920 [126348.625736] ffff8817c6353e88 ffffffff8180820a ffff88173c02b200 0000000000000008 [126348.626533] Call Trace: [126348.626873] [<ffffffff8180820a>] sock_release+0x1a/0x70 [126348.627183] [<ffffffff8180826d>] sock_close+0xd/0x11 [126348.627512] [<ffffffff81152c61>] __fput+0xdf/0x193 [126348.627845] [<ffffffff81152d43>] ____fput+0x9/0xb [126348.628169] [<ffffffff810d098e>] task_work_run+0x78/0x8f [126348.628517] [<ffffffff810038a9>] do_notify_resume+0x40/0x4e [126348.628837] [<ffffffff818a5a0a>] int_signal+0x12/0x17 [126348.629131] Code: 48 8b 83 e0 00 00 00 a8 01 74 12 48 89 df e8 0d 24 72 e1 b8 f7 ff ff ff e9 eb 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a0 02 00 00 8b 80 28 04 00 00 65 ff 08 48 c7 83 a0 02 00 00 00 00 00 00 [126348.635060] RIP [<ffffffffa00ea129>] pppoe_release+0x56/0x142 [pppoe] [126348.635432] RSP <ffff8817c6353e28> [126348.635718] CR2: 0000000000000428 [126348.641165] ---[ end trace 911ff90a1416e3d1 ]--- [126348.653235] Kernel panic - not syncing: Fatal exception [126348.653538] Kernel Offset: disabled [126348.677177] Rebooting in 5 seconds.. On 2015-09-30 12:45, Guillaume Nault wrote: > Since commit 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in > pppoe_release"), > pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the > PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to > PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the > following oops: > > [ 570.140800] BUG: unable to handle kernel NULL pointer dereference > at 00000000000004e0 > [ 570.142931] IP: [<ffffffffa018c701>] pppoe_release+0x50/0x101 > [pppoe] > [ 570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0 > [ 570.144601] Oops: 0000 [#1] SMP > [ 570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core > ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop > crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac > drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul > glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 > mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio > [ 570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1 > [ 570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS Debian-1.8.2-1 04/01/2014 > [ 570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: > ffff880036b60000 > [ 570.144601] RIP: 0010:[<ffffffffa018c701>] [<ffffffffa018c701>] > pppoe_release+0x50/0x101 [pppoe] > [ 570.144601] RSP: 0018:ffff880036b63e08 EFLAGS: 00010202 > [ 570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: > 0000000000000206 > [ 570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: > ffff88003d30dd20 > [ 570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: > 0000000000000000 > [ 570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: > ffff8800387ec780 > [ 570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: > ffff8800387ec7b0 > [ 570.144601] FS: 00007f5672f48700(0000) GS:ffff88003fc80000(0000) > knlGS:0000000000000000 > [ 570.144601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: > 00000000000406a0 > [ 570.144601] Stack: > [ 570.144601] ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 > ffff8800387ec7b0 > [ 570.144601] ffff880036b63e48 ffffffff812caabe ffff880039e4e000 > 0000000000000008 > [ 570.144601] ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 > ffffffff811347f5 > [ 570.144601] Call Trace: > [ 570.144601] [<ffffffff812caabe>] sock_release+0x1a/0x75 > [ 570.144601] [<ffffffff812cabad>] sock_close+0xd/0x11 > [ 570.144601] [<ffffffff811347f5>] __fput+0xff/0x1a5 > [ 570.144601] [<ffffffff811348cb>] ____fput+0x9/0xb > [ 570.144601] [<ffffffff81056682>] task_work_run+0x66/0x90 > [ 570.144601] [<ffffffff8100189e>] prepare_exit_to_usermode+0x8c/0xa7 > [ 570.144601] [<ffffffff81001a26>] > syscall_return_slowpath+0x16d/0x19b > [ 570.144601] [<ffffffff813babb1>] int_ret_from_sys_call+0x25/0x9f > [ 570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b > 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 > a8 04 00 00 <48> 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 > 00 00 > [ 570.144601] RIP [<ffffffffa018c701>] pppoe_release+0x50/0x101 > [pppoe] > [ 570.144601] RSP <ffff880036b63e08> > [ 570.144601] CR2: 00000000000004e0 > [ 570.200518] ---[ end trace 46956baf17349563 ]--- > > pppoe_flush_dev() has no reason to override sk->sk_state with > PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to > PPPOX_DEAD, which is the correct state given that sk is unbound and > po->pppoe_dev is NULL. > > Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release") > Tested-by: Oleksii Berezhniak <core@irc.lg.ua> > Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> > --- > drivers/net/ppp/pppoe.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c > index 3837ae3..2ed7506 100644 > --- a/drivers/net/ppp/pppoe.c > +++ b/drivers/net/ppp/pppoe.c > @@ -313,7 +313,6 @@ static void pppoe_flush_dev(struct net_device *dev) > if (po->pppoe_dev == dev && > sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) > { > pppox_unbind_sock(sk); > - sk->sk_state = PPPOX_ZOMBIE; > sk->sk_state_change(sk); > po->pppoe_dev = NULL; > dev_put(dev); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 02, 2015 at 11:01:45AM +0300, Denys Fedoryshchenko wrote: > Here is similar panic after patch applied (it might be different bug), got > over netconsole: > > [126348.617115] CPU: 0 PID: 5254 Comm: accel-pppd Not tainted > 4.2.2-build-0087 #2 > [126348.617632] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS > SE5C600.86B.02.03.0003.041920141333 04/19/2014 > [126348.618193] task: ffff8817cfbe0000 ti: ffff8817c6350000 task.ti: > ffff8817c6350000 > [126348.618696] RIP: 0010:[<ffffffffa00ea129>] > [<ffffffffa00ea129>] pppoe_release+0x56/0x142 [pppoe] > [126348.619306] RSP: 0018:ffff8817c6353e28 EFLAGS: 00010202 > [126348.619601] RAX: 0000000000000000 RBX: ffff8817a92b0400 RCX: > 0000000000000000 > [126348.620152] RDX: 0000000000000001 RSI: 00000000fffffe01 RDI: > ffffffff8180c18a > [126348.620715] RBP: ffff8817c6353e68 R08: 0000000000000000 R09: > 0000000000000000 > [126348.621254] R10: ffff88173c02b210 R11: 0000000000000293 R12: > ffff8817b3c18000 > [126348.621784] R13: ffff8817b3c18030 R14: ffff8817967f1140 R15: > ffff8817d226c920 > [126348.622330] FS: 00007f9444db9700(0000) GS:ffff8817dee00000(0000) > knlGS:0000000000000000 > [126348.622876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [126348.623202] CR2: 0000000000000428 CR3: 00000017c70b2000 CR4: > 00000000001406f0 > [126348.623760] Stack: > [126348.624056] 0000000100200018 > 0000000000000000 > 0000000100000000 > ffff8817b3c18000 > > [126348.624925] ffffffffa00ec280 > ffff8817b3c18030 > ffff8817967f1140 > ffff8817d226c920 > > [126348.625736] ffff8817c6353e88 > ffffffff8180820a > ffff88173c02b200 > 0000000000000008 > > [126348.626533] Call Trace: > [126348.626873] [<ffffffff8180820a>] sock_release+0x1a/0x70 > [126348.627183] [<ffffffff8180826d>] sock_close+0xd/0x11 > [126348.627512] [<ffffffff81152c61>] __fput+0xdf/0x193 > [126348.627845] [<ffffffff81152d43>] ____fput+0x9/0xb > [126348.628169] [<ffffffff810d098e>] task_work_run+0x78/0x8f > [126348.628517] [<ffffffff810038a9>] do_notify_resume+0x40/0x4e > [126348.628837] [<ffffffff818a5a0a>] int_signal+0x12/0x17 Ok, so there's another possibility for pppoe_release() to be called while sk->sk_state is PPPOX_{CONNECTED,BOUND,ZOMBIE} but po->pppoe_dev is NULL. I'll check the code to see if I can find any race wrt. po->pppoe_dev and sk->sk_state settings. In a previous message, you said you'd try reverting 287f3a943fef ("pppoe: Use workqueue to die properly when a PADT is received") and related patches. I guess "related patches" means 665a6cd809f4 ("pppoe: drop pppoe device in pppoe_unbind_sock_work"), right?. Did these reverts give any successful result? BTW, please don't top-post. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015-10-02 20:54, Guillaume Nault wrote: > On Fri, Oct 02, 2015 at 11:01:45AM +0300, Denys Fedoryshchenko wrote: >> Here is similar panic after patch applied (it might be different bug), >> got >> over netconsole: >> >> [126348.617115] CPU: 0 PID: 5254 Comm: accel-pppd Not tainted >> 4.2.2-build-0087 #2 >> [126348.617632] Hardware name: Intel Corporation S2600GZ/S2600GZ, >> BIOS >> SE5C600.86B.02.03.0003.041920141333 04/19/2014 >> [126348.618193] task: ffff8817cfbe0000 ti: ffff8817c6350000 task.ti: >> ffff8817c6350000 >> [126348.618696] RIP: 0010:[<ffffffffa00ea129>] >> [<ffffffffa00ea129>] pppoe_release+0x56/0x142 [pppoe] >> [126348.619306] RSP: 0018:ffff8817c6353e28 EFLAGS: 00010202 >> [126348.619601] RAX: 0000000000000000 RBX: ffff8817a92b0400 RCX: >> 0000000000000000 >> [126348.620152] RDX: 0000000000000001 RSI: 00000000fffffe01 RDI: >> ffffffff8180c18a >> [126348.620715] RBP: ffff8817c6353e68 R08: 0000000000000000 R09: >> 0000000000000000 >> [126348.621254] R10: ffff88173c02b210 R11: 0000000000000293 R12: >> ffff8817b3c18000 >> [126348.621784] R13: ffff8817b3c18030 R14: ffff8817967f1140 R15: >> ffff8817d226c920 >> [126348.622330] FS: 00007f9444db9700(0000) GS:ffff8817dee00000(0000) >> knlGS:0000000000000000 >> [126348.622876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [126348.623202] CR2: 0000000000000428 CR3: 00000017c70b2000 CR4: >> 00000000001406f0 >> [126348.623760] Stack: >> [126348.624056] 0000000100200018 >> 0000000000000000 >> 0000000100000000 >> ffff8817b3c18000 >> >> [126348.624925] ffffffffa00ec280 >> ffff8817b3c18030 >> ffff8817967f1140 >> ffff8817d226c920 >> >> [126348.625736] ffff8817c6353e88 >> ffffffff8180820a >> ffff88173c02b200 >> 0000000000000008 >> >> [126348.626533] Call Trace: >> [126348.626873] [<ffffffff8180820a>] sock_release+0x1a/0x70 >> [126348.627183] [<ffffffff8180826d>] sock_close+0xd/0x11 >> [126348.627512] [<ffffffff81152c61>] __fput+0xdf/0x193 >> [126348.627845] [<ffffffff81152d43>] ____fput+0x9/0xb >> [126348.628169] [<ffffffff810d098e>] task_work_run+0x78/0x8f >> [126348.628517] [<ffffffff810038a9>] do_notify_resume+0x40/0x4e >> [126348.628837] [<ffffffff818a5a0a>] int_signal+0x12/0x17 > > Ok, so there's another possibility for pppoe_release() to be called > while > sk->sk_state is PPPOX_{CONNECTED,BOUND,ZOMBIE} but po->pppoe_dev is > NULL. > > I'll check the code to see if I can find any race wrt. po->pppoe_dev > and sk->sk_state settings. > > In a previous message, you said you'd try reverting 287f3a943fef > ("pppoe: Use workqueue to die properly when a PADT is received") and > related patches. I guess "related patches" means 665a6cd809f4 ("pppoe: > drop pppoe device in pppoe_unbind_sock_work"), right?. > Did these reverts give any successful result? > > BTW, please don't top-post. I am doing just "dirty" patch like this, i cannot certainly remember if i was doing git reversal, because it was a while when i spotted this bug. After that pppoe server is not rebooting. diff -Naur linux-4.2.2-vanilla/drivers/net/ppp/pppoe.c linux-4.2.2-changed/drivers/net/ppp/pppoe.c --- linux-4.2.2-vanilla/drivers/net/ppp/pppoe.c 2015-09-29 20:38:27.000000000 +0300 +++ linux-4.2.2-changed/drivers/net/ppp/pppoe.c 2015-10-04 19:05:55.697732991 +0300 @@ -519,7 +519,7 @@ } bh_unlock_sock(sk); - if (!schedule_work(&po->proto.pppoe.padt_work)) +// if (!schedule_work(&po->proto.pppoe.padt_work)) sock_put(sk); } @@ -633,7 +633,7 @@ lock_sock(sk); - INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); +// INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); error = -EINVAL; if (sp->sa_protocol != PX_PROTO_OE) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Guillaume Nault <g.nault@alphalink.fr> Date: Wed, 30 Sep 2015 11:45:33 +0200 > Since commit 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"), > pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the > PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to > PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the > following oops: ... > pppoe_flush_dev() has no reason to override sk->sk_state with > PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to > PPPOX_DEAD, which is the correct state given that sk is unbound and > po->pppoe_dev is NULL. > > Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release") > Tested-by: Oleksii Berezhniak <core@irc.lg.ua> > Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Applied and queued up for -stable, thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> I am doing just "dirty" patch like this, i cannot certainly remember if i > was doing git reversal, because > it was a while when i spotted this bug. After that pppoe server is not > rebooting. > > diff -Naur linux-4.2.2-vanilla/drivers/net/ppp/pppoe.c > linux-4.2.2-changed/drivers/net/ppp/pppoe.c > --- linux-4.2.2-vanilla/drivers/net/ppp/pppoe.c 2015-09-29 > 20:38:27.000000000 +0300 > +++ linux-4.2.2-changed/drivers/net/ppp/pppoe.c 2015-10-04 > 19:05:55.697732991 +0300 > @@ -519,7 +519,7 @@ > } > > bh_unlock_sock(sk); > - if (!schedule_work(&po->proto.pppoe.padt_work)) > +// if (!schedule_work(&po->proto.pppoe.padt_work)) > sock_put(sk); > } > > @@ -633,7 +633,7 @@ > > lock_sock(sk); > > - INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); > +// INIT_WORK(&po->proto.pppoe.padt_work, pppoe_unbind_sock_work); > > error = -EINVAL; > if (sp->sa_protocol != PX_PROTO_OE) > Ok, so this is clearly related with PADT message handling. Setting sk->sk_state to PPPOX_ZOMBIE in pppoe_disc_rcv() looks wrong to me. Furthurmore, at a first glance, it doesn't look necessary. If you're feeling lucky, you can try the following diff (WARNING: not even compile-tested!): if (po) { struct sock *sk = sk_pppox(po); - bh_lock_sock(sk); - - /* If the user has locked the socket, just ignore - * the packet. With the way two rcv protocols hook into - * one socket family type, we cannot (easily) distinguish - * what kind of SKB it is during backlog rcv. - */ - if (sock_owned_by_user(sk) == 0) { - /* We're no longer connect at the PPPOE layer, - * and must wait for ppp channel to disconnect us. - */ - sk->sk_state = PPPOX_ZOMBIE; - } - - bh_unlock_sock(sk); if (!schedule_work(&po->proto.pppoe.padt_work)) sock_put(sk); } I'll take a closer look and do proper testing during the week. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 05, 2015 at 04:08:51AM +0000, Matt Bennett wrote: > Hi, I am seeing this panic occur occasionally however I am unsure how to > go about reproducing it. Is it enough to simply keep creating and > tearing down the PPP interface? I can also test and/or investigate this > issue if a suitable reproduction method is available. > There are at least two issues resulting in similar Oops. The first one goes with MTU/address/link state updates on the underlying interface: any such update on an interface used by a PPPoE connection will generally result in an Oops when releasing the PPPoE connection. This is fixed by e6740165b8f7 ("ppp: don't override sk->sk_state in pppoe_flush_dev()"). The second one seems to be trickier. It looks like a race wrt. PADT message reception. Reproducing the bug will probably require to generate some PADT flooding to a host that creates and releases PPPoE connections. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2015-10-05 at 14:24 +0200, Guillaume Nault wrote: > On Mon, Oct 05, 2015 at 04:08:51AM +0000, Matt Bennett wrote: > > Hi, I am seeing this panic occur occasionally however I am unsure how to > > go about reproducing it. Is it enough to simply keep creating and > > tearing down the PPP interface? I can also test and/or investigate this > > issue if a suitable reproduction method is available. > > > There are at least two issues resulting in similar Oops. > > The first one goes with MTU/address/link state updates on the > underlying interface: any such update on an interface used by a > PPPoE connection will generally result in an Oops when releasing the > PPPoE connection. This is fixed by e6740165b8f7 ("ppp: don't override > sk->sk_state in pppoe_flush_dev()"). Without your patch ("ppp: don't override sk->sk_state in pppoe_flush_dev()") I can see the following function calls being made when changing the mtu on the underlying ethernet interface for the PPPoE connection: 1. pppoe_flush_dev() - setting PPPOX_ZOMBIE 2. pppoe_connect - setting PPPOX_NONE (shown below) /* Delete the old binding */ if (stage_session(po->pppoe_pa.sid)) { pppox_unbind_sock(sk); pn = pppoe_pernet(sock_net(sk)); delete_item(pn, po->pppoe_pa.sid, po->pppoe_pa.remote, po->pppoe_ifindex); if (po->pppoe_dev) { dev_put(po->pppoe_dev); po->pppoe_dev = NULL; } memset(sk_pppox(po) + 1, 0, sizeof(struct pppox_sock) - sizeof(struct sock)); sk->sk_state = PPPOX_NONE; } 3. pppoe_release - No oops (since sk->sk_state is no longer in {PPPOX_CONNECTED,PPPOX_BOUND,PPPOX_ZOMBIE}) It doesn't look to me like the above functions can execute asynchronously but I'd have to look harder. I am using 3.16 by the way. > > The second one seems to be trickier. It looks like a race wrt. PADT > message reception. Reproducing the bug will probably require to > generate some PADT flooding to a host that creates and releases PPPoE > connections. I will investigate the PADT message reception however since I am on 3.16 I don't have the commits for "pppoe: Use workqueue to die properly when a PADT is received" and "pppoe: drop pppoe device in pppoe_unbind_sock_work". Matt > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > The second one seems to be trickier. It looks like a race wrt. PADT > > message reception. Reproducing the bug will probably require to > > generate some PADT flooding to a host that creates and releases PPPoE > > connections. Ok I think I can see the potential race here, specifically the PADT frame is received while the pppoe interface is being deleted. (I will have a go inducing this with msleep() in the code tomorrow) 1. pppoe_flush_dev() - sk->sk_state = PPPOX_DEAD, po->pppoe_dev = NULL 2. pppoe_connect() - sk->sk_state = PPPOX_NONE, po->pppoe_dev = NULL 3. pppoe_disc_rcv() - sk->sk_state = PPPOX_ZOMBIE po->pppoe_dev = NULL 4. pppoe_release() - dev_put(po->pppoe_dev) ----> Oops Either in pppoe_disc_rcv() we add the condition: @@ -496,7 +499,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, struct net_device *dev, /* We're no longer connect at the PPPOE layer, * and must wait for ppp channel to disconnect us. */ - sk->sk_state = PPPOX_ZOMBIE; + if (sk->sk_state & PPPOX_CONNECTED) + sk->sk_state = PPPOX_ZOMBIE; } Or perhaps we remove the assumption that the state PPPOX_ZOMBIE has a non-null pppoe_dev on it. I don't know why the code isn't like the following anyway. -if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { +if (po->pppoe_dev) { dev_put(po->pppoe_dev); po->pppoe_dev = NULL; }
On Tue, Oct 06, 2015 at 12:26:20AM +0000, Matt Bennett wrote: > On Mon, 2015-10-05 at 14:24 +0200, Guillaume Nault wrote: > > On Mon, Oct 05, 2015 at 04:08:51AM +0000, Matt Bennett wrote: > > > Hi, I am seeing this panic occur occasionally however I am unsure how to > > > go about reproducing it. Is it enough to simply keep creating and > > > tearing down the PPP interface? I can also test and/or investigate this > > > issue if a suitable reproduction method is available. > > > > > There are at least two issues resulting in similar Oops. > > > > The first one goes with MTU/address/link state updates on the > > underlying interface: any such update on an interface used by a > > PPPoE connection will generally result in an Oops when releasing the > > PPPoE connection. This is fixed by e6740165b8f7 ("ppp: don't override > > sk->sk_state in pppoe_flush_dev()"). > > Without your patch ("ppp: don't override sk->sk_state in > pppoe_flush_dev()") I can see the following function calls being made > when changing the mtu on the underlying ethernet interface for the PPPoE > connection: > > 1. pppoe_flush_dev() - setting PPPOX_ZOMBIE > > 2. pppoe_connect - setting PPPOX_NONE (shown below) > > /* Delete the old binding */ > if (stage_session(po->pppoe_pa.sid)) { > pppox_unbind_sock(sk); > pn = pppoe_pernet(sock_net(sk)); > delete_item(pn, po->pppoe_pa.sid, > po->pppoe_pa.remote, po->pppoe_ifindex); > if (po->pppoe_dev) { > dev_put(po->pppoe_dev); > po->pppoe_dev = NULL; > } > > memset(sk_pppox(po) + 1, 0, > sizeof(struct pppox_sock) - sizeof(struct sock)); > sk->sk_state = PPPOX_NONE; > } > > 3. pppoe_release - No oops (since sk->sk_state is no longer in > {PPPOX_CONNECTED,PPPOX_BOUND,PPPOX_ZOMBIE}) > > It doesn't look to me like the above functions can execute > asynchronously but I'd have to look harder. I am using 3.16 by the way. > Just drop the pppoe_connect() call. Right after the pppoe_flush_dev() call, sk_state is PPPOX_ZOMBIE and pppoe_dev is NULL. This is enouhg to make pppoe_release() crash. The typical scenario e6740165b8f7 ("ppp: don't override sk->sk_state in pppoe_flush_dev()") fixes is: Userspace process #1: Userspace process #2: --------------------- --------------------- fd = socket(AF_PPPOX, PX_PROTO_OE, 0); connect(fd, {AF_PPPOX, PX_PROTO_EO, $sid, $mac_addr, $ifname}, sizeof(struct sockaddr_pppox)); ... process_packets() ... # ip link set $ifname mtu $mtu close(fd); --> Kernel Oops -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 06, 2015 at 04:46:04AM +0000, Matt Bennett wrote: > > > The second one seems to be trickier. It looks like a race wrt. PADT > > > message reception. Reproducing the bug will probably require to > > > generate some PADT flooding to a host that creates and releases PPPoE > > > connections. > > Ok I think I can see the potential race here, specifically the PADT > frame is received while the pppoe interface is being deleted. (I will > have a go inducing this with msleep() in the code tomorrow) > > 1. pppoe_flush_dev() - sk->sk_state = PPPOX_DEAD, po->pppoe_dev = NULL > > 2. pppoe_connect() - sk->sk_state = PPPOX_NONE, po->pppoe_dev = NULL > > 3. pppoe_disc_rcv() - sk->sk_state = PPPOX_ZOMBIE po->pppoe_dev = NULL > > 4. pppoe_release() - dev_put(po->pppoe_dev) ----> Oops > Again, I don't know why you introduce pppoe_connect() into the mix. But anyway, you got the point. Note that pppoe_flush_dev() could be replaced by other calls since we just need to reset po->pppoe_dev (another pppoe_unbind_sock_work() call, due to duplicated PADT, would also trigger the bug). Note also that pppoe_release() needs to be run before pppoe_unbind_sock_work() gets scheduled (or at least before it locks the socket). > Either in pppoe_disc_rcv() we add the condition: > > @@ -496,7 +499,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, > struct net_device *dev, > /* We're no longer connect at the PPPOE layer, > * and must wait for ppp channel to disconnect > us. > */ > - sk->sk_state = PPPOX_ZOMBIE; > + if (sk->sk_state & PPPOX_CONNECTED) > + sk->sk_state = PPPOX_ZOMBIE; > } > > Or perhaps we remove the assumption that the state PPPOX_ZOMBIE has a > non-null pppoe_dev on it. > I don't think adding complexity in the socket state management would be a good think. Actually I event think about dropping the PPPOX_ZOMBIE state altogether. But that's probably something for net-next. > I don't know why the code isn't like the following anyway. > > -if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { > +if (po->pppoe_dev) { > dev_put(po->pppoe_dev); > po->pppoe_dev = NULL; > } I was thinking about that same approach. pppoe_release() is the only function making that assumption. Other parts of the code seem to only require that PPPOX_CONNECTED => pppoe_dev != NULL. But I think the original condition was valid. Adding PPPOX_ZOMBIE into the test and resetting pppoe_dev upon reception of PADT have changed the relationship between sk_state and pppoe_dev, which is where the problem stands. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-10-06 at 11:46 +0200, Guillaume Nault wrote: > On Tue, Oct 06, 2015 at 04:46:04AM +0000, Matt Bennett wrote: > > > > The second one seems to be trickier. It looks like a race wrt. PADT > > > > message reception. Reproducing the bug will probably require to > > > > generate some PADT flooding to a host that creates and releases PPPoE > > > > connections. > > > > Ok I think I can see the potential race here, specifically the PADT > > frame is received while the pppoe interface is being deleted. (I will > > have a go inducing this with msleep() in the code tomorrow) > > > > 1. pppoe_flush_dev() - sk->sk_state = PPPOX_DEAD, po->pppoe_dev = NULL > > > > 2. pppoe_connect() - sk->sk_state = PPPOX_NONE, po->pppoe_dev = NULL > > > > 3. pppoe_disc_rcv() - sk->sk_state = PPPOX_ZOMBIE po->pppoe_dev = NULL > > > > 4. pppoe_release() - dev_put(po->pppoe_dev) ----> Oops > > > Again, I don't know why you introduce pppoe_connect() into the mix. Sorry, I'm just going off the function calls I can see happening with my kernel (3.16). > But anyway, you got the point. Note that pppoe_flush_dev() could be > replaced by other calls since we just need to reset po->pppoe_dev > (another pppoe_unbind_sock_work() call, due to duplicated PADT, would > also trigger the bug). Note also that pppoe_release() needs to be run > before pppoe_unbind_sock_work() gets scheduled (or at least before it > locks the socket). > > > Either in pppoe_disc_rcv() we add the condition: > > > > @@ -496,7 +499,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, > > struct net_device *dev, > > /* We're no longer connect at the PPPOE layer, > > * and must wait for ppp channel to disconnect > > us. > > */ > > - sk->sk_state = PPPOX_ZOMBIE; > > + if (sk->sk_state & PPPOX_CONNECTED) > > + sk->sk_state = PPPOX_ZOMBIE; > > } > > > > Or perhaps we remove the assumption that the state PPPOX_ZOMBIE has a > > non-null pppoe_dev on it. > > > I don't think adding complexity in the socket state management would be > a good think. Actually I event think about dropping the PPPOX_ZOMBIE > state altogether. But that's probably something for net-next. > > > I don't know why the code isn't like the following anyway. > > > > -if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { > > +if (po->pppoe_dev) { > > dev_put(po->pppoe_dev); > > po->pppoe_dev = NULL; > > } > I was thinking about that same approach. pppoe_release() is the only > function making that assumption. Other parts of the code seem to only > require that PPPOX_CONNECTED => pppoe_dev != NULL. > > But I think the original condition was valid. Adding PPPOX_ZOMBIE into > the test and resetting pppoe_dev upon reception of PADT have changed the > relationship between sk_state and pppoe_dev, which is where the problem > stands. Yes originally the condition was valid. But I think the issue is plain to see when you look at the comment beside PPPOX_ZOMBIE declared in the enum. PPPOX_ZOMBIE = 8, /* dead, but still bound to ppp device */ We have seen in the situation we have described previously that we can be in this state without being bound to the ppp device. In my opinion the entire logic around pppoe_disc_rcv()/pppoe_unbind_sock_work() looks wrong and I agree we should do what you suggested a few emails back. i.e in pppoe_disc_rcv(): if (po) { struct sock *sk = sk_pppox(po); - bh_lock_sock(sk); - - /* If the user has locked the socket, just ignore - * the packet. With the way two rcv protocols hook into - * one socket family type, we cannot (easily) distinguish - * what kind of SKB it is during backlog rcv. - */ - if (sock_owned_by_user(sk) == 0) { - /* We're no longer connect at the PPPOE layer, - * and must wait for ppp channel to disconnect us. - */ - sk->sk_state = PPPOX_ZOMBIE; - } - - bh_unlock_sock(sk); if (!schedule_work(&po->proto.pppoe.padt_work)) sock_put(sk); } Subsequently the PPPOX_ZOMBIE state can be completely removed? Matt
On Tue, Oct 06, 2015 at 09:12:18PM +0000, Matt Bennett wrote: > On Tue, 2015-10-06 at 11:46 +0200, Guillaume Nault wrote: > > On Tue, Oct 06, 2015 at 04:46:04AM +0000, Matt Bennett wrote: > > > I don't know why the code isn't like the following anyway. > > > > > > -if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { > > > +if (po->pppoe_dev) { > > > dev_put(po->pppoe_dev); > > > po->pppoe_dev = NULL; > > > } > > I was thinking about that same approach. pppoe_release() is the only > > function making that assumption. Other parts of the code seem to only > > require that PPPOX_CONNECTED => pppoe_dev != NULL. > > > > But I think the original condition was valid. Adding PPPOX_ZOMBIE into > > the test and resetting pppoe_dev upon reception of PADT have changed the > > relationship between sk_state and pppoe_dev, which is where the problem > > stands. > Yes originally the condition was valid. But I think the issue is plain > to see when you look at the comment beside PPPOX_ZOMBIE declared in the > enum. > > PPPOX_ZOMBIE = 8, /* dead, but still bound to ppp device */ > > We have seen in the situation we have described previously that we can > be in this state without being bound to the ppp device. > > In my opinion the entire logic around > pppoe_disc_rcv()/pppoe_unbind_sock_work() looks wrong and I agree we > should do what you suggested a few emails back. > > i.e in pppoe_disc_rcv(): > > if (po) { > struct sock *sk = sk_pppox(po); > > - bh_lock_sock(sk); > - > - /* If the user has locked the socket, just ignore > - * the packet. With the way two rcv protocols hook into > - * one socket family type, we cannot (easily) distinguish > - * what kind of SKB it is during backlog rcv. > - */ > - if (sock_owned_by_user(sk) == 0) { > - /* We're no longer connect at the PPPOE layer, > - * and must wait for ppp channel to disconnect us. > - */ > - sk->sk_state = PPPOX_ZOMBIE; > - } > - > - bh_unlock_sock(sk); > if (!schedule_work(&po->proto.pppoe.padt_work)) > sock_put(sk); > } > Yes, with the introduction of pppoe_unbind_sock_work(), setting PPPOX_ZOMBIE shouldn't be required anymore. > Subsequently the PPPOX_ZOMBIE state can be completely removed? > Yes, this is the last place where PPPOX_ZOMBIE can be set. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c index 3837ae3..2ed7506 100644 --- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -313,7 +313,6 @@ static void pppoe_flush_dev(struct net_device *dev) if (po->pppoe_dev == dev && sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { pppox_unbind_sock(sk); - sk->sk_state = PPPOX_ZOMBIE; sk->sk_state_change(sk); po->pppoe_dev = NULL; dev_put(dev);