Message ID | CAM_iQpXdyNeeojLfh0ZPMy1LQwOoQZfV7Az2=Q5WALM00Za9SA@mail.gmail.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Hi Wang Cong, With this patch , the system was crashed when setsockopt. The call trace as below: crash> bt PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main" #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82 #2 [ffff8801bec03e10] panic at ffffffff81650058 #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533 #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2 #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450 #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457 #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0 #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d --- <IRQ stack> --- #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d [exception RIP: lock_timer_base+77] RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX: 0000000000000001 RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8 RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: ffffea0002680000 R10: 000000000000003c R11: ffff8801b301fb2e R12: ffff8800afcc0000 R13: ffff8800afcc0000 R14: 0000000000000000 R15: ffffffff83d1a340 ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252 #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60 #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760 #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860 #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call) RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff RDX: 0000000000000005 RSI: 0000000000000107 RDI: 0000000000000180 RBP: 0000000000000180 R8: 000000000000001c R9: 00007fcc78dc7160 R10: 0000000001fd6ba0 R11: 0000000000000202 R12: 0000000000000000 R13: 0000000000000011 R14: 0000000001fd6b60 R15: 0000000001fd6b70 ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b Best Regards, liujian > -----Original Message----- > From: Cong Wang [mailto:xiyou.wangcong@gmail.com] > Sent: Sunday, July 23, 2017 1:59 PM > To: Dingtianhong > Cc: liujian (CE); Willem de Bruijn; Dave Jones; alexander.levin@verizon.com; > davem@davemloft.net; edumazet@google.com; willemb@google.com; > daniel@iogearbox.net; netdev@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong <dingtianhong@huawei.com> > wrote: > > Hi, Cong: > > > > Thanks for your quirk solution, but I still has some doubts about it, > > it looks like fix the problem in the > > packet_setsockopt->packet_set_ring processing, but when in > > packet_release processing, it may could not release the real pg_vec > > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss > > something here, nice to hear from your feedback. :) > > Yes you miss that packet_release() has memset()'s so we won't hit that path. :) > > However, I missed the swap() in this messy function, actually I believe the bug > is that we modify tpacket_kbdq_core inside rx_ring in non-closing case without > actually stopping its timer. I feel more confident with the following patch: > > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index > 008bb34ee324..267b181fef15 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, union > tpacket_req_u *req_u, > case TPACKET_V3: > /* Block transmit is not supported yet */ > if (!tx_ring) { > + prb_shutdown_retire_blk_timer(po, > + rb_queue); > init_prb_bdqc(po, rb, pg_vec, req_u); > } else { > struct tpacket_req3 *req3 = > &req_u->req3;
Hi, Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to TPACKET_V1 ? Best Regards, liujian > -----Original Message----- > From: liujian (CE) > Sent: Sunday, July 23, 2017 4:21 PM > To: 'Cong Wang'; Dingtianhong > Cc: Willem de Bruijn; Dave Jones; alexander.levin@verizon.com; > davem@davemloft.net; edumazet@google.com; willemb@google.com; > daniel@iogearbox.net; netdev@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired > > Hi Wang Cong, > > With this patch , the system was crashed when setsockopt. > > The call trace as below: > > crash> bt > PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main" > #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b > #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82 > #2 [ffff8801bec03e10] panic at ffffffff81650058 > #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533 > #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2 > #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450 > #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457 > #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0 > #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d > --- <IRQ stack> --- > #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d > [exception RIP: lock_timer_base+77] > RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246 > RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX: > 0000000000000001 > RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8 > RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: ffffea0002680000 > R10: 000000000000003c R11: ffff8801b301fb2e R12: ffff8800afcc0000 > R13: ffff8800afcc0000 R14: 0000000000000000 R15: ffffffff83d1a340 > ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 > #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f > #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252 > #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60 > #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760 > #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860 > #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call) > RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202 > RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff > RDX: 0000000000000005 RSI: 0000000000000107 RDI: > 0000000000000180 > RBP: 0000000000000180 R8: 000000000000001c R9: > 00007fcc78dc7160 > R10: 0000000001fd6ba0 R11: 0000000000000202 R12: > 0000000000000000 > R13: 0000000000000011 R14: 0000000001fd6b60 R15: > 0000000001fd6b70 > ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b > > > Best Regards, > liujian > > > > -----Original Message----- > > From: Cong Wang [mailto:xiyou.wangcong@gmail.com] > > Sent: Sunday, July 23, 2017 1:59 PM > > To: Dingtianhong > > Cc: liujian (CE); Willem de Bruijn; Dave Jones; > > alexander.levin@verizon.com; davem@davemloft.net; > edumazet@google.com; > > willemb@google.com; daniel@iogearbox.net; netdev@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: Re: af_packet: use after free in > > prb_retire_rx_blk_timer_expired > > > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong > > <dingtianhong@huawei.com> > > wrote: > > > Hi, Cong: > > > > > > Thanks for your quirk solution, but I still has some doubts about > > > it, it looks like fix the problem in the > > > packet_setsockopt->packet_set_ring processing, but when in > > > packet_release processing, it may could not release the real pg_vec > > > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss > > > something here, nice to hear from your feedback. :) > > > > Yes you miss that packet_release() has memset()'s so we won't hit that > > path. :) > > > > However, I missed the swap() in this messy function, actually I > > believe the bug is that we modify tpacket_kbdq_core inside rx_ring in > > non-closing case without actually stopping its timer. I feel more confident > with the following patch: > > > > > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index > > 008bb34ee324..267b181fef15 100644 > > --- a/net/packet/af_packet.c > > +++ b/net/packet/af_packet.c > > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, > > union tpacket_req_u *req_u, > > case TPACKET_V3: > > /* Block transmit is not supported yet */ > > if (!tx_ring) { > > + prb_shutdown_retire_blk_timer(po, > > + rb_queue); > > init_prb_bdqc(po, rb, pg_vec, req_u); > > } else { > > struct tpacket_req3 *req3 = > > &req_u->req3;
Hi I find it caused by below steps: 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1 2. set tp_block_nr to 0 Then pg_vec was freed, and we did not delete the timer? Best Regards, liujian > -----Original Message----- > From: liujian (CE) > Sent: Sunday, July 23, 2017 5:47 PM > To: liujian (CE); 'Cong Wang'; Dingtianhong > Cc: 'Willem de Bruijn'; 'Dave Jones'; 'alexander.levin@verizon.com'; > 'davem@davemloft.net'; 'edumazet@google.com'; 'willemb@google.com'; > 'daniel@iogearbox.net'; 'netdev@vger.kernel.org'; > 'linux-kernel@vger.kernel.org' > Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired > > Hi, > > Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to > TPACKET_V1 ? > > > Best Regards, > liujian > > > > -----Original Message----- > > From: liujian (CE) > > Sent: Sunday, July 23, 2017 4:21 PM > > To: 'Cong Wang'; Dingtianhong > > Cc: Willem de Bruijn; Dave Jones; alexander.levin@verizon.com; > > davem@davemloft.net; edumazet@google.com; willemb@google.com; > > daniel@iogearbox.net; netdev@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: RE: af_packet: use after free in > > prb_retire_rx_blk_timer_expired > > > > Hi Wang Cong, > > > > With this patch , the system was crashed when setsockopt. > > > > The call trace as below: > > > > crash> bt > > PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main" > > #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b > > #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82 > > #2 [ffff8801bec03e10] panic at ffffffff81650058 > > #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533 > > #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2 > > #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450 > > #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457 > > #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0 > > #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d > > --- <IRQ stack> --- > > #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d > > [exception RIP: lock_timer_base+77] > > RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246 > > RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX: > > 0000000000000001 > > RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8 > > RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: > ffffea0002680000 > > R10: 000000000000003c R11: ffff8801b301fb2e R12: > ffff8800afcc0000 > > R13: ffff8800afcc0000 R14: 0000000000000000 R15: > ffffffff83d1a340 > > ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 > > #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f > > #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252 > > #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60 > > #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760 > > #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860 > > #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call) > > RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202 > > RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff > > RDX: 0000000000000005 RSI: 0000000000000107 RDI: > > 0000000000000180 > > RBP: 0000000000000180 R8: 000000000000001c R9: > > 00007fcc78dc7160 > > R10: 0000000001fd6ba0 R11: 0000000000000202 R12: > > 0000000000000000 > > R13: 0000000000000011 R14: 0000000001fd6b60 R15: > > 0000000001fd6b70 > > ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b > > > > > > Best Regards, > > liujian > > > > > > > -----Original Message----- > > > From: Cong Wang [mailto:xiyou.wangcong@gmail.com] > > > Sent: Sunday, July 23, 2017 1:59 PM > > > To: Dingtianhong > > > Cc: liujian (CE); Willem de Bruijn; Dave Jones; > > > alexander.levin@verizon.com; davem@davemloft.net; > > edumazet@google.com; > > > willemb@google.com; daniel@iogearbox.net; netdev@vger.kernel.org; > > > linux-kernel@vger.kernel.org > > > Subject: Re: af_packet: use after free in > > > prb_retire_rx_blk_timer_expired > > > > > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong > > > <dingtianhong@huawei.com> > > > wrote: > > > > Hi, Cong: > > > > > > > > Thanks for your quirk solution, but I still has some doubts about > > > > it, it looks like fix the problem in the > > > > packet_setsockopt->packet_set_ring processing, but when in > > > > packet_release processing, it may could not release the real > > > > pg_vec for the TPACKET_V3 ring, and then cause the mem leak, maybe > > > > I miss something here, nice to hear from your feedback. :) > > > > > > Yes you miss that packet_release() has memset()'s so we won't hit > > > that path. :) > > > > > > However, I missed the swap() in this messy function, actually I > > > believe the bug is that we modify tpacket_kbdq_core inside rx_ring > > > in non-closing case without actually stopping its timer. I feel more > > > confident > > with the following patch: > > > > > > > > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index > > > 008bb34ee324..267b181fef15 100644 > > > --- a/net/packet/af_packet.c > > > +++ b/net/packet/af_packet.c > > > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, > > > union tpacket_req_u *req_u, > > > case TPACKET_V3: > > > /* Block transmit is not supported yet */ > > > if (!tx_ring) { > > > + prb_shutdown_retire_blk_timer(po, > > > + rb_queue); > > > init_prb_bdqc(po, rb, pg_vec, > req_u); > > > } else { > > > struct tpacket_req3 *req3 = > > > &req_u->req3;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 008bb34ee324..267b181fef15 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, case TPACKET_V3: /* Block transmit is not supported yet */ if (!tx_ring) { + prb_shutdown_retire_blk_timer(po, rb_queue); init_prb_bdqc(po, rb, pg_vec, req_u); } else {