Message ID | 1441061746-1492-1-git-send-email-pshelar@nicira.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: > VXLAN device can receive skb with checksum partial. But the checksum > offset could be in outer header which is pulled on receive. Such skb > can cause the panic when checksum is calculated on skb. Following patch > fixes the bug by setting checksum unnecessary while pulling outer header. > > ---8<--- > [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 > [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 > [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 > ... > ... > [ 13.852308] Call Trace: > [ 13.852308] <IRQ> > [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] > [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] > [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] > [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] > [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] > [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] > [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] > [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 > [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 > [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 > [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 > [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 > [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 > [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 > [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 > [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 > [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 > [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 > [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 > [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 > [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 > [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 > [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 > [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e > [ 13.852308] <EOI> > [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b > > Reported-by: Anupam Chanda <achanda@vmware.com> > Signed-off-by: Pravin B Shelar <pshelar@nicira.com> > --- > include/linux/skbuff.h | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 9b88536..6238e9f 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, > { > if (skb->ip_summed == CHECKSUM_COMPLETE) > skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); > + else if (skb->ip_summed == CHECKSUM_PARTIAL && > + skb_checksum_start_offset(skb) <= len) > + skb->ip_summed = CHECKSUM_UNNECESSARY; No, this isn't right. We should never be converting CHECKSUM_PARTIAL into CHECKSUM_UNNECESSARY. > } > > unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len); > -- > 1.7.1 > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 03:55:46PM -0700, Pravin B Shelar wrote: > VXLAN device can receive skb with checksum partial. But the checksum > offset could be in outer header which is pulled on receive. Such skb > can cause the panic when checksum is calculated on skb. Following patch > fixes the bug by setting checksum unnecessary while pulling outer header. > > ---8<--- > [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 > [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 > [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 > ... > ... > [ 13.852308] Call Trace: > [ 13.852308] <IRQ> > [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] > [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] > [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] > [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] that doesn't look like upstream kernel. The above two functions don't exist in net-next. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 4:44 PM, Tom Herbert <tom@herbertland.com> wrote: > On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >> VXLAN device can receive skb with checksum partial. But the checksum >> offset could be in outer header which is pulled on receive. Such skb >> can cause the panic when checksum is calculated on skb. Following patch >> fixes the bug by setting checksum unnecessary while pulling outer header. >> >> ---8<--- >> [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 >> [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 >> [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 >> ... >> ... >> [ 13.852308] Call Trace: >> [ 13.852308] <IRQ> >> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] >> [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] >> [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] >> [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] >> [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] >> [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] >> [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] >> [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 >> [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 >> [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 >> [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 >> [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 >> [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 >> [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 >> [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 >> [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 >> [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 >> [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 >> [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 >> [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 >> [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 >> [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 >> [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 >> [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e >> [ 13.852308] <EOI> >> [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b >> >> Reported-by: Anupam Chanda <achanda@vmware.com> >> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> >> --- >> include/linux/skbuff.h | 3 +++ >> 1 files changed, 3 insertions(+), 0 deletions(-) >> >> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h >> index 9b88536..6238e9f 100644 >> --- a/include/linux/skbuff.h >> +++ b/include/linux/skbuff.h >> @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, >> { >> if (skb->ip_summed == CHECKSUM_COMPLETE) >> skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); >> + else if (skb->ip_summed == CHECKSUM_PARTIAL && >> + skb_checksum_start_offset(skb) <= len) >> + skb->ip_summed = CHECKSUM_UNNECESSARY; > > No, this isn't right. We should never be converting CHECKSUM_PARTIAL > into CHECKSUM_UNNECESSARY. > But checksum is valid for inner packet. So I do not see any other appropriate checksum flag here. Can you suggest another flag which is better suited? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 6:40 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Mon, Aug 31, 2015 at 03:55:46PM -0700, Pravin B Shelar wrote: >> VXLAN device can receive skb with checksum partial. But the checksum >> offset could be in outer header which is pulled on receive. Such skb >> can cause the panic when checksum is calculated on skb. Following patch >> fixes the bug by setting checksum unnecessary while pulling outer header. >> >> ---8<--- >> [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 >> [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 >> [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 >> ... >> ... >> [ 13.852308] Call Trace: >> [ 13.852308] <IRQ> >> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] >> [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] >> [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] >> [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] > > that doesn't look like upstream kernel. > The above two functions don't exist in net-next. > Bug was reported on older kernel. But the issue exist on net-next kernel. For example when vxlan checksum is turned on. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 8:25 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Mon, Aug 31, 2015 at 4:44 PM, Tom Herbert <tom@herbertland.com> wrote: >> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>> VXLAN device can receive skb with checksum partial. But the checksum >>> offset could be in outer header which is pulled on receive. Such skb >>> can cause the panic when checksum is calculated on skb. Following patch >>> fixes the bug by setting checksum unnecessary while pulling outer header. >>> >>> ---8<--- >>> [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 >>> [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 >>> [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 >>> ... >>> ... >>> [ 13.852308] Call Trace: >>> [ 13.852308] <IRQ> >>> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] >>> [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] >>> [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] >>> [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] >>> [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] >>> [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] >>> [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] >>> [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 >>> [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 >>> [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 >>> [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 >>> [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 >>> [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 >>> [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 >>> [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 >>> [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 >>> [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 >>> [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 >>> [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 >>> [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 >>> [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 >>> [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 >>> [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 >>> [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e >>> [ 13.852308] <EOI> >>> [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b >>> >>> Reported-by: Anupam Chanda <achanda@vmware.com> >>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> >>> --- >>> include/linux/skbuff.h | 3 +++ >>> 1 files changed, 3 insertions(+), 0 deletions(-) >>> >>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h >>> index 9b88536..6238e9f 100644 >>> --- a/include/linux/skbuff.h >>> +++ b/include/linux/skbuff.h >>> @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, >>> { >>> if (skb->ip_summed == CHECKSUM_COMPLETE) >>> skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); >>> + else if (skb->ip_summed == CHECKSUM_PARTIAL && >>> + skb_checksum_start_offset(skb) <= len) >>> + skb->ip_summed = CHECKSUM_UNNECESSARY; >> >> No, this isn't right. We should never be converting CHECKSUM_PARTIAL >> into CHECKSUM_UNNECESSARY. >> > > But checksum is valid for inner packet. So I do not see any other > appropriate checksum flag here. Can you suggest another flag which is > better suited? You don't need to do any conversion. skb_checksum_unnecessary checks CHECKSUM_PARTIAL values. If the checksum offset being checked goes past CHECKSUM_PARTIAL, then skb_checksum_unnecessary will fail and checksum complete will be called to get CHECKSUM_COMPLETE. I'm not sure why openvswitch is even calling skb_checksum_help in the first place, the stack should be able to handle CHECKSUM_PARTIAL. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: > VXLAN device can receive skb with checksum partial. But the checksum > offset could be in outer header which is pulled on receive. Such skb > can cause the panic when checksum is calculated on skb. Following patch > fixes the bug by setting checksum unnecessary while pulling outer header. > Okay, I think I understand what you are doing. I suggest in the openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers to the outer headers which must have been verified at this point then set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner header is not correct in this case. If the CHECKSUM_PARTIAL refers to the inner header then you can call skb_checksum_help to resolve an inner checksum. > ---8<--- > [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] +0x144/0x150 > [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 > [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 > ... > ... > [ 13.852308] Call Trace: > [ 13.852308] <IRQ> > [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] > [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] > [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] > [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] > [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] > [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] > [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] > [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 > [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 > [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 > [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 > [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 > [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 > [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 > [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 > [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 > [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 > [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 > [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 > [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 > [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 > [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 > [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 > [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e > [ 13.852308] <EOI> > [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b > > Reported-by: Anupam Chanda <achanda@vmware.com> > Signed-off-by: Pravin B Shelar <pshelar@nicira.com> > --- > include/linux/skbuff.h | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 9b88536..6238e9f 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, > { > if (skb->ip_summed == CHECKSUM_COMPLETE) > skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); > + else if (skb->ip_summed == CHECKSUM_PARTIAL && > + skb_checksum_start_offset(skb) <= len) > + skb->ip_summed = CHECKSUM_UNNECESSARY; > } > > unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len); > -- > 1.7.1 > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: > On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >> VXLAN device can receive skb with checksum partial. But the checksum >> offset could be in outer header which is pulled on receive. Such skb >> can cause the panic when checksum is calculated on skb. Following patch >> fixes the bug by setting checksum unnecessary while pulling outer header. >> > Okay, I think I understand what you are doing. I suggest in the > openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers > to the outer headers which must have been verified at this point then > set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner > header is not correct in this case. If the CHECKSUM_PARTIAL refers to > the inner header then you can call skb_checksum_help to resolve an > inner checksum. > That would be OVS specific fix, But I do see skb_checksum_help() called in multiple places outside OVS that could result in similar kernel panic. Therefore I want to solve it up in networking stack rather than in OVS. >> ---8<--- >> [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] +0x144/0x150 >> [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 >> [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 >> ... >> ... >> [ 13.852308] Call Trace: >> [ 13.852308] <IRQ> >> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] >> [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] >> [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] >> [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] >> [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] >> [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] >> [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] >> [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 >> [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 >> [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 >> [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 >> [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 >> [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 >> [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 >> [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 >> [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 >> [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 >> [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 >> [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 >> [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 >> [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 >> [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 >> [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 >> [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e >> [ 13.852308] <EOI> >> [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b >> >> Reported-by: Anupam Chanda <achanda@vmware.com> >> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> >> --- >> include/linux/skbuff.h | 3 +++ >> 1 files changed, 3 insertions(+), 0 deletions(-) >> >> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h >> index 9b88536..6238e9f 100644 >> --- a/include/linux/skbuff.h >> +++ b/include/linux/skbuff.h >> @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, >> { >> if (skb->ip_summed == CHECKSUM_COMPLETE) >> skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); >> + else if (skb->ip_summed == CHECKSUM_PARTIAL && >> + skb_checksum_start_offset(skb) <= len) >> + skb->ip_summed = CHECKSUM_UNNECESSARY; >> } >> >> unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len); >> -- >> 1.7.1 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>> VXLAN device can receive skb with checksum partial. But the checksum >>> offset could be in outer header which is pulled on receive. Such skb >>> can cause the panic when checksum is calculated on skb. Following patch >>> fixes the bug by setting checksum unnecessary while pulling outer header. >>> >> Okay, I think I understand what you are doing. I suggest in the >> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >> to the outer headers which must have been verified at this point then >> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >> the inner header then you can call skb_checksum_help to resolve an >> inner checksum. >> > > That would be OVS specific fix, But I do see skb_checksum_help() > called in multiple places outside OVS that could result in similar > kernel panic. Therefore I want to solve it up in networking stack > rather than in OVS. > Please try to reproduce this out of OVS from the top of the tree then and report down exactly where panic is occurring the code. Unlike most of the of the other cases where skb_checksum_help() is being called this in the RX path so skb is probably not pulled over the checksum offset for those. Even so, if the skb is pulled beyond the checksum offset then this should result in a negative offset in skb_checksum_start_offset(skb) which should be okay. It looks like this in itself should not be causing your panic. Tom >>> ---8<--- >>> [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] +0x144/0x150 >>> [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 >>> [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 >>> ... >>> ... >>> [ 13.852308] Call Trace: >>> [ 13.852308] <IRQ> >>> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] >>> [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] >>> [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] >>> [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] >>> [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] >>> [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] >>> [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] >>> [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 >>> [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 >>> [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 >>> [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 >>> [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 >>> [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 >>> [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 >>> [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 >>> [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 >>> [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 >>> [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 >>> [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 >>> [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 >>> [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 >>> [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 >>> [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 >>> [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e >>> [ 13.852308] <EOI> >>> [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b >>> >>> Reported-by: Anupam Chanda <achanda@vmware.com> >>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> >>> --- >>> include/linux/skbuff.h | 3 +++ >>> 1 files changed, 3 insertions(+), 0 deletions(-) >>> >>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h >>> index 9b88536..6238e9f 100644 >>> --- a/include/linux/skbuff.h >>> +++ b/include/linux/skbuff.h >>> @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, >>> { >>> if (skb->ip_summed == CHECKSUM_COMPLETE) >>> skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); >>> + else if (skb->ip_summed == CHECKSUM_PARTIAL && >>> + skb_checksum_start_offset(skb) <= len) >>> + skb->ip_summed = CHECKSUM_UNNECESSARY; >>> } >>> >>> unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len); >>> -- >>> 1.7.1 >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe netdev" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 1, 2015 at 7:19 AM, Tom Herbert <tom@herbertland.com> wrote: > On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >>> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>>> VXLAN device can receive skb with checksum partial. But the checksum >>>> offset could be in outer header which is pulled on receive. Such skb >>>> can cause the panic when checksum is calculated on skb. Following patch >>>> fixes the bug by setting checksum unnecessary while pulling outer header. >>>> >>> Okay, I think I understand what you are doing. I suggest in the >>> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >>> to the outer headers which must have been verified at this point then >>> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >>> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >>> the inner header then you can call skb_checksum_help to resolve an >>> inner checksum. >>> >> >> That would be OVS specific fix, But I do see skb_checksum_help() >> called in multiple places outside OVS that could result in similar >> kernel panic. Therefore I want to solve it up in networking stack >> rather than in OVS. >> > Please try to reproduce this out of OVS from the top of the tree then > and report down exactly where panic is occurring the code. Unlike most > of the of the other cases where skb_checksum_help() is being called > this in the RX path so skb is probably not pulled over the checksum > offset for those. Even so, if the skb is pulled beyond the checksum > offset then this should result in a negative offset in > skb_checksum_start_offset(skb) which should be okay. It looks like > this in itself should not be causing your panic. > ip_do_fragment() also calls skb_checksum_help() that can results in similar panic. But it is not easy to reproduce it in this case due to call site is in exception path. The negative checksum offset can atleast cause assert failure in skb_checksum_help(). I will send patch to fix that. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 1, 2015 at 12:20 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Tue, Sep 1, 2015 at 7:19 AM, Tom Herbert <tom@herbertland.com> wrote: >> On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>> On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >>>> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>>>> VXLAN device can receive skb with checksum partial. But the checksum >>>>> offset could be in outer header which is pulled on receive. Such skb >>>>> can cause the panic when checksum is calculated on skb. Following patch >>>>> fixes the bug by setting checksum unnecessary while pulling outer header. >>>>> >>>> Okay, I think I understand what you are doing. I suggest in the >>>> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >>>> to the outer headers which must have been verified at this point then >>>> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >>>> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >>>> the inner header then you can call skb_checksum_help to resolve an >>>> inner checksum. >>>> >>> >>> That would be OVS specific fix, But I do see skb_checksum_help() >>> called in multiple places outside OVS that could result in similar >>> kernel panic. Therefore I want to solve it up in networking stack >>> rather than in OVS. >>> >> Please try to reproduce this out of OVS from the top of the tree then >> and report down exactly where panic is occurring the code. Unlike most >> of the of the other cases where skb_checksum_help() is being called >> this in the RX path so skb is probably not pulled over the checksum >> offset for those. Even so, if the skb is pulled beyond the checksum >> offset then this should result in a negative offset in >> skb_checksum_start_offset(skb) which should be okay. It looks like >> this in itself should not be causing your panic. >> > > ip_do_fragment() also calls skb_checksum_help() that can results in > similar panic. But it is not easy to reproduce it in this case due to > call site is in exception path. > The negative checksum offset can atleast cause assert failure in > skb_checksum_help(). I will send patch to fix that. Which BUG_ON do you see is hitting? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 1, 2015 at 12:55 PM, Tom Herbert <tom@herbertland.com> wrote: > On Tue, Sep 1, 2015 at 12:20 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Tue, Sep 1, 2015 at 7:19 AM, Tom Herbert <tom@herbertland.com> wrote: >>> On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>> On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >>>>> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>>>>> VXLAN device can receive skb with checksum partial. But the checksum >>>>>> offset could be in outer header which is pulled on receive. Such skb >>>>>> can cause the panic when checksum is calculated on skb. Following patch >>>>>> fixes the bug by setting checksum unnecessary while pulling outer header. >>>>>> >>>>> Okay, I think I understand what you are doing. I suggest in the >>>>> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >>>>> to the outer headers which must have been verified at this point then >>>>> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >>>>> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >>>>> the inner header then you can call skb_checksum_help to resolve an >>>>> inner checksum. >>>>> >>>> >>>> That would be OVS specific fix, But I do see skb_checksum_help() >>>> called in multiple places outside OVS that could result in similar >>>> kernel panic. Therefore I want to solve it up in networking stack >>>> rather than in OVS. >>>> >>> Please try to reproduce this out of OVS from the top of the tree then >>> and report down exactly where panic is occurring the code. Unlike most >>> of the of the other cases where skb_checksum_help() is being called >>> this in the RX path so skb is probably not pulled over the checksum >>> offset for those. Even so, if the skb is pulled beyond the checksum >>> offset then this should result in a negative offset in >>> skb_checksum_start_offset(skb) which should be okay. It looks like >>> this in itself should not be causing your panic. >>> >> >> ip_do_fragment() also calls skb_checksum_help() that can results in >> similar panic. But it is not easy to reproduce it in this case due to >> call site is in exception path. >> The negative checksum offset can atleast cause assert failure in >> skb_checksum_help(). I will send patch to fix that. > > Which BUG_ON do you see is hitting? BUG_ON(offset >= skb_headlen(skb)); where skb_headlen() returns unsigned int, therefore negative offset cast to unsigned int. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 1, 2015 at 2:10 PM, Pravin Shelar <pshelar@nicira.com> wrote: > On Tue, Sep 1, 2015 at 12:55 PM, Tom Herbert <tom@herbertland.com> wrote: >> On Tue, Sep 1, 2015 at 12:20 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>> On Tue, Sep 1, 2015 at 7:19 AM, Tom Herbert <tom@herbertland.com> wrote: >>>> On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>> On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >>>>>> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>>>>>> VXLAN device can receive skb with checksum partial. But the checksum >>>>>>> offset could be in outer header which is pulled on receive. Such skb >>>>>>> can cause the panic when checksum is calculated on skb. Following patch >>>>>>> fixes the bug by setting checksum unnecessary while pulling outer header. >>>>>>> >>>>>> Okay, I think I understand what you are doing. I suggest in the >>>>>> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >>>>>> to the outer headers which must have been verified at this point then >>>>>> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >>>>>> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >>>>>> the inner header then you can call skb_checksum_help to resolve an >>>>>> inner checksum. >>>>>> >>>>> >>>>> That would be OVS specific fix, But I do see skb_checksum_help() >>>>> called in multiple places outside OVS that could result in similar >>>>> kernel panic. Therefore I want to solve it up in networking stack >>>>> rather than in OVS. >>>>> >>>> Please try to reproduce this out of OVS from the top of the tree then >>>> and report down exactly where panic is occurring the code. Unlike most >>>> of the of the other cases where skb_checksum_help() is being called >>>> this in the RX path so skb is probably not pulled over the checksum >>>> offset for those. Even so, if the skb is pulled beyond the checksum >>>> offset then this should result in a negative offset in >>>> skb_checksum_start_offset(skb) which should be okay. It looks like >>>> this in itself should not be causing your panic. >>>> >>> >>> ip_do_fragment() also calls skb_checksum_help() that can results in >>> similar panic. But it is not easy to reproduce it in this case due to >>> call site is in exception path. >>> The negative checksum offset can atleast cause assert failure in >>> skb_checksum_help(). I will send patch to fix that. >> >> Which BUG_ON do you see is hitting? > > BUG_ON(offset >= skb_headlen(skb)); > where skb_headlen() returns unsigned int, therefore negative offset > cast to unsigned int. Thanks, looks like skb_headlen() needs to be case to int for this comparison. Does this fix your issue? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 1, 2015 at 2:25 PM, Tom Herbert <tom@herbertland.com> wrote: > On Tue, Sep 1, 2015 at 2:10 PM, Pravin Shelar <pshelar@nicira.com> wrote: >> On Tue, Sep 1, 2015 at 12:55 PM, Tom Herbert <tom@herbertland.com> wrote: >>> On Tue, Sep 1, 2015 at 12:20 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>> On Tue, Sep 1, 2015 at 7:19 AM, Tom Herbert <tom@herbertland.com> wrote: >>>>> On Mon, Aug 31, 2015 at 10:15 PM, Pravin Shelar <pshelar@nicira.com> wrote: >>>>>> On Mon, Aug 31, 2015 at 9:12 PM, Tom Herbert <tom@herbertland.com> wrote: >>>>>>> On Mon, Aug 31, 2015 at 3:55 PM, Pravin B Shelar <pshelar@nicira.com> wrote: >>>>>>>> VXLAN device can receive skb with checksum partial. But the checksum >>>>>>>> offset could be in outer header which is pulled on receive. Such skb >>>>>>>> can cause the panic when checksum is calculated on skb. Following patch >>>>>>>> fixes the bug by setting checksum unnecessary while pulling outer header. >>>>>>>> >>>>>>> Okay, I think I understand what you are doing. I suggest in the >>>>>>> openvswitch path, if there is a checksum CHECKSUM_PARTIAL that refers >>>>>>> to the outer headers which must have been verified at this point then >>>>>>> set to CHECKSUM_NONE-- assuming CHECKSUM_UNNECESSARY on the inner >>>>>>> header is not correct in this case. If the CHECKSUM_PARTIAL refers to >>>>>>> the inner header then you can call skb_checksum_help to resolve an >>>>>>> inner checksum. >>>>>>> >>>>>> >>>>>> That would be OVS specific fix, But I do see skb_checksum_help() >>>>>> called in multiple places outside OVS that could result in similar >>>>>> kernel panic. Therefore I want to solve it up in networking stack >>>>>> rather than in OVS. >>>>>> >>>>> Please try to reproduce this out of OVS from the top of the tree then >>>>> and report down exactly where panic is occurring the code. Unlike most >>>>> of the of the other cases where skb_checksum_help() is being called >>>>> this in the RX path so skb is probably not pulled over the checksum >>>>> offset for those. Even so, if the skb is pulled beyond the checksum >>>>> offset then this should result in a negative offset in >>>>> skb_checksum_start_offset(skb) which should be okay. It looks like >>>>> this in itself should not be causing your panic. >>>>> >>>> >>>> ip_do_fragment() also calls skb_checksum_help() that can results in >>>> similar panic. But it is not easy to reproduce it in this case due to >>>> call site is in exception path. >>>> The negative checksum offset can atleast cause assert failure in >>>> skb_checksum_help(). I will send patch to fix that. >>> >>> Which BUG_ON do you see is hitting? >> >> BUG_ON(offset >= skb_headlen(skb)); >> where skb_headlen() returns unsigned int, therefore negative offset >> cast to unsigned int. > > Thanks, looks like skb_headlen() needs to be case to int for this > comparison. Does this fix your issue? There are other function which takes len as unsigned int parameter, Need to fix those too. But even after this fix it would unnecessarily calculate checksum on packet. I want to avoid it. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 9b88536..6238e9f 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2601,6 +2601,9 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb, { if (skb->ip_summed == CHECKSUM_COMPLETE) skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); + else if (skb->ip_summed == CHECKSUM_PARTIAL && + skb_checksum_start_offset(skb) <= len) + skb->ip_summed = CHECKSUM_UNNECESSARY; } unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len);
VXLAN device can receive skb with checksum partial. But the checksum offset could be in outer header which is pulled on receive. Such skb can cause the panic when checksum is calculated on skb. Following patch fixes the bug by setting checksum unnecessary while pulling outer header. ---8<--- [ 13.800141] RIP: 0010:[<ffffffff81518034>] [<ffffffff81518034>] skb_checksum_help+0x144/0x150 [ 13.800141] RSP: 0000:ffff88011fd83940 EFLAGS: 00010292 [ 13.800141] RAX: 0000000000000042 RBX: ffff880114dd56c0 RCX: ffff8801188d9580 ... ... [ 13.852308] Call Trace: [ 13.852308] <IRQ> [ 13.852308] [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch] [ 13.852308] [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch] [ 13.852308] [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch] [ 13.852308] [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch] [ 13.852308] [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch] [ 13.852308] [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch] [ 13.852308] [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch] [ 13.852308] [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0 [ 13.852308] [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0 [ 13.852308] [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20 [ 13.852308] [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280 [ 13.852308] [<ffffffff81550128>] ip_local_deliver+0x88/0x90 [ 13.852308] [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370 [ 13.852308] [<ffffffff81550365>] ip_rcv+0x235/0x300 [ 13.852308] [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620 [ 13.852308] [<ffffffff8151c360>] netif_receive_skb+0x80/0x90 [ 13.852308] [<ffffffff81459935>] virtnet_poll+0x555/0x6f0 [ 13.852308] [<ffffffff8151cd04>] net_rx_action+0x134/0x290 [ 13.852308] [<ffffffff810683d8>] __do_softirq+0xa8/0x210 [ 13.852308] [<ffffffff8162fe6c>] call_softirq+0x1c/0x30 [ 13.852308] [<ffffffff810161a5>] do_softirq+0x65/0xa0 [ 13.852308] [<ffffffff810687be>] irq_exit+0x8e/0xb0 [ 13.852308] [<ffffffff81630733>] do_IRQ+0x63/0xe0 [ 13.852308] [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e [ 13.852308] <EOI> [ 13.852308] [<ffffffff8162dc02>] ? system_call_fastpath+0x16/0x1b Reported-by: Anupam Chanda <achanda@vmware.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> --- include/linux/skbuff.h | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-)