Message ID | 1510999287-1228-1-git-send-email-wangyunjian@huawei.com |
---|---|
State | Accepted |
Series | [ovs-dev] datapath: Fix kernel panic for uninitialized tun_dst of ovs_gso_cb. |
Can we review this patch? It's been sitting around in the mailing list for two
weeks without any reviews.

> -----Original Message-----
> From: wangyunjian
> Sent: Saturday, November 18, 2017 6:01 PM
> To: dev@openvswitch.org
> Cc: caihe <caihe@huawei.com>; gaoxiaoqiu <gaoxiaoqiu@huawei.com>;
> Lilijun (Jerry) <jerry.lilijun@huawei.com>; wangyunjian
> <wangyunjian@huawei.com>
> Subject: [ovs-dev] [PATCH] datapath: Fix kernel panic for uninitialized
> tun_dst of ovs_gso_cb.
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> The variable tun_dst in struct ovs_gso_cb isn't necessarily all-zeros which
> came from the Netlink layer. When deleting a netdev port and immediately
> adding a vxlan port, they may use the same port_no. So the variable tun_dst
> of struct ovs_gso_cb hasn't been set when the skb is sent to the vxlan port,
> and the panic will be triggered.
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
> IP: [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
> PGD 1f9f374067 PUD 1f9f375067 PMD 0
> Oops: 0000 [#1] SMP
> RIP: 0010:[<ffffffffa07954f4>] [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
> RSP: 0018:ffff881fff483898 EFLAGS: 00010202
> RAX: 0000000000000040 RBX: ffff881ff2d59f00 RCX: ffff881f742016b0
> RDX: 0000000000000001 RSI: ffff881f9f5f0000 RDI: ffff881ff2d59f00
> RBP: ffff881fff483898 R08: 000000000000002e R09: 0000000000000000
> R10: 0000000000000000 R11: ffff881fff483a50 R12: ffff881f74201680
> R13: 000000000000ffbe R14: 0000000000000000 R15: ffff881ff2d59f00
> FS: 00007f8b6f7fe700(0000) GS:ffff881fff480000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000052 CR3: 0000001f9f373000 CR4: 00000000000027e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> [<ffffffffa0786480>] ovs_vport_send+0xa0/0x180 [openvswitch]
> [<ffffffffa077414e>] do_output+0x4e/0xf0 [openvswitch]
> [<ffffffffa07758ae>] do_execute_actions+0xa6e/0xa90 [openvswitch]
> [<ffffffff815b654f>] ? netlink_unicast+0x16f/0x1b0
> [<ffffffff815732bb>] ? skb_zerocopy+0x1fb/0x380
> [<ffffffffa07847ca>] ? flow_lookup.isra.8+0x4a/0xc0 [openvswitch]
> [<ffffffffa0775b2d>] ovs_execute_actions+0x4d/0x140 [openvswitch]
> [<ffffffffa077c604>] ovs_dp_process_packet+0x94/0x140 [openvswitch]
> [<ffffffffa07762c4>] ? ovs_ct_update_key+0xc4/0x150 [openvswitch]
> [<ffffffffa078637b>] ovs_vport_receive+0x7b/0xe0 [openvswitch]
> [<ffffffffa077c604>] ? ovs_dp_process_packet+0x94/0x140 [openvswitch]
> [<ffffffff816062d6>] ? __fib_validate_source.isra.13+0x2b6/0x400
> [<ffffffff8158da15>] ? dst_init+0xe5/0xf0
> [<ffffffffa021a2af>] ? generic_packet+0x1f/0x30 [nf_conntrack]
> [<ffffffffa02160d0>] ? nf_conntrack_in+0x350/0x5f0 [nf_conntrack]
> [<ffffffffa0787047>] netdev_port_receive+0xa7/0x100 [openvswitch]
> [<ffffffffa07870be>] netdev_frame_hook+0x1e/0x30 [openvswitch]
> [<ffffffff81581a52>] __netif_receive_skb_core+0x1e2/0x800
> [<ffffffff81582088>] __netif_receive_skb+0x18/0x60
> [<ffffffff81582110>] netif_receive_skb_internal+0x40/0xc0
> [<ffffffff81583228>] napi_gro_receive+0xd8/0x130
> [<ffffffffa04ef634>] ixgbe_clean_rx_irq+0x7c4/0xa60 [ixgbe]
> [<ffffffffa04f0930>] ixgbe_poll+0x2e0/0x6c0 [ixgbe]
> [<ffffffff815828b0>] net_rx_action+0x170/0x380
> [<ffffffff81090b0f>] __do_softirq+0xef/0x280
> [<ffffffff816ac15c>] call_softirq+0x1c/0x30
> [<ffffffff8102e47d>] do_softirq+0x5d/0xb0
> [<ffffffff81090ebd>] irq_exit+0x12d/0x140
> [<ffffffff816accf8>] do_IRQ+0x58/0xf0
> [<ffffffff816a1ced>] common_interrupt+0x6d/0x6d
> <EOI>
>
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>  datapath/linux/compat/gso.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/datapath/linux/compat/gso.h b/datapath/linux/compat/gso.h
> index 2e9dbb3..2010940 100644
> --- a/datapath/linux/compat/gso.h
> +++ b/datapath/linux/compat/gso.h
> @@ -34,11 +34,16 @@ struct ovs_gso_cb {
>  static inline void skb_clear_ovs_gso_cb(struct sk_buff *skb)
>  {
>  	OVS_GSO_CB(skb)->fix_segment = NULL;
> +#ifndef USE_UPSTREAM_TUNNEL
> +	OVS_GSO_CB(skb)->tun_dst = NULL;
> +#endif
>  }
>  #else
>  static inline void skb_clear_ovs_gso_cb(struct sk_buff *skb)
>  {
> -
> +#ifndef USE_UPSTREAM_TUNNEL
> +	OVS_GSO_CB(skb)->tun_dst = NULL;
> +#endif
>  }
>  #endif
>
> --
> 1.8.3.1

I'm the wrong one to review it, I don't know the datapath well enough
anymore.

Perhaps Greg could take a look.

On Mon, Dec 04, 2017 at 01:36:52AM +0000, wangyunjian wrote:
> Can we review this patch? It's been sitting around in the mailing list for two
> weeks without any reviews.

On 12/3/2017 8:51 PM, Ben Pfaff wrote:
> I'm the wrong one to review it, I don't know the datapath well enough
> anymore.
>
> Perhaps Greg could take a look.

I'll have a look at it today.

Thanks,

- Greg

On 11/18/2017 2:01 AM, w00273186 wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> The variable tun_dst in struct ovs_gso_cb isn't necessarily all-zeros which
> came from the Netlink layer. [...]

I think this is the right thing to do and it passes compile and check-kmod
tests on both Ubuntu 16 and RHEL 7 hosts.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>

On Mon, Dec 04, 2017 at 09:00:56AM -0800, Gregory Rose wrote:
> I think this is the right thing to do and it passes compile and check-kmod
> tests on both Ubuntu 16 and RHEL 7 hosts.
>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> Reviewed-by: Greg Rose <gvrose8192@gmail.com>

Thanks, applied to master.

Let me know if I should backport this (and how far).

On 12/4/2017 9:45 AM, Ben Pfaff wrote:
> Thanks, applied to master.
>
> Let me know if I should backport this (and how far).

Looks to me like it should go all the way back to 2.6. Glad you asked!

- Greg

On Mon, Dec 04, 2017 at 10:14:31AM -0800, Gregory Rose wrote:
> Looks to me like it should go all the way back to 2.6. Glad you asked!

OK, done, thanks!

diff --git a/datapath/linux/compat/gso.h b/datapath/linux/compat/gso.h
index 2e9dbb3..2010940 100644
--- a/datapath/linux/compat/gso.h
+++ b/datapath/linux/compat/gso.h
@@ -34,11 +34,16 @@ struct ovs_gso_cb {
 static inline void skb_clear_ovs_gso_cb(struct sk_buff *skb)
 {
 	OVS_GSO_CB(skb)->fix_segment = NULL;
+#ifndef USE_UPSTREAM_TUNNEL
+	OVS_GSO_CB(skb)->tun_dst = NULL;
+#endif
 }
 #else
 static inline void skb_clear_ovs_gso_cb(struct sk_buff *skb)
 {
-
+#ifndef USE_UPSTREAM_TUNNEL
+	OVS_GSO_CB(skb)->tun_dst = NULL;
+#endif
 }
 #endif
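For readers who want to see why resetting tun_dst matters, here is a rough
userspace sketch of the failure mode the patch description reports. It is not
the OVS code: struct fake_skb, struct fake_gso_cb, struct fake_tun_dst and all
of the helper names are hypothetical stand-ins, and the 0xAB fill only models
"whatever an earlier user left in the control block". The point is just that
a helper which resets only fix_segment leaves a stale, non-NULL tun_dst for
the transmit path to find, while the patched helper does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the tunnel metadata that tun_dst points to. */
struct fake_tun_dst {
	int id;
};

/* Hypothetical, simplified model of struct ovs_gso_cb. */
struct fake_gso_cb {
	void *fix_segment;
	struct fake_tun_dst *tun_dst;
};

/* Hypothetical model of an skb: cb is scratch space that each layer reuses. */
struct fake_skb {
	union {
		unsigned char raw[48];
		struct fake_gso_cb gso;
	} cb;
};

/* An earlier owner of the buffer leaves arbitrary bytes in the scratch area. */
static void previous_layer_scribbles(struct fake_skb *skb)
{
	assert(skb != NULL);
	memset(skb->cb.raw, 0xAB, sizeof(skb->cb.raw));
}

/* Models the pre-patch helper: only fix_segment is reset. */
static void clear_cb_before_patch(struct fake_skb *skb)
{
	assert(skb != NULL);
	skb->cb.gso.fix_segment = NULL;
}

/* Models the patched helper: tun_dst is reset as well. */
static void clear_cb_after_patch(struct fake_skb *skb)
{
	assert(skb != NULL);
	skb->cb.gso.fix_segment = NULL;
	skb->cb.gso.tun_dst = NULL;
}

/* Returns 1 if a transmit path would see a non-NULL tun_dst, else 0. */
int xmit_sees_tunnel_pointer(int patched)
{
	struct fake_skb skb;

	previous_layer_scribbles(&skb);
	if (patched)
		clear_cb_after_patch(&skb);
	else
		clear_cb_before_patch(&skb);

	return skb.cb.gso.tun_dst != NULL;
}
```

Calling xmit_sees_tunnel_pointer(0) models the pre-patch helper and reports a
stale pointer; xmit_sees_tunnel_pointer(1) models the patched helper and
reports none. In the real datapath a stale pointer of this kind would be
dereferenced rather than merely tested, which matches the oops reported in
rpl_vxlan_xmit.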