Message ID | 1538756394-46410-1-git-send-email-yihung.wei@gmail.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] ofp-packet: Fix NXT_RESUME with geneve tunnel metadata | expand |
On Fri, Oct 05, 2018 at 09:19:54AM -0700, Yi-Hung Wei wrote: > The patch address vswitchd crash when it receives NXT_RESUME with geneve > tunnel metadata. The crash is due to segmentation fault with the > following stack trace, and it is observed only in kernel datapath. > A test is added to prevent regression. > > Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault. > 0 0x00007fcffd0c5412 in tun_metadata_to_geneve__ (flow=flow@entry=0x7ffcb7106680, b=b@entry=0x7ffcb70eb5a8, crit_opt=crit_opt@entry=0x7ffcb70eb287) > at lib/tun-metadata.c:676 > 1 0x00007fcffd0c6858 in tun_metadata_to_geneve_nlattr_flow (b=0x7ffcb70eb5a8, flow=0x7ffcb7106638) at lib/tun-metadata.c:706 > 2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffcb7106638, flow=flow@entry=0x7ffcb7106638, key=key@entry=0x0, b=b@entry=0x7ffcb70eb5a8) > at lib/tun-metadata.c:810 > 3 0x00007fcffd048464 in tun_key_to_attr (a=a@entry=0x7ffcb70eb5a8, tun_key=tun_key@entry=0x7ffcb7106638, tun_flow_key=tun_flow_key@entry=0x7ffcb7106638, > key_buf=key_buf@entry=0x0, tnl_type=<optimized out>, tnl_type@entry=0x0) at lib/odp-util.c:2886 > 4 0x00007fcffd0551cf in odp_key_from_dp_packet (buf=buf@entry=0x7ffcb70eb5a8, packet=0x7ffcb7106590) at lib/odp-util.c:5909 > 5 0x00007fcffd0d7870 in dpif_netlink_encode_execute (buf=0x7ffcb70eb5a8, d_exec=0x7ffcb7106428, dp_ifindex=<optimized out>) at lib/dpif-netlink.c:1873 > 6 dpif_netlink_operate__ (dpif=dpif@entry=0xe65e00, ops=ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif-netlink.c:1959 > 7 0x00007fcffd0d842e in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffcb7106418, dpif=<optimized out>) at lib/dpif-netlink.c:2258 > 8 dpif_netlink_operate (dpif_=0xe65e00, ops=<optimized out>, n_ops=<optimized out>) at lib/dpif-netlink.c:2294 > 9 0x00007fcffd014680 in dpif_operate (dpif=<optimized out>, ops=<optimized out>, ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif.c:1359 > 10 0x00007fcffd014c58 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffcb71064e0) at lib/dpif.c:1324 > 11 0x00007fcffd40d3e6 in nxt_resume (ofproto_=0xe6af50, pin=0x7ffcb7107150) at ofproto/ofproto-dpif.c:4885 > 12 0x00007fcffd3f88c3 in handle_nxt_resume (ofconn=ofconn@entry=0xf8c8f0, oh=oh@entry=0xf7ebd0) at ofproto/ofproto.c:3612 > 13 0x00007fcffd404a3b in handle_openflow__ (msg=0xeac460, ofconn=0xf8c8f0) at ofproto/ofproto.c:8137 > 14 handle_openflow (ofconn=0xf8c8f0, ofp_msg=0xeac460) at ofproto/ofproto.c:8258 > 15 0x00007fcffd3f4653 in ofconn_run (handle_openflow=0x7fcffd4046f0 <handle_openflow>, ofconn=0xf8c8f0) at ofproto/connmgr.c:1432 > 16 connmgr_run (mgr=0xe422f0, handle_openflow=handle_openflow@entry=0x7fcffd4046f0 <handle_openflow>) at ofproto/connmgr.c:363 > 17 0x00007fcffd3fdc76 in ofproto_run (p=0xe6af50) at ofproto/ofproto.c:1821 > 18 0x000000000040ca94 in bridge_run__ () at vswitchd/bridge.c:2939 > 19 0x0000000000411d44 in bridge_run () at vswitchd/bridge.c:2997 > 20 0x00000000004094fd in main (argc=12, argv=0x7ffcb71085b8) at vswitchd/ovs-vswitchd.c:119 > > VMWare-BZ: #2210216 > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > --- > Please backport it to 2.10. Applied to master and branch-2.10, thanks!
We have hit this issue as well on 2.9, would it be possible to backport it? On Mon, Oct 8, 2018 at 7:17 PM Ben Pfaff <blp@ovn.org> wrote: > > On Fri, Oct 05, 2018 at 09:19:54AM -0700, Yi-Hung Wei wrote: > > The patch address vswitchd crash when it receives NXT_RESUME with geneve > > tunnel metadata. The crash is due to segmentation fault with the > > following stack trace, and it is observed only in kernel datapath. > > A test is added to prevent regression. > > > > Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault. > > 0 0x00007fcffd0c5412 in tun_metadata_to_geneve__ (flow=flow@entry=0x7ffcb7106680, b=b@entry=0x7ffcb70eb5a8, crit_opt=crit_opt@entry=0x7ffcb70eb287) > > at lib/tun-metadata.c:676 > > 1 0x00007fcffd0c6858 in tun_metadata_to_geneve_nlattr_flow (b=0x7ffcb70eb5a8, flow=0x7ffcb7106638) at lib/tun-metadata.c:706 > > 2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffcb7106638, flow=flow@entry=0x7ffcb7106638, key=key@entry=0x0, b=b@entry=0x7ffcb70eb5a8) > > at lib/tun-metadata.c:810 > > 3 0x00007fcffd048464 in tun_key_to_attr (a=a@entry=0x7ffcb70eb5a8, tun_key=tun_key@entry=0x7ffcb7106638, tun_flow_key=tun_flow_key@entry=0x7ffcb7106638, > > key_buf=key_buf@entry=0x0, tnl_type=<optimized out>, tnl_type@entry=0x0) at lib/odp-util.c:2886 > > 4 0x00007fcffd0551cf in odp_key_from_dp_packet (buf=buf@entry=0x7ffcb70eb5a8, packet=0x7ffcb7106590) at lib/odp-util.c:5909 > > 5 0x00007fcffd0d7870 in dpif_netlink_encode_execute (buf=0x7ffcb70eb5a8, d_exec=0x7ffcb7106428, dp_ifindex=<optimized out>) at lib/dpif-netlink.c:1873 > > 6 dpif_netlink_operate__ (dpif=dpif@entry=0xe65e00, ops=ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif-netlink.c:1959 > > 7 0x00007fcffd0d842e in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffcb7106418, dpif=<optimized out>) at lib/dpif-netlink.c:2258 > > 8 dpif_netlink_operate (dpif_=0xe65e00, ops=<optimized out>, n_ops=<optimized out>) at lib/dpif-netlink.c:2294 > > 9 0x00007fcffd014680 in dpif_operate (dpif=<optimized out>, ops=<optimized out>, ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif.c:1359 > > 10 0x00007fcffd014c58 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffcb71064e0) at lib/dpif.c:1324 > > 11 0x00007fcffd40d3e6 in nxt_resume (ofproto_=0xe6af50, pin=0x7ffcb7107150) at ofproto/ofproto-dpif.c:4885 > > 12 0x00007fcffd3f88c3 in handle_nxt_resume (ofconn=ofconn@entry=0xf8c8f0, oh=oh@entry=0xf7ebd0) at ofproto/ofproto.c:3612 > > 13 0x00007fcffd404a3b in handle_openflow__ (msg=0xeac460, ofconn=0xf8c8f0) at ofproto/ofproto.c:8137 > > 14 handle_openflow (ofconn=0xf8c8f0, ofp_msg=0xeac460) at ofproto/ofproto.c:8258 > > 15 0x00007fcffd3f4653 in ofconn_run (handle_openflow=0x7fcffd4046f0 <handle_openflow>, ofconn=0xf8c8f0) at ofproto/connmgr.c:1432 > > 16 connmgr_run (mgr=0xe422f0, handle_openflow=handle_openflow@entry=0x7fcffd4046f0 <handle_openflow>) at ofproto/connmgr.c:363 > > 17 0x00007fcffd3fdc76 in ofproto_run (p=0xe6af50) at ofproto/ofproto.c:1821 > > 18 0x000000000040ca94 in bridge_run__ () at vswitchd/bridge.c:2939 > > 19 0x0000000000411d44 in bridge_run () at vswitchd/bridge.c:2997 > > 20 0x00000000004094fd in main (argc=12, argv=0x7ffcb71085b8) at vswitchd/ovs-vswitchd.c:119 > > > > VMWare-BZ: #2210216 > > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > > --- > > Please backport it to 2.10. > > Applied to master and branch-2.10, thanks! > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Doesn't apply cleanly. Would you mind manually backporting and posting the patch? On Fri, Feb 01, 2019 at 11:59:37AM +0100, Daniel Alvarez Sanchez wrote: > We have hit this issue as well on 2.9, would it be possible to backport it? > > On Mon, Oct 8, 2018 at 7:17 PM Ben Pfaff <blp@ovn.org> wrote: > > > > On Fri, Oct 05, 2018 at 09:19:54AM -0700, Yi-Hung Wei wrote: > > > The patch address vswitchd crash when it receives NXT_RESUME with geneve > > > tunnel metadata. The crash is due to segmentation fault with the > > > following stack trace, and it is observed only in kernel datapath. > > > A test is added to prevent regression. > > > > > > Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault. > > > 0 0x00007fcffd0c5412 in tun_metadata_to_geneve__ (flow=flow@entry=0x7ffcb7106680, b=b@entry=0x7ffcb70eb5a8, crit_opt=crit_opt@entry=0x7ffcb70eb287) > > > at lib/tun-metadata.c:676 > > > 1 0x00007fcffd0c6858 in tun_metadata_to_geneve_nlattr_flow (b=0x7ffcb70eb5a8, flow=0x7ffcb7106638) at lib/tun-metadata.c:706 > > > 2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffcb7106638, flow=flow@entry=0x7ffcb7106638, key=key@entry=0x0, b=b@entry=0x7ffcb70eb5a8) > > > at lib/tun-metadata.c:810 > > > 3 0x00007fcffd048464 in tun_key_to_attr (a=a@entry=0x7ffcb70eb5a8, tun_key=tun_key@entry=0x7ffcb7106638, tun_flow_key=tun_flow_key@entry=0x7ffcb7106638, > > > key_buf=key_buf@entry=0x0, tnl_type=<optimized out>, tnl_type@entry=0x0) at lib/odp-util.c:2886 > > > 4 0x00007fcffd0551cf in odp_key_from_dp_packet (buf=buf@entry=0x7ffcb70eb5a8, packet=0x7ffcb7106590) at lib/odp-util.c:5909 > > > 5 0x00007fcffd0d7870 in dpif_netlink_encode_execute (buf=0x7ffcb70eb5a8, d_exec=0x7ffcb7106428, dp_ifindex=<optimized out>) at lib/dpif-netlink.c:1873 > > > 6 dpif_netlink_operate__ (dpif=dpif@entry=0xe65e00, ops=ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif-netlink.c:1959 > > > 7 0x00007fcffd0d842e in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffcb7106418, dpif=<optimized out>) at lib/dpif-netlink.c:2258 > > > 8 dpif_netlink_operate (dpif_=0xe65e00, ops=<optimized out>, n_ops=<optimized out>) at lib/dpif-netlink.c:2294 > > > 9 0x00007fcffd014680 in dpif_operate (dpif=<optimized out>, ops=<optimized out>, ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif.c:1359 > > > 10 0x00007fcffd014c58 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffcb71064e0) at lib/dpif.c:1324 > > > 11 0x00007fcffd40d3e6 in nxt_resume (ofproto_=0xe6af50, pin=0x7ffcb7107150) at ofproto/ofproto-dpif.c:4885 > > > 12 0x00007fcffd3f88c3 in handle_nxt_resume (ofconn=ofconn@entry=0xf8c8f0, oh=oh@entry=0xf7ebd0) at ofproto/ofproto.c:3612 > > > 13 0x00007fcffd404a3b in handle_openflow__ (msg=0xeac460, ofconn=0xf8c8f0) at ofproto/ofproto.c:8137 > > > 14 handle_openflow (ofconn=0xf8c8f0, ofp_msg=0xeac460) at ofproto/ofproto.c:8258 > > > 15 0x00007fcffd3f4653 in ofconn_run (handle_openflow=0x7fcffd4046f0 <handle_openflow>, ofconn=0xf8c8f0) at ofproto/connmgr.c:1432 > > > 16 connmgr_run (mgr=0xe422f0, handle_openflow=handle_openflow@entry=0x7fcffd4046f0 <handle_openflow>) at ofproto/connmgr.c:363 > > > 17 0x00007fcffd3fdc76 in ofproto_run (p=0xe6af50) at ofproto/ofproto.c:1821 > > > 18 0x000000000040ca94 in bridge_run__ () at vswitchd/bridge.c:2939 > > > 19 0x0000000000411d44 in bridge_run () at vswitchd/bridge.c:2997 > > > 20 0x00000000004094fd in main (argc=12, argv=0x7ffcb71085b8) at vswitchd/ovs-vswitchd.c:119 > > > > > > VMWare-BZ: #2210216 > > > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > > > --- > > > Please backport it to 2.10. > > > > Applied to master and branch-2.10, thanks! > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff --git a/lib/ofp-packet.c b/lib/ofp-packet.c index b74c29b612a9..aa3417c9b051 100644 --- a/lib/ofp-packet.c +++ b/lib/ofp-packet.c @@ -160,6 +160,7 @@ decode_nx_packet_in2(const struct ofp_header *oh, bool loose, error = oxm_decode_match(payload.msg, ofpbuf_msgsize(&payload), loose, tun_table, vl_mff_map, &pin->flow_metadata); + pin->flow_metadata.flow.tunnel.metadata.tab = tun_table; break; case NXPINT_USERDATA: @@ -244,6 +245,7 @@ ofputil_decode_packet_in(const struct ofp_header *oh, bool loose, : NULL); enum ofperr error = oxm_pull_match_loose(&b, false, tun_table, &pin->flow_metadata); + pin->flow_metadata.flow.tunnel.metadata.tab = tun_table; if (error) { return error; } diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 277227bdd4cf..4c524317a644 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -532,6 +532,46 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PI OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([datapath - flow resume with geneve tun_metadata]) +OVS_CHECK_GENEVE() + +OVS_TRAFFIC_VSWITCHD_START() +ADD_BR([br-underlay]) + +ADD_NAMESPACES(at_ns0) + +dnl Set up underlay link from host into the namespace using veth pair. +ADD_VETH(p0, at_ns0, br-underlay, "172.31.1.1/24") +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) +AT_CHECK([ip link set dev br-underlay up]) + +dnl Set up tunnel endpoints on OVS outside the namespace and with a native +dnl linux device inside the namespace. +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.31.1.1], [10.1.1.100/24]) +ADD_NATIVE_TUNNEL([geneve], [ns_gnv0], [at_ns0], [172.31.1.100], [10.1.1.1/24], + [vni 0]) + +dnl Set up flows +AT_DATA([flows.txt], [dnl +table=0, arp action=NORMAL +table=0, in_port=LOCAL icmp action=output:at_gnv0 +table=0, in_port=at_gnv0 icmp action=set_field:0xa->tun_metadata0,resubmit(,1) +table=1, icmp action=controller(pause), resubmit(,2) +table=2, tun_metadata0=0xa, icmp action=output:LOCAL +]) +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"]) +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) + +AT_CHECK([ovs-ofctl monitor br0 resume --detach --no-chdir --pidfile=ovs-ofctl0.pid 2> ofctl_monitor0.log]) + +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + AT_SETUP([datapath - ping over geneve6 tunnel]) OVS_CHECK_GENEVE_UDP6ZEROCSUM()
The patch address vswitchd crash when it receives NXT_RESUME with geneve tunnel metadata. The crash is due to segmentation fault with the following stack trace, and it is observed only in kernel datapath. A test is added to prevent regression. Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault. 0 0x00007fcffd0c5412 in tun_metadata_to_geneve__ (flow=flow@entry=0x7ffcb7106680, b=b@entry=0x7ffcb70eb5a8, crit_opt=crit_opt@entry=0x7ffcb70eb287) at lib/tun-metadata.c:676 1 0x00007fcffd0c6858 in tun_metadata_to_geneve_nlattr_flow (b=0x7ffcb70eb5a8, flow=0x7ffcb7106638) at lib/tun-metadata.c:706 2 tun_metadata_to_geneve_nlattr (tun=tun@entry=0x7ffcb7106638, flow=flow@entry=0x7ffcb7106638, key=key@entry=0x0, b=b@entry=0x7ffcb70eb5a8) at lib/tun-metadata.c:810 3 0x00007fcffd048464 in tun_key_to_attr (a=a@entry=0x7ffcb70eb5a8, tun_key=tun_key@entry=0x7ffcb7106638, tun_flow_key=tun_flow_key@entry=0x7ffcb7106638, key_buf=key_buf@entry=0x0, tnl_type=<optimized out>, tnl_type@entry=0x0) at lib/odp-util.c:2886 4 0x00007fcffd0551cf in odp_key_from_dp_packet (buf=buf@entry=0x7ffcb70eb5a8, packet=0x7ffcb7106590) at lib/odp-util.c:5909 5 0x00007fcffd0d7870 in dpif_netlink_encode_execute (buf=0x7ffcb70eb5a8, d_exec=0x7ffcb7106428, dp_ifindex=<optimized out>) at lib/dpif-netlink.c:1873 6 dpif_netlink_operate__ (dpif=dpif@entry=0xe65e00, ops=ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif-netlink.c:1959 7 0x00007fcffd0d842e in dpif_netlink_operate_chunks (n_ops=1, ops=0x7ffcb7106418, dpif=<optimized out>) at lib/dpif-netlink.c:2258 8 dpif_netlink_operate (dpif_=0xe65e00, ops=<optimized out>, n_ops=<optimized out>) at lib/dpif-netlink.c:2294 9 0x00007fcffd014680 in dpif_operate (dpif=<optimized out>, ops=<optimized out>, ops@entry=0x7ffcb7106418, n_ops=n_ops@entry=1) at lib/dpif.c:1359 10 0x00007fcffd014c58 in dpif_execute (dpif=<optimized out>, execute=execute@entry=0x7ffcb71064e0) at lib/dpif.c:1324 11 0x00007fcffd40d3e6 in nxt_resume (ofproto_=0xe6af50, pin=0x7ffcb7107150) at ofproto/ofproto-dpif.c:4885 12 0x00007fcffd3f88c3 in handle_nxt_resume (ofconn=ofconn@entry=0xf8c8f0, oh=oh@entry=0xf7ebd0) at ofproto/ofproto.c:3612 13 0x00007fcffd404a3b in handle_openflow__ (msg=0xeac460, ofconn=0xf8c8f0) at ofproto/ofproto.c:8137 14 handle_openflow (ofconn=0xf8c8f0, ofp_msg=0xeac460) at ofproto/ofproto.c:8258 15 0x00007fcffd3f4653 in ofconn_run (handle_openflow=0x7fcffd4046f0 <handle_openflow>, ofconn=0xf8c8f0) at ofproto/connmgr.c:1432 16 connmgr_run (mgr=0xe422f0, handle_openflow=handle_openflow@entry=0x7fcffd4046f0 <handle_openflow>) at ofproto/connmgr.c:363 17 0x00007fcffd3fdc76 in ofproto_run (p=0xe6af50) at ofproto/ofproto.c:1821 18 0x000000000040ca94 in bridge_run__ () at vswitchd/bridge.c:2939 19 0x0000000000411d44 in bridge_run () at vswitchd/bridge.c:2997 20 0x00000000004094fd in main (argc=12, argv=0x7ffcb71085b8) at vswitchd/ovs-vswitchd.c:119 VMWare-BZ: #2210216 Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> --- Please backport it to 2.10. --- lib/ofp-packet.c | 2 ++ tests/system-traffic.at | 40 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+)