Message ID | 1585252702-8649-2-git-send-email-pkusunyifeng@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,1/2] tun_metadata: Fix coredump caused by use-after-free bug | expand |
Hi Yifeng, Thanks for the patch, I can reproduce the issue using $ make check-system-userspace TESTSUITEFLAGS='-k resume' ASAN reports ==127707==ERROR: AddressSanitizer: heap-use-after-free on address 0x61f000020690 at pc 0x00000089cecf bp 0x7fff38f95690 sp 0x7fff38f95688 READ of size 4 at 0x61f000020690 thread T0 #0 0x89cece in tun_metadata_get_fmd /root/ovs/lib/tun-metadata.c:394:52 #1 0x66a3f9 in flow_get_metadata /root/ovs/lib/flow.c:1236:5 #2 0x58a9ca in process_upcall /root/ovs/ofproto/ofproto-dpif-upcall.c:1538:13 #3 0x57a723 in upcall_cb /root/ovs/ofproto/ofproto-dpif-upcall.c:1311:13 However, applying your fix (patch 1/2) and run $ make check-system-userspace TESTSUITEFLAGS='-k resume' fix the crash but trigger other fail. Can you take a look? Some comments below: On Thu, Mar 26, 2020 at 12:58 PM Yifeng Sun <pkusunyifeng@gmail.com> wrote: > > This patch enhances a system traffic test to prevent regression on > the tunnel metadata table (tun_table) handling with frozen state. > Without a proper fix this test can crash ovs-vswitchd due to a > use-after-free bug on tun_table. > > These are the timed sequence of how this bug is triggered: > > - Adds an OpenFlow rule in OVS that matches Geneve tunnel metadata that > contains a controller action. > - When the first packet matches the aforementioned OpenFlow rule, > during the miss upcall, OVS stores a pointer to the tun_table (that > decodes the Geneve tunnel metadata) in a frozen state and pushes down > a datapath flow into kernel datapath. > - Issues a add-tlv-map command to reprogram the tun_table on OVS. > OVS frees the old tun_table and create a new tun_table. > - A subsequent packet hits the kernel datapath flow again. Since > there is a controller action associated with that flow, it triggers > slow path controller upcall. > - In the slow path controller upcall, OVS derives the tun_table > from the frozen state, which points to the old tun_table that is > already being freed at this time point. > - In order to access the tunnel metadata, OVS uses the invalid > pointer that points to the old tun_table and triggers the core dump. > > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> > Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> > Co-authored-by: Yi-Hung Wei <yihung.wei@gmail.com> > --- > tests/system-traffic.at | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/tests/system-traffic.at b/tests/system-traffic.at > index 4a39c929c207..992de8546c41 100644 > --- a/tests/system-traffic.at > +++ b/tests/system-traffic.at > @@ -611,6 +611,20 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl > 3 packets transmitted, 3 received, 0% packet loss, time 0ms > ]) > > +dnl Test OVS handles TLV map modifictions properly when restores frozen state. > +NS_CHECK_EXEC([at_ns0], [ping 10.1.1.100 > ping.out &]) since ping.out is not checked, can you use ping -c <count> -q? > + > +sleep 2 do we need this sleep 2 > + > +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0x88,len=4}->tun_metadata1"]) > +sleep 1 > +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0x99,len=4}->tun_metadata2"]) > +sleep 1 > +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0xaa,len=4}->tun_metadata3"]) > +sleep 1 > + > +dnl At this point, ovs-vswitchd will either crash or everything is OK. can you remove this comment? >` + > OVS_APP_EXIT_AND_WAIT([ovs-ofctl]) > OVS_TRAFFIC_VSWITCHD_STOP > AT_CLEANUP > -- > 2.7.4 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Mon, Apr 6, 2020 at 8:29 AM William Tu <u9012063@gmail.com> wrote: > > Hi Yifeng, > > Thanks for the patch, I can reproduce the issue using > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > ASAN reports > ==127707==ERROR: AddressSanitizer: heap-use-after-free on address > 0x61f000020690 at pc 0x00000089cecf bp 0x7fff38f95690 sp > 0x7fff38f95688 > READ of size 4 at 0x61f000020690 thread T0 > #0 0x89cece in tun_metadata_get_fmd /root/ovs/lib/tun-metadata.c:394:52 > #1 0x66a3f9 in flow_get_metadata /root/ovs/lib/flow.c:1236:5 > #2 0x58a9ca in process_upcall > /root/ovs/ofproto/ofproto-dpif-upcall.c:1538:13 > #3 0x57a723 in upcall_cb /root/ovs/ofproto/ofproto-dpif-upcall.c:1311:13 > > However, applying your fix (patch 1/2) and run > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > fix the crash but trigger other fail. > FYI, with your patch, the failed log shows: +2020-04-06T15:43:08.716Z|00001|dpif(revalidator7)|WARN|netdev@ovs-netdev: failed to put[modify] (No such file or directory) ufid:4c0cc511-5dfd-4afe-a43b-86889dcd3600 skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=62880/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=33:33:00:00:00:16/00:00:00:00:00:00),eth_type(0x86dd),ipv6(src=::/::,dst=ff02::16/::,label=0/0,proto=58/0,tclass=0/0,hlimit=1/0,frag=no),icmpv6(type=143/0,code=0/0), actions:userspace(pid=0,controller(reason=7,dont_send=0,continuation=0,recirc_id=7,rule_cookie=0,controller_id=0,max_len=65535)) +2020-04-06T15:43:08.716Z|00002|dpif(revalidator7)|WARN|netdev@ovs-netdev: failed to put[modify] (No such file or directory) ufid:e7d78a37-7dd8-4641-a71d-1c57e4b47329 skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=22243/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=f2:7d:a0:68:ae:4a/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=10.1.1.1/0.0.0.0,dst=10.1.1.100/0.0.0.0,proto=1,tos=0/0,ttl=64/0,frag=no),icmp(type=8/0,code=0/0), actions:userspace(pid=0,controller(reason=1,dont_send=0,continuation=1,recirc_id=8,rule_cookie=0,controller_id=0,max_len=65535))
Thanks William for the comments. I will submit a new version. Yifeng On Mon, Apr 6, 2020 at 8:56 AM William Tu <u9012063@gmail.com> wrote: > On Mon, Apr 6, 2020 at 8:29 AM William Tu <u9012063@gmail.com> wrote: > > > > Hi Yifeng, > > > > Thanks for the patch, I can reproduce the issue using > > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > > ASAN reports > > ==127707==ERROR: AddressSanitizer: heap-use-after-free on address > > 0x61f000020690 at pc 0x00000089cecf bp 0x7fff38f95690 sp > > 0x7fff38f95688 > > READ of size 4 at 0x61f000020690 thread T0 > > #0 0x89cece in tun_metadata_get_fmd > /root/ovs/lib/tun-metadata.c:394:52 > > #1 0x66a3f9 in flow_get_metadata /root/ovs/lib/flow.c:1236:5 > > #2 0x58a9ca in process_upcall > > /root/ovs/ofproto/ofproto-dpif-upcall.c:1538:13 > > #3 0x57a723 in upcall_cb > /root/ovs/ofproto/ofproto-dpif-upcall.c:1311:13 > > > > However, applying your fix (patch 1/2) and run > > $ make check-system-userspace TESTSUITEFLAGS='-k resume' > > fix the crash but trigger other fail. > > > FYI, with your patch, the failed log shows: > +2020-04-06T15:43:08.716Z|00001|dpif(revalidator7)|WARN|netdev@ovs-netdev: > failed to put[modify] (No such file or directory) > ufid:4c0cc511-5dfd-4afe-a43b-86889dcd3600 > > skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=62880/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=33:33:00:00:00:16/00:00:00:00:00:00),eth_type(0x86dd),ipv6(src=::/::,dst=ff02::16/::,label=0/0,proto=58/0,tclass=0/0,hlimit=1/0,frag=no),icmpv6(type=143/0,code=0/0), > > actions:userspace(pid=0,controller(reason=7,dont_send=0,continuation=0,recirc_id=7,rule_cookie=0,controller_id=0,max_len=65535)) > +2020-04-06T15:43:08.716Z|00002|dpif(revalidator7)|WARN|netdev@ovs-netdev: > failed to put[modify] (No such file or directory) > ufid:e7d78a37-7dd8-4641-a71d-1c57e4b47329 > > skb_priority(0/0),tunnel(tun_id=0x0,src=172.31.1.1,dst=172.31.1.100,ttl=64/0,tp_src=22243/0,tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(4),packet_type(ns=0,id=0),eth(src=ee:b9:ef:c8:ab:57/00:00:00:00:00:00,dst=f2:7d:a0:68:ae:4a/00:00:00:00:00:00),eth_type(0x0800),ipv4(src= > 10.1.1.1/0.0.0.0,dst=10.1.1.100/0.0.0.0,proto=1,tos=0/0,ttl=64/0,frag=no > ),icmp(type=8/0,code=0/0), > > actions:userspace(pid=0,controller(reason=1,dont_send=0,continuation=1,recirc_id=8,rule_cookie=0,controller_id=0,max_len=65535)) >
diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 4a39c929c207..992de8546c41 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -611,6 +611,20 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 3 10.1.1.100 | FORMAT_PING], [0], [dnl 3 packets transmitted, 3 received, 0% packet loss, time 0ms ]) +dnl Test OVS handles TLV map modifictions properly when restores frozen state. +NS_CHECK_EXEC([at_ns0], [ping 10.1.1.100 > ping.out &]) + +sleep 2 + +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0x88,len=4}->tun_metadata1"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0x99,len=4}->tun_metadata2"]) +sleep 1 +AT_CHECK([ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0xaa,len=4}->tun_metadata3"]) +sleep 1 + +dnl At this point, ovs-vswitchd will either crash or everything is OK. + OVS_APP_EXIT_AND_WAIT([ovs-ofctl]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP