Message ID | 6fc6cd366ecc884dc76b72c09a2ec00e9cdee877.1590179835.git.lorenzo.bianconi@redhat.com |
---|---|
State | Superseded |
Series | [ovs-dev,v2,ovn] ovn: introduce IP_SRC_POLICY stage in ingress router pipeline |
Hi Lorenzo,

Sorry that I replied to v1 right before you send v2. Please check my comments there.

Thanks,
Han

On Fri, May 22, 2020 at 1:50 PM Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:

> In order to fix the issues introduced by commit
> c0bf32d72f8b ("Manage ARP process locally in a DVR scenario "), restore
> previous configuration of table 9 in ingress router pipeline and
> introduce a new stage called 'ip_src_policy' used to set the src address
> info in order to not distribute FIP traffic if DVR is enabled
>
> Fixes: c0bf32d72f8b ("Manage ARP process locally in a DVR scenario ")
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> ---
> Changes since v1:
> - fixed system-ovn.at test
> - added ovn.at test
> - fixed documentation
> ---
>  northd/ovn-northd.8.xml | 75 +++++++++++++++++++++--------------------
>  northd/ovn-northd.c     | 38 +++++++++------------
>  tests/ovn.at            | 56 +++++++++++++++++-------------
>  tests/system-ovn.at     | 28 +++++++++++++++
>  4 files changed, 115 insertions(+), 82 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 8f224b07f..5e57f5694 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -2484,37 +2484,6 @@ output;
>          </p>
>        </li>
>
> -      <li>
> -        <p>
> -          For distributed logical routers where one of the logical router ports
> -          specifies a <code>redirect-chassis</code>, a priority-400 logical
> -          flow for each <code>dnat_and_snat</code> NAT rules configured.
> -          These flows will allow to properly forward traffic to the external
> -          connections if available and avoid sending it through the tunnel.
> -          Assuming the following NAT rule has been configured:
> -        </p>
> -
> -        <pre>
> -external_ip = <var>A</var>;
> -external_mac = <var>B</var>;
> -logical_ip = <var>C</var>;
> -        </pre>
> -
> -        <p>
> -          the following action will be applied:
> -        </p>
> -
> -        <pre>
> -ip.ttl--;
> -reg0 = <var>ip.dst</var>;
> -reg1 = <var>A</var>;
> -eth.src = <var>B</var>;
> -outport = <var>router-port</var>;
> -next;
> -        </pre>
> -
> -      </li>
> -
>        <li>
>          <p>
>            IPv4 routing table. For each route to IPv4 network <var>N</var> with
> @@ -2660,7 +2629,41 @@ outport = <var>P</var>;
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 12: ARP/ND Resolution</h3>
> +    <h3>Ingress Table 12: IP Source Policy</h3>
> +
> +    <p>
> +      This table contains for distributed logical routers where one of
> +      the logical router ports specifies a <code>redirect-chassis</code>,
> +      a priority-100 logical flow for each <code>dnat_and_snat</code>
> +      NAT rules configured.
> +      These flows will allow to properly forward traffic to the external
> +      connections if available and avoid sending it through the tunnel.
> +      Assuming the following NAT rule has been configured:
> +    </p>
> +
> +    <pre>
> +external_ip = <var>A</var>;
> +external_mac = <var>B</var>;
> +logical_ip = <var>C</var>;
> +logical_port = <var>D</var>;
> +    </pre>
> +
> +    <p>
> +      for IP traffic matching <code>ip.src == <var>C</var> &&
> +      is_chassis_resident(<var>D</var>)</code>
> +    </p>
> +
> +    <p>
> +      the following action will be applied:
> +    </p>
> +
> +    <pre>
> +reg1 = <var>A</var>;
> +eth.src = <var>B</var>;
> +next;
> +    </pre>
> +
> +    <h3>Ingress Table 13: ARP/ND Resolution</h3>
>
>      <p>
>        Any packet that reaches this table is an IP packet whose next-hop
> @@ -2819,7 +2822,7 @@ outport = <var>P</var>;
>
>      </ul>
>
> -    <h3>Ingress Table 13: Check packet length</h3>
> +    <h3>Ingress Table 14: Check packet length</h3>
>
>      <p>
>        For distributed logical routers with distributed gateway port configured
> @@ -2849,7 +2852,7 @@ REGBIT_PKT_LARGER = check_pkt_larger(<var>L</var>); next;
>        and advances to the next table.
>      </p>
>
> -    <h3>Ingress Table 14: Handle larger packets</h3>
> +    <h3>Ingress Table 15: Handle larger packets</h3>
>
>      <p>
>        For distributed logical routers with distributed gateway port configured
> @@ -2898,7 +2901,7 @@ icmp4 {
>        and advances to the next table.
>      </p>
>
> -    <h3>Ingress Table 15: Gateway Redirect</h3>
> +    <h3>Ingress Table 16: Gateway Redirect</h3>
>
>      <p>
>        For distributed logical routers where one of the logical router
> @@ -2934,7 +2937,7 @@ icmp4 {
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 16: ARP Request</h3>
> +    <h3>Ingress Table 17: ARP Request</h3>
>
>      <p>
>        In the common case where the Ethernet destination has been resolved, this
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 6ccd84e49..cecaa9ab3 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -175,11 +175,12 @@ enum ovn_stage {
>      PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 9, "lr_in_ip_routing") \
>      PIPELINE_STAGE(ROUTER, IN, IP_ROUTING_ECMP, 10, "lr_in_ip_routing_ecmp") \
>      PIPELINE_STAGE(ROUTER, IN, POLICY, 11, "lr_in_policy") \
> -    PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 12, "lr_in_arp_resolve") \
> -    PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 13, "lr_in_chk_pkt_len") \
> -    PIPELINE_STAGE(ROUTER, IN, LARGER_PKTS, 14,"lr_in_larger_pkts") \
> -    PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 15, "lr_in_gw_redirect") \
> -    PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 16, "lr_in_arp_request") \
> +    PIPELINE_STAGE(ROUTER, IN, IP_SRC_POLICY, 12, "lr_in_ip_src_policy") \
> +    PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 13, "lr_in_arp_resolve") \
> +    PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 14, "lr_in_chk_pkt_len") \
> +    PIPELINE_STAGE(ROUTER, IN, LARGER_PKTS, 15,"lr_in_larger_pkts") \
> +    PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 16, "lr_in_gw_redirect") \
> +    PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 17, "lr_in_arp_request") \
>      \
>      /* Logical router egress stages. */ \
>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT, 0, "lr_out_undnat") \
> @@ -7125,8 +7126,6 @@ build_routing_policy_flow(struct hmap *lflows, struct ovn_datapath *od,
>      ds_destroy(&actions);
>  }
>
> -/* default logical flow prioriry for distributed routes */
> -#define DROUTE_PRIO 400
>  struct parsed_route {
>      struct ovs_list list_node;
>      struct v46_ip prefix;
> @@ -7515,7 +7514,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>  }
>
>  static void
> -add_distributed_routes(struct hmap *lflows, struct ovn_datapath *od)
> +add_ip_src_policy_flows(struct hmap *lflows, struct ovn_datapath *od)
>  {
>      struct ds actions = DS_EMPTY_INITIALIZER;
>      struct ds match = DS_EMPTY_INITIALIZER;
> @@ -7533,12 +7532,9 @@ add_distributed_routes(struct hmap *lflows, struct ovn_datapath *od)
>                        is_ipv4 ? "4" : "6", nat->logical_ip,
>                        nat->logical_port);
>          char *prefix = is_ipv4 ? "" : "xx";
> -        ds_put_format(&actions, "outport = %s; eth.src = %s; "
> -                      "%sreg0 = ip%s.dst; %sreg1 = %s; next;",
> -                      od->l3dgw_port->json_key, nat->external_mac,
> -                      prefix, is_ipv4 ? "4" : "6",
> -                      prefix, nat->external_ip);
> -        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, DROUTE_PRIO,
> +        ds_put_format(&actions, "eth.src = %s; %sreg1 = %s; next;",
> +                      nat->external_mac, prefix, nat->external_ip);
> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SRC_POLICY, 100,
>                        ds_cstr(&match), ds_cstr(&actions));
>          ds_clear(&match);
>          ds_clear(&actions);
> @@ -7569,12 +7565,6 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
>      }
>      build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4,
>                        &match, &priority);
> -    /* traffic for internal IPs of logical switch ports must be sent to
> -     * the gw controller through the overlay tunnels
> -     */
> -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> -        priority += DROUTE_PRIO;
> -    }
>
>      struct ds actions = DS_EMPTY_INITIALIZER;
>      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0 = ",
> @@ -9541,9 +9531,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>       * logical router
>       */
>      HMAP_FOR_EACH (od, key_node, datapaths) {
> -        if (od->nbr && od->l3dgw_port) {
> -            add_distributed_routes(lflows, od);
> +        if (!od->nbr) {
> +            continue;
> +        }
> +        if (od->l3dgw_port) {
> +            add_ip_src_policy_flows(lflows, od);
>          }
> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SRC_POLICY, 0, "1", "next;");
>      }
>
>      /* Logical router ingress table IP_ROUTING & IP_ROUTING_ECMP: IP Routing.
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 4370b3728..09952caf6 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -10141,20 +10141,6 @@ AT_CHECK([as hv3 ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=p
>  OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-vsctl show | \
>  grep "Port patch-br-int-to-ln_port" | wc -l`])
>
> -AT_CHECK([test 1 = `ovn-sbctl dump-flows lr0 | grep lr_in_ip_routing | \
> -grep "ip4.src == 10.0.0.3 && is_chassis_resident(\"foo1\")" -c`])
> -AT_CHECK([test 1 = `ovn-sbctl dump-flows lr0 | grep lr_in_ip_routing | \
> -grep "ip4.src == 10.0.0.4 && is_chassis_resident(\"foo2\")" -c`])
> -
> -key=`ovn-sbctl --bare --columns tunnel_key list datapath_Binding lr0`
> -# Check that the OVS flows appear for the dnat_and_snat entries in
> -# lr_in_ip_routing table.
> -OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-ofctl dump-flows br-int table=17 | \
> -grep "priority=400,ip,metadata=0x$key,nw_src=10.0.0.3" -c`])
> -
> -OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-ofctl dump-flows br-int table=17 | \
> -grep "priority=400,ip,metadata=0x$key,nw_src=10.0.0.4" -c`])
> -
>  # Re-add nat-addresses option
>  ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
>
> @@ -14957,9 +14943,14 @@ ovs-vsctl -- add-port br-int hv2-vif1 -- \
>      set interface hv2-vif1 external-ids:iface-id=sw1-p0 \
>      options:tx_pcap=hv2/vif1-tx.pcap \
>      options:rxq_pcap=hv2/vif1-rx.pcap \
> -    ofport-request=1
> +    ofport-request=2
> +ovs-vsctl -- add-port br-int hv2-vif2 -- \
> +    set interface hv2-vif2 external-ids:iface-id=sw0-p1 \
> +    options:tx_pcap=hv2/vif2-tx.pcap \
> +    options:rxq_pcap=hv2/vif2-rx.pcap \
> +    ofport-request=3
>
> -ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
> +ovn-nbctl create Logical_Router name=lr0
>  ovn-nbctl ls-add sw0
>  ovn-nbctl ls-add sw1
>
> @@ -14968,13 +14959,16 @@ ovn-nbctl lsp-add sw0 rp-sw0 -- set Logical_Switch_Port rp-sw0 \
>      type=router options:router-port=sw0 \
>      -- lsp-set-addresses rp-sw0 router
>
> -ovn-nbctl lrp-add lr0 sw1 00:00:02:01:02:03 172.16.1.1/24 2002:0:0:0:0:0:0:1/64
> +ovn-nbctl lrp-add lr0 sw1 00:00:02:01:02:03 172.16.1.1/24 2002:0:0:0:0:0:0:1/64 \
> +    -- set Logical_Router_Port sw1 options:redirect-chassis="hv2"
>  ovn-nbctl lsp-add sw1 rp-sw1 -- set Logical_Switch_Port rp-sw1 \
>      type=router options:router-port=sw1 \
>      -- lsp-set-addresses rp-sw1 router
>
>  ovn-nbctl lsp-add sw0 sw0-p0 \
>      -- lsp-set-addresses sw0-p0 "f0:00:00:01:02:03 192.168.1.2 2001::2"
> +ovn-nbctl lsp-add sw0 sw0-p1 \
> +    -- lsp-set-addresses sw0-p1 "f0:00:00:11:02:03 192.168.1.3 2001::3"
>
>  ovn-nbctl lsp-add sw1 sw1-p0 \
>      -- lsp-set-addresses sw1-p0 unknown
> @@ -15020,6 +15014,20 @@ send_na 2 1 $dst_mac $router_mac1 $dst_ip6 $router_ip6
>
>  OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
>
> +# Create FIP on sw0-p0, add a route on logical router pipeline and
> +# ARP request for a unkwon destination is sent using FIP MAC/IP
> +ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.2 192.168.1.3 sw0-p1 f0:00:00:01:02:04
> +ovn-nbctl lr-route-add lr0 172.16.2.0/24 172.16.1.11
> +
> +dst_ip=$(ip_to_hex 172 16 2 10)
> +fip_ip=$(ip_to_hex 172 16 1 2)
> +src_ip=$(ip_to_hex 192 168 1 3)
> +gw_router=$(ip_to_hex 172 16 1 11)
> +send_icmp_packet 2 2 f00000110203 $router_mac0 $src_ip $dst_ip 0000 $data
> +echo $(get_arp_req f00000010204 $fip_ip $gw_router) >> expected
> +
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
> +
>  OVN_CLEANUP([hv1],[hv2])
>  AT_CLEANUP
>
> @@ -15645,7 +15653,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  # Since the sw0-vir is not claimed by any chassis, eth.dst should be set to
>  # zero if the ip4.dst is the virtual ip in the router pipeline.
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
>  ])
>
>  ip_to_hex() {
> @@ -15696,7 +15704,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  # There should be an arp resolve flow to resolve the virtual_ip with the
>  # sw0-p1's MAC.
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
>  ])
>
>  # Forcibly clear virtual_parent. ovn-controller should release the binding
> @@ -15737,7 +15745,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  # There should be an arp resolve flow to resolve the virtual_ip with the
>  # sw0-p2's MAC.
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:05; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:05; next;)
>  ])
>
>  # send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir
> @@ -15760,7 +15768,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  # There should be an arp resolve flow to resolve the virtual_ip with the
>  # sw0-p3's MAC.
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
>  ])
>
>  # Now send arp reply from sw0-p1. hv1 should claim sw0-vir
> @@ -15781,7 +15789,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  > lflows.txt
>
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
>  ])
>
>  # Delete hv1-vif1 port. hv1 should release sw0-vir
> @@ -15799,7 +15807,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  > lflows.txt
>
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
>  ])
>
>  # Now send arp reply from sw0-p2. hv2 should claim sw0-vir
> @@ -15820,7 +15828,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
>  > lflows.txt
>
>  AT_CHECK([cat lflows.txt], [0], [dnl
> -  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
> +  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
>  ])
>
>  # Delete sw0-p2 logical port
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 9ae6c6b1f..99a0ee07b 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -2747,6 +2747,17 @@ ADD_VETH(alice1, alice1, br-int, "172.16.1.2/24", "f0:00:00:01:02:05", \
>  ovn-nbctl lsp-add alice alice1 \
>  -- lsp-set-addresses alice1 "f0:00:00:01:02:05 172.16.1.2"
>
> +# Add external network
> +ADD_NAMESPACES(ext-net)
> +ip link add alice-ext netns alice1 type veth peer name ext-veth netns ext-net
> +ip -n ext-net link set dev ext-veth up
> +ip -n ext-net addr add 10.0.0.1/24 dev ext-veth
> +ip -n ext-net route add default via 10.0.0.2
> +
> +ip -n alice1 link set dev alice-ext up
> +ip -n alice1 addr add 10.0.0.2/24 dev alice-ext
> +ip netns exec alice1 sysctl -w net.ipv4.conf.all.forwarding=1
> +
>  # Add DNAT rules
>  AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.3 192.168.1.2 foo1 00:00:02:02:03:04])
>  AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.4 192.168.1.3 foo2 00:00:02:02:03:05])
> @@ -2754,6 +2765,9 @@ AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.4 192.168.1.3 foo2 00:0
>  # Add a SNAT rule
>  AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.1 192.168.0.0/16])
>
> +# Add default route to ext-net
> +AT_CHECK([ovn-nbctl lr-route-add R1 10.0.0.0/24 172.16.1.2])
> +
>  ovn-nbctl --wait=hv sync
>  OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=172.16.1.1)'])
>
> @@ -2797,6 +2811,20 @@ sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
>  icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=<cleared>,type=8,code=0),reply=(src=172.16.1.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>  ])
>
> +# Try to ping external network
> +NS_CHECK_EXEC([ext-net], [tcpdump -n -c 3 -i ext-veth dst 172.16.1.3 and icmp > ext-net.pcap &])
> +sleep 1
> +AT_CHECK([ovn-nbctl lr-nat-del R1 snat])
> +NS_CHECK_EXEC([foo1], [ping -q -c 3 -i 0.3 -w 2 10.0.0.1 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +OVS_WAIT_UNTIL([
> +    total_pkts=$(cat ext-net.pcap | wc -l)
> +    test "${total_pkts}" = "3"
> +])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
> --
> 2.26.2
>
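For readers following the patch, here is a minimal, standalone sketch of the match and action strings that the new priority-100 lr_in_ip_src_policy flow would carry for one dnat_and_snat rule. It is not the ovn-northd code: struct fip_rule and the two helper functions are hypothetical names, plain snprintf stands in for the ds_put_format calls in the diff, and the match format is inferred from the ovn.at checks that the patch removes (ip4.src == ... && is_chassis_resident("...")).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the NAT rule fields the patch reads;
 * this is not the real OVN northbound schema struct. */
struct fip_rule {
    const char *external_ip;   /* A in the man page text */
    const char *external_mac;  /* B */
    const char *logical_ip;    /* C */
    const char *logical_port;  /* D */
    bool is_ipv4;
};

/* Writes the assumed match expression into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int
fip_flow_match(const struct fip_rule *r, char *buf, size_t len)
{
    assert(r && buf);
    int n = snprintf(buf, len,
                     "ip%s.src == %s && is_chassis_resident(\"%s\")",
                     r->is_ipv4 ? "4" : "6", r->logical_ip,
                     r->logical_port);
    return n >= 0 && (size_t) n < len ? 0 : -1;
}

/* Writes the action string, using the same format string as the
 * ds_put_format call added by the patch. The "xx" prefix selects
 * the 128-bit register for IPv6, as in the diff.
 * Returns 0 on success, -1 if buf is too small. */
static int
fip_flow_action(const struct fip_rule *r, char *buf, size_t len)
{
    assert(r && buf);
    const char *prefix = r->is_ipv4 ? "" : "xx";
    int n = snprintf(buf, len, "eth.src = %s; %sreg1 = %s; next;",
                     r->external_mac, prefix, r->external_ip);
    return n >= 0 && (size_t) n < len ? 0 : -1;
}
```

Unlike the removed priority-400 flow in lr_in_ip_routing, the action no longer touches ip.ttl, reg0 or outport, so the routing decision stays with the regular routing table and this stage only rewrites the source MAC and the source IP register.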
> Hi Lorenzo,
>
> Sorry that I replied to v1 right before you send v2. Please check my
> comments there.

Hi Han,

I implemented your suggestions here:
https://patchwork.ozlabs.org/project/openvswitch/cover/cover.1590443438.git.lorenzo.bianconi@redhat.com/

Please note I have not added v3 tag since I changed series name.

Regards,
Lorenzo

> Thanks,
> Han
grep 'nat(src=172.16.1.1)']) > > > > @@ -2797,6 +2811,20 @@ sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=<cleared>,type=8,code=0),reply=(src=172.16.1.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared> > > ]) > > > > +# Try to ping external network > > +NS_CHECK_EXEC([ext-net], [tcpdump -n -c 3 -i ext-veth dst 172.16.1.3 and > > icmp > ext-net.pcap &]) > > +sleep 1 > > +AT_CHECK([ovn-nbctl lr-nat-del R1 snat]) > > +NS_CHECK_EXEC([foo1], [ping -q -c 3 -i 0.3 -w 2 10.0.0.1 | FORMAT_PING], \ > > +[0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_WAIT_UNTIL([ > > + total_pkts=$(cat ext-net.pcap | wc -l) > > + test "${total_pkts}" = "3" > > +]) > > + > > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > > > as ovn-sb > > -- > > 2.26.2 > > > >
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 8f224b07f..5e57f5694 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2484,37 +2484,6 @@ output;
         </p>
       </li>
 
-      <li>
-        <p>
-          For distributed logical routers where one of the logical router ports
-          specifies a <code>redirect-chassis</code>, a priority-400 logical
-          flow for each <code>dnat_and_snat</code> NAT rules configured.
-          These flows will allow to properly forward traffic to the external
-          connections if available and avoid sending it through the tunnel.
-          Assuming the following NAT rule has been configured:
-        </p>
-
-        <pre>
-external_ip = <var>A</var>;
-external_mac = <var>B</var>;
-logical_ip = <var>C</var>;
-        </pre>
-
-        <p>
-          the following action will be applied:
-        </p>
-
-        <pre>
-ip.ttl--;
-reg0 = <var>ip.dst</var>;
-reg1 = <var>A</var>;
-eth.src = <var>B</var>;
-outport = <var>router-port</var>;
-next;
-        </pre>
-
-      </li>
-
       <li>
         <p>
           IPv4 routing table. For each route to IPv4 network <var>N</var> with
@@ -2660,7 +2629,41 @@ outport = <var>P</var>;
       </li>
     </ul>
 
-    <h3>Ingress Table 12: ARP/ND Resolution</h3>
+    <h3>Ingress Table 12: IP Source Policy</h3>
+
+    <p>
+      For distributed logical routers where one of the logical router
+      ports specifies a <code>redirect-chassis</code>, this table
+      contains a priority-100 logical flow for each configured
+      <code>dnat_and_snat</code> NAT rule.
+      These flows allow traffic to be forwarded properly to the external
+      connections, if available, instead of being sent through the tunnel.
+      Assuming the following NAT rule has been configured:
+    </p>
+
+    <pre>
+external_ip = <var>A</var>;
+external_mac = <var>B</var>;
+logical_ip = <var>C</var>;
+logical_port = <var>D</var>;
+    </pre>
+
+    <p>
+      for IP traffic matching <code>ip.src == <var>C</var> &&
+      is_chassis_resident(<var>D</var>)</code>
+    </p>
+
+    <p>
+      the following action will be applied:
+    </p>
+
+    <pre>
+reg1 = <var>A</var>;
+eth.src = <var>B</var>;
+next;
+    </pre>
+
+    <h3>Ingress Table 13: ARP/ND Resolution</h3>
 
     <p>
       Any packet that reaches this table is an IP packet whose next-hop
@@ -2819,7 +2822,7 @@ outport = <var>P</var>;
 
     </ul>
 
-    <h3>Ingress Table 13: Check packet length</h3>
+    <h3>Ingress Table 14: Check packet length</h3>
 
     <p>
       For distributed logical routers with distributed gateway port configured
@@ -2849,7 +2852,7 @@ REGBIT_PKT_LARGER = check_pkt_larger(<var>L</var>); next;
       and advances to the next table.
     </p>
 
-    <h3>Ingress Table 14: Handle larger packets</h3>
+    <h3>Ingress Table 15: Handle larger packets</h3>
 
     <p>
       For distributed logical routers with distributed gateway port configured
@@ -2898,7 +2901,7 @@ icmp4 {
       and advances to the next table.
     </p>
 
-    <h3>Ingress Table 15: Gateway Redirect</h3>
+    <h3>Ingress Table 16: Gateway Redirect</h3>
 
     <p>
       For distributed logical routers where one of the logical router
@@ -2934,7 +2937,7 @@ icmp4 {
       </li>
     </ul>
 
-    <h3>Ingress Table 16: ARP Request</h3>
+    <h3>Ingress Table 17: ARP Request</h3>
 
     <p>
       In the common case where the Ethernet destination has been resolved, this
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 6ccd84e49..cecaa9ab3 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -175,11 +175,12 @@ enum ovn_stage {
     PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 9, "lr_in_ip_routing") \
     PIPELINE_STAGE(ROUTER, IN, IP_ROUTING_ECMP, 10, "lr_in_ip_routing_ecmp") \
     PIPELINE_STAGE(ROUTER, IN, POLICY, 11, "lr_in_policy") \
-    PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 12, "lr_in_arp_resolve") \
-    PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 13, "lr_in_chk_pkt_len") \
-    PIPELINE_STAGE(ROUTER, IN, LARGER_PKTS, 14,"lr_in_larger_pkts") \
-    PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 15, "lr_in_gw_redirect") \
-    PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 16, "lr_in_arp_request") \
+    PIPELINE_STAGE(ROUTER, IN, IP_SRC_POLICY, 12, "lr_in_ip_src_policy") \
+    PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 13, "lr_in_arp_resolve") \
+    PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 14, "lr_in_chk_pkt_len") \
+    PIPELINE_STAGE(ROUTER, IN, LARGER_PKTS, 15,"lr_in_larger_pkts") \
+    PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 16, "lr_in_gw_redirect") \
+    PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 17, "lr_in_arp_request") \
     \
     /* Logical router egress stages. */ \
     PIPELINE_STAGE(ROUTER, OUT, UNDNAT, 0, "lr_out_undnat") \
@@ -7125,8 +7126,6 @@ build_routing_policy_flow(struct hmap *lflows, struct ovn_datapath *od,
     ds_destroy(&actions);
 }
 
-/* default logical flow prioriry for distributed routes */
-#define DROUTE_PRIO 400
 struct parsed_route {
     struct ovs_list list_node;
     struct v46_ip prefix;
@@ -7515,7 +7514,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od,
 }
 
 static void
-add_distributed_routes(struct hmap *lflows, struct ovn_datapath *od)
+add_ip_src_policy_flows(struct hmap *lflows, struct ovn_datapath *od)
 {
     struct ds actions = DS_EMPTY_INITIALIZER;
     struct ds match = DS_EMPTY_INITIALIZER;
@@ -7533,12 +7532,9 @@ add_distributed_routes(struct hmap *lflows, struct ovn_datapath *od)
                       is_ipv4 ? "4" : "6", nat->logical_ip,
                       nat->logical_port);
         char *prefix = is_ipv4 ? "" : "xx";
-        ds_put_format(&actions, "outport = %s; eth.src = %s; "
-                      "%sreg0 = ip%s.dst; %sreg1 = %s; next;",
-                      od->l3dgw_port->json_key, nat->external_mac,
-                      prefix, is_ipv4 ? "4" : "6",
-                      prefix, nat->external_ip);
-        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, DROUTE_PRIO,
+        ds_put_format(&actions, "eth.src = %s; %sreg1 = %s; next;",
+                      nat->external_mac, prefix, nat->external_ip);
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SRC_POLICY, 100,
                       ds_cstr(&match), ds_cstr(&actions));
         ds_clear(&match);
         ds_clear(&actions);
@@ -7569,12 +7565,6 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
     }
     build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4,
                       &match, &priority);
-    /* traffic for internal IPs of logical switch ports must be sent to
-     * the gw controller through the overlay tunnels
-     */
-    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
-        priority += DROUTE_PRIO;
-    }
 
     struct ds actions = DS_EMPTY_INITIALIZER;
     ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0 = ",
@@ -9541,9 +9531,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
      * logical router
      */
     HMAP_FOR_EACH (od, key_node, datapaths) {
-        if (od->nbr && od->l3dgw_port) {
-            add_distributed_routes(lflows, od);
+        if (!od->nbr) {
+            continue;
+        }
+        if (od->l3dgw_port) {
+            add_ip_src_policy_flows(lflows, od);
         }
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SRC_POLICY, 0, "1", "next;");
     }
 
     /* Logical router ingress table IP_ROUTING & IP_ROUTING_ECMP: IP Routing.
diff --git a/tests/ovn.at b/tests/ovn.at
index 4370b3728..09952caf6 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -10141,20 +10141,6 @@ AT_CHECK([as hv3 ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=p
 OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-vsctl show | \
 grep "Port patch-br-int-to-ln_port" | wc -l`])
 
-AT_CHECK([test 1 = `ovn-sbctl dump-flows lr0 | grep lr_in_ip_routing | \
-grep "ip4.src == 10.0.0.3 && is_chassis_resident(\"foo1\")" -c`])
-AT_CHECK([test 1 = `ovn-sbctl dump-flows lr0 | grep lr_in_ip_routing | \
-grep "ip4.src == 10.0.0.4 && is_chassis_resident(\"foo2\")" -c`])
-
-key=`ovn-sbctl --bare --columns tunnel_key list datapath_Binding lr0`
-# Check that the OVS flows appear for the dnat_and_snat entries in
-# lr_in_ip_routing table.
-OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-ofctl dump-flows br-int table=17 | \
-grep "priority=400,ip,metadata=0x$key,nw_src=10.0.0.3" -c`])
-
-OVS_WAIT_UNTIL([test 1 = `as hv3 ovs-ofctl dump-flows br-int table=17 | \
-grep "priority=400,ip,metadata=0x$key,nw_src=10.0.0.4" -c`])
-
 # Re-add nat-addresses option
 ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
 
@@ -14957,9 +14943,14 @@ ovs-vsctl -- add-port br-int hv2-vif1 -- \
     set interface hv2-vif1 external-ids:iface-id=sw1-p0 \
     options:tx_pcap=hv2/vif1-tx.pcap \
     options:rxq_pcap=hv2/vif1-rx.pcap \
-    ofport-request=1
+    ofport-request=2
+ovs-vsctl -- add-port br-int hv2-vif2 -- \
+    set interface hv2-vif2 external-ids:iface-id=sw0-p1 \
+    options:tx_pcap=hv2/vif2-tx.pcap \
+    options:rxq_pcap=hv2/vif2-rx.pcap \
+    ofport-request=3
 
-ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
+ovn-nbctl create Logical_Router name=lr0
 ovn-nbctl ls-add sw0
 ovn-nbctl ls-add sw1
 
@@ -14968,13 +14959,16 @@ ovn-nbctl lsp-add sw0 rp-sw0 -- set Logical_Switch_Port rp-sw0 \
     type=router options:router-port=sw0 \
     -- lsp-set-addresses rp-sw0 router
 
-ovn-nbctl lrp-add lr0 sw1 00:00:02:01:02:03 172.16.1.1/24 2002:0:0:0:0:0:0:1/64
+ovn-nbctl lrp-add lr0 sw1 00:00:02:01:02:03 172.16.1.1/24 2002:0:0:0:0:0:0:1/64 \
+    -- set Logical_Router_Port sw1 options:redirect-chassis="hv2"
 ovn-nbctl lsp-add sw1 rp-sw1 -- set Logical_Switch_Port rp-sw1 \
     type=router options:router-port=sw1 \
     -- lsp-set-addresses rp-sw1 router
 
 ovn-nbctl lsp-add sw0 sw0-p0 \
     -- lsp-set-addresses sw0-p0 "f0:00:00:01:02:03 192.168.1.2 2001::2"
+ovn-nbctl lsp-add sw0 sw0-p1 \
+    -- lsp-set-addresses sw0-p1 "f0:00:00:11:02:03 192.168.1.3 2001::3"
 
 ovn-nbctl lsp-add sw1 sw1-p0 \
     -- lsp-set-addresses sw1-p0 unknown
@@ -15020,6 +15014,20 @@ send_na 2 1 $dst_mac $router_mac1 $dst_ip6 $router_ip6
 
 OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
 
+# Create FIP on sw0-p1, add a route on logical router pipeline and
+# check that the ARP request for an unknown destination is sent using FIP MAC/IP
+ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.2 192.168.1.3 sw0-p1 f0:00:00:01:02:04
+ovn-nbctl lr-route-add lr0 172.16.2.0/24 172.16.1.11
+
+dst_ip=$(ip_to_hex 172 16 2 10)
+fip_ip=$(ip_to_hex 172 16 1 2)
+src_ip=$(ip_to_hex 192 168 1 3)
+gw_router=$(ip_to_hex 172 16 1 11)
+send_icmp_packet 2 2 f00000110203 $router_mac0 $src_ip $dst_ip 0000 $data
+echo $(get_arp_req f00000010204 $fip_ip $gw_router) >> expected
+
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
 OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 
@@ -15645,7 +15653,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # Since the sw0-vir is not claimed by any chassis, eth.dst should be set to
 # zero if the ip4.dst is the virtual ip in the router pipeline.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
 ])
 
 ip_to_hex() {
@@ -15696,7 +15704,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # There should be an arp resolve flow to resolve the virtual_ip with the
 # sw0-p1's MAC.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
 ])
 
 # Forcibly clear virtual_parent. ovn-controller should release the binding
@@ -15737,7 +15745,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # There should be an arp resolve flow to resolve the virtual_ip with the
 # sw0-p2's MAC.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:05; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:05; next;)
 ])
 
 # send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir
@@ -15760,7 +15768,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # There should be an arp resolve flow to resolve the virtual_ip with the
 # sw0-p3's MAC.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
 ])
 
 # Now send arp reply from sw0-p1. hv1 should claim sw0-vir
@@ -15781,7 +15789,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt
 
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
 ])
 
 # Delete hv1-vif1 port. hv1 should release sw0-vir
@@ -15799,7 +15807,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt
 
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
 ])
 
 # Now send arp reply from sw0-p2. hv2 should claim sw0-vir
@@ -15820,7 +15828,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt
 
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=12(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
+  table=13(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
 ])
 
 # Delete sw0-p2 logical port
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 9ae6c6b1f..99a0ee07b 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -2747,6 +2747,17 @@ ADD_VETH(alice1, alice1, br-int, "172.16.1.2/24", "f0:00:00:01:02:05", \
 ovn-nbctl lsp-add alice alice1 \
 -- lsp-set-addresses alice1 "f0:00:00:01:02:05 172.16.1.2"
 
+# Add external network
+ADD_NAMESPACES(ext-net)
+ip link add alice-ext netns alice1 type veth peer name ext-veth netns ext-net
+ip -n ext-net link set dev ext-veth up
+ip -n ext-net addr add 10.0.0.1/24 dev ext-veth
+ip -n ext-net route add default via 10.0.0.2
+
+ip -n alice1 link set dev alice-ext up
+ip -n alice1 addr add 10.0.0.2/24 dev alice-ext
+ip netns exec alice1 sysctl -w net.ipv4.conf.all.forwarding=1
+
 # Add DNAT rules
 AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.3 192.168.1.2 foo1 00:00:02:02:03:04])
 AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.4 192.168.1.3 foo2 00:00:02:02:03:05])
@@ -2754,6 +2765,9 @@ AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.4 192.168.1.3 foo2 00:0
 # Add a SNAT rule
 AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.1 192.168.0.0/16])
 
+# Add default route to ext-net
+AT_CHECK([ovn-nbctl lr-route-add R1 10.0.0.0/24 172.16.1.2])
+
 ovn-nbctl --wait=hv sync
 OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=172.16.1.1)'])
 
@@ -2797,6 +2811,20 @@ sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
 icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=<cleared>,type=8,code=0),reply=(src=172.16.1.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
 ])
 
+# Try to ping external network
+NS_CHECK_EXEC([ext-net], [tcpdump -n -c 3 -i ext-veth dst 172.16.1.3 and icmp > ext-net.pcap &])
+sleep 1
+AT_CHECK([ovn-nbctl lr-nat-del R1 snat])
+NS_CHECK_EXEC([foo1], [ping -q -c 3 -i 0.3 -w 2 10.0.0.1 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+OVS_WAIT_UNTIL([
+    total_pkts=$(cat ext-net.pcap | wc -l)
+    test "${total_pkts}" = "3"
+])
+
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
 
 as ovn-sb
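For anyone checking the new ovn.at case by hand: the test passes its addresses through `ip_to_hex`, whose body is not part of this diff. A minimal sketch, assuming the helper simply prints each octet as two hex digits (an assumption, since only its name and its calls appear in the hunks), shows the values the expected ARP request is built from:

```shell
# Assumed definition: the diff shows only the helper's name and its calls.
ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

dst_ip=$(ip_to_hex 172 16 2 10)      # destination behind the new static route
fip_ip=$(ip_to_hex 172 16 1 2)       # floating IP (external_ip of the NAT rule)
src_ip=$(ip_to_hex 192 168 1 3)      # logical IP of sw0-p1
gw_router=$(ip_to_hex 172 16 1 11)   # next hop for 172.16.2.0/24

# Prints: ac10020a ac100102 c0a80103 ac10010b
echo "$dst_ip $fip_ip $src_ip $gw_router"
```

With these values, the packet appended to `expected` is an ARP request for the next hop 172.16.1.11 whose source is the FIP MAC f0:00:00:01:02:04 and the FIP IP 172.16.1.2, which is what the test expects to see on hv2/vif1-tx.pcap.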
In order to fix the issues introduced by commit
c0bf32d72f8b ("Manage ARP process locally in a DVR scenario "), restore
the previous configuration of table 9 in the ingress router pipeline and
introduce a new stage called 'ip_src_policy'. The new stage sets the
source address info so that FIP traffic is not distributed if DVR is
enabled.

Fixes: c0bf32d72f8b ("Manage ARP process locally in a DVR scenario ")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
---
Changes since v1:
- fixed system-ovn.at test
- added ovn.at test
- fixed documentation
---
 northd/ovn-northd.8.xml | 75 +++++++++++++++++++++--------------------
 northd/ovn-northd.c     | 38 +++++++++------------
 tests/ovn.at            | 56 +++++++++++++++++-------------
 tests/system-ovn.at     | 28 +++++++++++++++
 4 files changed, 115 insertions(+), 82 deletions(-)
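
To make the effect of the new stage concrete, here is a rough sketch of the logical flow that the patched `add_ip_src_policy_flows()` would format for one `dnat_and_snat` rule. The helper name `ip_src_policy_flow` and the output layout are hypothetical and only loosely imitate `ovn-sbctl dump-flows`; the match and action strings follow the format strings and the removed test expectations visible in the patch.

```shell
# Hypothetical helper (not part of OVN): prints the match and action that the
# patch builds for one dnat_and_snat NAT rule in lr_in_ip_src_policy.
# Args: family (4 or 6), external_ip, external_mac, logical_ip, logical_port.
ip_src_policy_flow() {
    family=$1 external_ip=$2 external_mac=$3 logical_ip=$4 logical_port=$5
    # The patch uses the "xx" register prefix for IPv6 and none for IPv4.
    if [ "$family" = 6 ]; then prefix=xx; else prefix=; fi
    printf 'table=12(lr_in_ip_src_policy), priority=100, '
    printf 'match=(ip%s.src == %s && is_chassis_resident("%s")), ' \
        "$family" "$logical_ip" "$logical_port"
    printf 'action=(eth.src = %s; %sreg1 = %s; next;)\n' \
        "$external_mac" "$prefix" "$external_ip"
}

# NAT rule taken from the new ovn.at test case.
ip_src_policy_flow 4 172.16.1.2 f0:00:00:01:02:04 192.168.1.3 sw0-p1

# Every logical router also gets a priority-0 fallback in the same stage.
echo 'table=12(lr_in_ip_src_policy), priority=0, match=(1), action=(next;)'
```

Unlike the priority-400 flow removed from lr_in_ip_routing, this flow no longer sets `outport` or `reg0`, so route selection is left entirely to table 9 and only the source MAC and source IP register are rewritten afterwards.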