Message ID: 20200326144912.804557-1-numans@ovn.org
State: Accepted
Series: [ovs-dev,ovn,v3] ovn-northd: Skip unsnat flows for load balancer vips in router ingress pipeline
Acked-by: Mark Michelson <mmichels@redhat.com>

On 3/26/20 10:49 AM, numans@ovn.org wrote:
> From: Numan Siddique <numans@ovn.org>
>
> Suppose there is the below NAT entry with external_ip = 172.168.0.100:
>
> nat <UUID>
>     external ip: "172.168.0.100"
>     logical ip: "10.0.0.0/24"
>     type: "snat"
>
> And a load balancer with the VIP 172.168.0.100:
>
> _uuid               : <UUID>
> external_ids        : {}
> name                : lb1
> protocol            : tcp
> vips                : {"172.168.0.100:8080"="10.0.0.4:8080"}
>
> If these are associated with a gateway logical router, then we will see
> the below lflows in the router pipeline:
>
> ...
> table=5 (lr_in_unsnat), priority=90, match=(ip && ip4.dst == 172.168.0.100), action=(ct_snat;)
> ...
> table=6 (lr_in_dnat), priority=120, match=(ct.new && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_lb(10.0.0.4:8080);)
> table=6 (lr_in_dnat), priority=120, match=(ct.est && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_dnat;)
>
> When a new connection packet destined for the lb vip 172.168.0.100 and
> tcp.dst = 8080 is received, the ct.new flow in lr_in_dnat is hit and the
> packet's ip4.dst is dnatted to 10.0.0.4 in the dnat conntrack zone.
>
> But a subsequent packet destined to the vip does not hit the ct.est
> lflow in the lr_in_dnat stage. In this case, the packet first hits the
> priority-90 lr_in_unsnat flow mentioned above, with the action ct_snat.
> Even though ct_snat should have no effect, it looks like it resets the
> ct flags.
>
> In the case of tcp, the ct.new flow is hit instead of ct.est. In the
> case of sctp, neither of the above lflows in the lr_in_dnat stage is
> hit.
>
> This needs to be investigated further, but we can avoid this scenario
> in OVN by adding the below lflow:
>
> table=5 (lr_in_unsnat), priority=120, match=(ip4 && ip4.dst == 172.168.0.100 && tcp.dst == 8080), action=(next;)
>
> This patch adds the above lflow if the lb vip also has an entry in the
> NAT table.
>
> This patch is also required to support sctp load balancers in OVN.
>
> Reported-by: Tim Rozet <trozet@redhat.com>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1815217
> Signed-off-by: Numan Siddique <numans@ovn.org>
> ---
> v2 -> v3
> --------
>   * v2 was nothing but v1, as I forgot to commit the modified files.
>   * Also updated ovn-northd.8.xml.
>
> v1 -> v2
> --------
>   * Addressed the review comments from Mark: added the protocol in the
>     match.
>   * I'm not sure if this is an ovs issue, so updated the commit message
>     accordingly.
>
>  northd/ovn-northd.8.xml | 27 +++++++++++++++++++
>  northd/ovn-northd.c     | 59 +++++++++++++++++++++++++++++++----------
>  tests/ovn-northd.at     | 38 ++++++++++++++++++++++++++
>  tests/system-ovn.at     | 51 ++++++++++++++++++++++++++++++++++-
>  4 files changed, 160 insertions(+), 15 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 1e0993e07..b5e4d6d84 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -2075,6 +2075,33 @@ icmp6 {
>          unSNATted here.
>        </p>
>
> +      <p>Ingress Table 5: UNSNAT on Gateway and Distributed Routers</p>
> +      <ul>
> +        <li>
> +          <p>
> +            If the Router (Gateway or Distributed) is configured with
> +            load balancers, then below lflows are added:
> +          </p>
> +
> +          <p>
> +            For each IPv4 address <var>A</var> defined as load balancer
> +            VIP with the protocol <var>P</var> (and the protocol port
> +            <var>T</var> if defined) is also present as an
> +            <code>external_ip</code> in the NAT table,
> +            a priority-120 logical flow is added with the match
> +            <code>ip4 && ip4.dst == <var>A</var> &&
> +            <var>P</var></code> with the action <code>next;</code> to
> +            advance the packet to the next table. If the load balancer
> +            has protocol port <code>B</code> defined, then the match also has
> +            <code><var>P</var>.dst == <var>B</var></code>.
> +          </p>
> +
> +          <p>
> +            The above flows are also added for IPv6 load balancers.
> +          </p>
> +        </li>
> +      </ul>
> +
>        <p>Ingress Table 5: UNSNAT on Gateway Routers</p>
>
>        <ul>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index b76df0554..7ef049310 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -7574,7 +7574,7 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
>                     struct ds *match, struct ds *actions, int priority,
>                     const char *lb_force_snat_ip, struct lb_vip *lb_vip,
>                     bool is_udp, struct nbrec_load_balancer *lb,
> -                   struct shash *meter_groups)
> +                   struct shash *meter_groups, struct sset *nat_entries)
>  {
>      build_empty_lb_event_flow(od, lflows, lb_vip, lb, S_ROUTER_IN_DNAT,
>                                meter_groups);
> @@ -7607,6 +7607,40 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
>      free(new_match);
>      free(est_match);
>
> +    const char *ip_match = NULL;
> +    if (lb_vip->addr_family == AF_INET) {
> +        ip_match = "ip4";
> +    } else {
> +        ip_match = "ip6";
> +    }
> +
> +    if (sset_contains(nat_entries, lb_vip->vip)) {
> +        /* The load balancer vip is also present in the NAT entries.
> +         * So add a high priority lflow to advance the packet
> +         * destined to the vip (and the vip port if defined)
> +         * in the S_ROUTER_IN_UNSNAT stage.
> +         * There seems to be an issue with ovs-vswitchd. When the new
> +         * connection packet destined for the lb vip is received,
> +         * it is dnat'ed in the S_ROUTER_IN_DNAT stage in the dnat
> +         * conntrack zone. For the next packet, if it goes through
> +         * unsnat stage, the conntrack flags are not set properly, and
> +         * it doesn't hit the established state flows in
> +         * S_ROUTER_IN_DNAT stage. */
> +        struct ds unsnat_match = DS_EMPTY_INITIALIZER;
> +        ds_put_format(&unsnat_match, "%s && %s.dst == %s && %s",
> +                      ip_match, ip_match, lb_vip->vip,
> +                      is_udp ? "udp" : "tcp");
> +        if (lb_vip->vip_port) {
> +            ds_put_format(&unsnat_match, " && %s.dst == %d",
> +                          is_udp ? "udp" : "tcp", lb_vip->vip_port);
> +        }
> +
> +        ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, 120,
> +                                ds_cstr(&unsnat_match), "next;",
> +                                &lb->header_);
> +
> +        ds_destroy(&unsnat_match);
> +    }
> +
>      if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) {
>          return;
>      }
> @@ -7616,19 +7650,11 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
>       * router has a gateway router port associated.
>       */
>      struct ds undnat_match = DS_EMPTY_INITIALIZER;
> -    if (lb_vip->addr_family == AF_INET) {
> -        ds_put_cstr(&undnat_match, "ip4 && (");
> -    } else {
> -        ds_put_cstr(&undnat_match, "ip6 && (");
> -    }
> +    ds_put_format(&undnat_match, "%s && (", ip_match);
>
>      for (size_t i = 0; i < lb_vip->n_backends; i++) {
>          struct lb_vip_backend *backend = &lb_vip->backends[i];
> -        if (backend->addr_family == AF_INET) {
> -            ds_put_format(&undnat_match, "(ip4.src == %s", backend->ip);
> -        } else {
> -            ds_put_format(&undnat_match, "(ip6.src == %s", backend->ip);
> -        }
> +        ds_put_format(&undnat_match, "(%s.src == %s", ip_match, backend->ip);
>
>          if (backend->port) {
>              ds_put_format(&undnat_match, " && %s.src == %d) || ",
> @@ -8879,6 +8905,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>                                      &nat->header_);
>              sset_add(&nat_entries, nat->external_ip);
>          }
> +    } else {
> +        /* Add the NAT external_ip to the nat_entries even for
> +         * gateway routers. This is required for adding load balancer
> +         * flows. */
> +        sset_add(&nat_entries, nat->external_ip);
>      }
>
>      /* Egress UNDNAT table: It is for already established connections'
> @@ -9059,8 +9090,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          }
>      }
>
> -    sset_destroy(&nat_entries);
> -
>      /* Handle force SNAT options set in the gateway router. */
>      if (dnat_force_snat_ip && !od->l3dgw_port) {
>          /* If a packet with destination IP address as that of the
> @@ -9121,6 +9150,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>      /* Load balancing and packet defrag are only valid on
>       * Gateway routers or router with gateway port. */
>      if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) {
> +        sset_destroy(&nat_entries);
>          continue;
>      }
>
> @@ -9195,10 +9225,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          }
>          add_router_lb_flow(lflows, od, &match, &actions, prio,
>                             lb_force_snat_ip, lb_vip, is_udp,
> -                           nb_lb, meter_groups);
> +                           nb_lb, meter_groups, &nat_entries);
>      }
>      sset_destroy(&all_ips);
> +    sset_destroy(&nat_entries);
>  }
>
>  /* Logical router ingress table ND_RA_OPTIONS & ND_RA_RESPONSE: IPv6 Router
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index a2989e78e..d127152f5 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -1288,3 +1288,41 @@ ovn-nbctl --wait=sb lb-del lb2
>  OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor | wc -l`])
>
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- Load balancer VIP in NAT entries])
> +AT_SKIP_IF([test $HAVE_PYTHON = no])
> +ovn_start
> +
> +ovn-nbctl lr-add lr0
> +ovn-nbctl lrp-add lr0 lr0-public 00:00:01:01:02:04 192.168.2.1/24
> +ovn-nbctl lrp-add lr0 lr0-join 00:00:01:01:02:04 10.10.0.1/24
> +
> +ovn-nbctl set logical_router lr0 options:chassis=ch1
> +
> +ovn-nbctl lb-add lb1 "192.168.2.1:8080" "10.0.0.4:8080"
> +ovn-nbctl lb-add lb2 "192.168.2.4:8080" "10.0.0.5:8080" udp
> +ovn-nbctl lb-add lb3 "192.168.2.5:8080" "10.0.0.6:8080"
> +ovn-nbctl lb-add lb4 "192.168.2.6:8080" "10.0.0.7:8080"
> +
> +ovn-nbctl lr-lb-add lr0 lb1
> +ovn-nbctl lr-lb-add lr0 lb2
> +ovn-nbctl lr-lb-add lr0 lb3
> +ovn-nbctl lr-lb-add lr0 lb4
> +
> +ovn-nbctl lr-nat-add lr0 snat 192.168.2.1 10.0.0.0/24
> +ovn-nbctl lr-nat-add lr0 dnat_and_snat 192.168.2.4 10.0.0.4
> +ovn-nbctl lr-nat-add lr0 dnat 192.168.2.5 10.0.0.5
> +
> +OVS_WAIT_UNTIL([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> +grep "ip4 && ip4.dst == 192.168.2.1 && tcp && tcp.dst == 8080" -c) ])
> +
> +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> +grep "ip4 && ip4.dst == 192.168.2.4 && udp && udp.dst == 8080" -c) ])
> +
> +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> +grep "ip4 && ip4.dst == 192.168.2.5 && tcp && tcp.dst == 8080" -c) ])
> +
> +AT_CHECK([test 0 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> +grep "ip4 && ip4.dst == 192.168.2.6 && tcp && tcp.dst == 8080" -c) ])
> +
> +AT_CLEANUP
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 3b3379840..f1ae69b20 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -1635,7 +1635,6 @@ ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.16
>  ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
>  external_ip=30.0.0.2 -- add logical_router R2 nat @nat
>
> -
>  # Wait for ovn-controller to catch up.
>  ovn-nbctl --wait=hv sync
>  OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
> @@ -1671,6 +1670,56 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(sr
>  tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
>  ])
>
> +check_est_flows () {
> +    n=$(ovs-ofctl dump-flows br-int table=14 | grep \
> +"priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \
> +| grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p')
> +
> +    echo "n_packets=$n"
> +    test "$n" != 0
> +}
> +
> +OVS_WAIT_UNTIL([check_est_flows], [check established flows])
> +
> +ovn-nbctl set logical_router R2 options:lb_force_snat_ip="20.0.0.2"
> +
> +# Destroy the load balancer and create it again. ovn-controller will
> +# clear the OF flows and re-add them, which clears the n_packets
> +# for these flows.
> +ovn-nbctl destroy load_balancer $uuid
> +uuid=`ovn-nbctl create load_balancer vips:30.0.0.1="192.168.1.2,192.168.2.2"`
> +ovn-nbctl set logical_router R2 load_balancer=$uuid
> +
> +# Config OVN load-balancer with another VIP (this time with ports).
> +ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"'
> +
> +ovn-nbctl list load_balancer
> +ovn-sbctl dump-flows R2
> +OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-flows br-int table=41 | \
> +grep 'nat(src=20.0.0.2)'])
> +
> +dnl Test load-balancing that includes L4 ports in NAT.
> +for i in `seq 1 20`; do
> +    echo Request $i
> +    NS_CHECK_EXEC([alice1], [wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
> +done
> +
> +dnl Each server should have at least one connection.
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) |
> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> +])
> +
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(20.0.0.2) |
> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> +tcp,orig=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> +tcp,orig=(src=172.16.1.2,dst=192.168.2.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> +])
> +
> +OVS_WAIT_UNTIL([check_est_flows], [check established flows])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
On Thu, Mar 26, 2020 at 8:29 PM Mark Michelson <mmichels@redhat.com> wrote:
>
> Acked-by: Mark Michelson <mmichels@redhat.com>

Thanks, I applied the patch to master.

Numan

_______________________________________________
dev mailing list
dev@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Fri, Mar 27, 2020 at 2:35 PM Numan Siddique <numans@ovn.org> wrote:
>
> On Thu, Mar 26, 2020 at 8:29 PM Mark Michelson <mmichels@redhat.com> wrote:
> >
> > Acked-by: Mark Michelson <mmichels@redhat.com>
>
> Thanks, I applied the patch to master.

I also applied the patch to branch-20.03, as it's a bug and affects the
tcp load balancers.

Thanks
Numan
*/ > > > if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) { > > > + sset_destroy(&nat_entries); > > > continue; > > > } > > > > > > @@ -9195,10 +9225,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, > > > } > > > add_router_lb_flow(lflows, od, &match, &actions, prio, > > > lb_force_snat_ip, lb_vip, is_udp, > > > - nb_lb, meter_groups); > > > + nb_lb, meter_groups, &nat_entries); > > > } > > > } > > > sset_destroy(&all_ips); > > > + sset_destroy(&nat_entries); > > > } > > > > > > /* Logical router ingress table ND_RA_OPTIONS & ND_RA_RESPONSE: IPv6 Router > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > > > index a2989e78e..d127152f5 100644 > > > --- a/tests/ovn-northd.at > > > +++ b/tests/ovn-northd.at > > > @@ -1288,3 +1288,41 @@ ovn-nbctl --wait=sb lb-del lb2 > > > OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor | wc -l`]) > > > > > > AT_CLEANUP > > > + > > > +AT_SETUP([ovn -- Load balancer VIP in NAT entries]) > > > +AT_SKIP_IF([test $HAVE_PYTHON = no]) > > > +ovn_start > > > + > > > +ovn-nbctl lr-add lr0 > > > +ovn-nbctl lrp-add lr0 lr0-public 00:00:01:01:02:04 192.168.2.1/24 > > > +ovn-nbctl lrp-add lr0 lr0-join 00:00:01:01:02:04 10.10.0.1/24 > > > + > > > +ovn-nbctl set logical_router lr0 options:chassis=ch1 > > > + > > > +ovn-nbctl lb-add lb1 "192.168.2.1:8080" "10.0.0.4:8080" > > > +ovn-nbctl lb-add lb2 "192.168.2.4:8080" "10.0.0.5:8080" udp > > > +ovn-nbctl lb-add lb3 "192.168.2.5:8080" "10.0.0.6:8080" > > > +ovn-nbctl lb-add lb4 "192.168.2.6:8080" "10.0.0.7:8080" > > > + > > > +ovn-nbctl lr-lb-add lr0 lb1 > > > +ovn-nbctl lr-lb-add lr0 lb2 > > > +ovn-nbctl lr-lb-add lr0 lb3 > > > +ovn-nbctl lr-lb-add lr0 lb4 > > > + > > > +ovn-nbctl lr-nat-add lr0 snat 192.168.2.1 10.0.0.0/24 > > > +ovn-nbctl lr-nat-add lr0 dnat_and_snat 192.168.2.4 10.0.0.4 > > > +ovn-nbctl lr-nat-add lr0 dnat 192.168.2.5 10.0.0.5 > > > + > > > +OVS_WAIT_UNTIL([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ > > > 
+grep "ip4 && ip4.dst == 192.168.2.1 && tcp && tcp.dst == 8080" -c) ]) > > > + > > > +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ > > > +grep "ip4 && ip4.dst == 192.168.2.4 && udp && udp.dst == 8080" -c) ]) > > > + > > > +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ > > > +grep "ip4 && ip4.dst == 192.168.2.5 && tcp && tcp.dst == 8080" -c) ]) > > > + > > > +AT_CHECK([test 0 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ > > > +grep "ip4 && ip4.dst == 192.168.2.6 && tcp && tcp.dst == 8080" -c) ]) > > > + > > > +AT_CLEANUP > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > index 3b3379840..f1ae69b20 100644 > > > --- a/tests/system-ovn.at > > > +++ b/tests/system-ovn.at > > > @@ -1635,7 +1635,6 @@ ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.16 > > > ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \ > > > external_ip=30.0.0.2 -- add logical_router R2 nat @nat > > > > > > - > > > # Wait for ovn-controller to catch up. 
> > > ovn-nbctl --wait=hv sync > > > OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \ > > > @@ -1671,6 +1670,56 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(sr > > > tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > ]) > > > > > > +check_est_flows () { > > > + n=$(ovs-ofctl dump-flows br-int table=14 | grep \ > > > +"priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \ > > > +| grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p') > > > + > > > + echo "n_packets=$n" > > > + test "$n" != 0 > > > +} > > > + > > > +OVS_WAIT_UNTIL([check_est_flows], [check established flows]) > > > + > > > + > > > +ovn-nbctl set logical_router R2 options:lb_force_snat_ip="20.0.0.2" > > > + > > > +# Destroy the load balancer and create again. ovn-controller will > > > +# clear the OF flows and re add again and clears the n_packets > > > +# for these flows. > > > +ovn-nbctl destroy load_balancer $uuid > > > +uuid=`ovn-nbctl create load_balancer vips:30.0.0.1="192.168.1.2,192.168.2.2"` > > > +ovn-nbctl set logical_router R2 load_balancer=$uuid > > > + > > > +# Config OVN load-balancer with another VIP (this time with ports). > > > +ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"' > > > + > > > +ovn-nbctl list load_balancer > > > +ovn-sbctl dump-flows R2 > > > +OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-flows br-int table=41 | \ > > > +grep 'nat(src=20.0.0.2)']) > > > + > > > +dnl Test load-balancing that includes L4 ports in NAT. > > > +for i in `seq 1 20`; do > > > + echo Request $i > > > + NS_CHECK_EXEC([alice1], [wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log]) > > > +done > > > + > > > +dnl Each server should have at least one connection. 
> > > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) | > > > +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > +]) > > > + > > > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(20.0.0.2) | > > > +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > +tcp,orig=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > +tcp,orig=(src=172.16.1.2,dst=192.168.2.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > +]) > > > + > > > +OVS_WAIT_UNTIL([check_est_flows], [check established flows]) > > > + > > > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > > > > > as ovn-sb > > > > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 1e0993e07..b5e4d6d84 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2075,6 +2075,33 @@ icmp6 {
         unSNATted here.
       </p>
 
+      <p>Ingress Table 5: UNSNAT on Gateway and Distributed Routers</p>
+      <ul>
+        <li>
+          <p>
+            If the router (Gateway or Distributed) is configured with
+            load balancers, then the below lflows are added:
+          </p>
+
+          <p>
+            For each IPv4 address <var>A</var> defined as a load
+            balancer VIP with protocol <var>P</var> (and protocol
+            port <var>T</var> if defined) that is also present as an
+            <code>external_ip</code> in the NAT table, a priority-120
+            logical flow is added with the match
+            <code>ip4 && ip4.dst == <var>A</var> &&
+            <var>P</var></code> and the action <code>next;</code> to
+            advance the packet to the next table.  If the load
+            balancer has protocol port <var>T</var> defined, then the
+            match also includes <code><var>P</var>.dst == <var>T</var></code>.
+          </p>
+
+          <p>
+            The above flows are also added for IPv6 load balancers.
+          </p>
+        </li>
+      </ul>
+
       <p>Ingress Table 5: UNSNAT on Gateway Routers</p>
 
       <ul>
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index b76df0554..7ef049310 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -7574,7 +7574,7 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
                    struct ds *match, struct ds *actions, int priority,
                    const char *lb_force_snat_ip, struct lb_vip *lb_vip,
                    bool is_udp, struct nbrec_load_balancer *lb,
-                   struct shash *meter_groups)
+                   struct shash *meter_groups, struct sset *nat_entries)
 {
     build_empty_lb_event_flow(od, lflows, lb_vip, lb, S_ROUTER_IN_DNAT,
                               meter_groups);
@@ -7607,6 +7607,40 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
     free(new_match);
     free(est_match);
 
+    const char *ip_match = NULL;
+    if (lb_vip->addr_family == AF_INET) {
+        ip_match = "ip4";
+    } else {
+        ip_match = "ip6";
+    }
+
+    if (sset_contains(nat_entries, lb_vip->vip)) {
+        /* The load balancer vip is also present in the NAT entries.
+         * So add a high priority lflow to advance the packet
+         * destined to the vip (and the vip port if defined)
+         * in the S_ROUTER_IN_UNSNAT stage.
+         * There seems to be an issue with ovs-vswitchd. When the new
+         * connection packet destined for the lb vip is received,
+         * it is dnat'ed in the S_ROUTER_IN_DNAT stage in the dnat
+         * conntrack zone. For the next packet, if it goes through
+         * the unsnat stage, the conntrack flags are not set properly,
+         * and it doesn't hit the established state flows in the
+         * S_ROUTER_IN_DNAT stage. */
+        struct ds unsnat_match = DS_EMPTY_INITIALIZER;
+        ds_put_format(&unsnat_match, "%s && %s.dst == %s && %s",
+                      ip_match, ip_match, lb_vip->vip,
+                      is_udp ? "udp" : "tcp");
+        if (lb_vip->vip_port) {
+            ds_put_format(&unsnat_match, " && %s.dst == %d",
+                          is_udp ? "udp" : "tcp", lb_vip->vip_port);
+        }
+
+        ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, 120,
+                                ds_cstr(&unsnat_match), "next;", &lb->header_);
+
+        ds_destroy(&unsnat_match);
+    }
+
     if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) {
         return;
     }
@@ -7616,19 +7650,11 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
      * router has a gateway router port associated.
      */
     struct ds undnat_match = DS_EMPTY_INITIALIZER;
-    if (lb_vip->addr_family == AF_INET) {
-        ds_put_cstr(&undnat_match, "ip4 && (");
-    } else {
-        ds_put_cstr(&undnat_match, "ip6 && (");
-    }
+    ds_put_format(&undnat_match, "%s && (", ip_match);
 
     for (size_t i = 0; i < lb_vip->n_backends; i++) {
         struct lb_vip_backend *backend = &lb_vip->backends[i];
-        if (backend->addr_family == AF_INET) {
-            ds_put_format(&undnat_match, "(ip4.src == %s", backend->ip);
-        } else {
-            ds_put_format(&undnat_match, "(ip6.src == %s", backend->ip);
-        }
+        ds_put_format(&undnat_match, "(%s.src == %s", ip_match, backend->ip);
 
         if (backend->port) {
             ds_put_format(&undnat_match, " && %s.src == %d) || ",
@@ -8879,6 +8905,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                                         &nat->header_);
                 sset_add(&nat_entries, nat->external_ip);
             }
+        } else {
+            /* Add the NAT external_ip to the nat_entries even for
+             * gateway routers. This is required for adding load balancer
+             * flows. */
+            sset_add(&nat_entries, nat->external_ip);
         }
 
         /* Egress UNDNAT table: It is for already established connections'
@@ -9059,8 +9090,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             }
         }
 
-        sset_destroy(&nat_entries);
-
         /* Handle force SNAT options set in the gateway router. */
         if (dnat_force_snat_ip && !od->l3dgw_port) {
             /* If a packet with destination IP address as that of the
@@ -9121,6 +9150,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         /* Load balancing and packet defrag are only valid on
          * Gateway routers or router with gateway port.
          */
         if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) {
+            sset_destroy(&nat_entries);
             continue;
         }
@@ -9195,10 +9225,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                 }
                 add_router_lb_flow(lflows, od, &match, &actions, prio,
                                    lb_force_snat_ip, lb_vip, is_udp,
-                                   nb_lb, meter_groups);
+                                   nb_lb, meter_groups, &nat_entries);
             }
         }
         sset_destroy(&all_ips);
+        sset_destroy(&nat_entries);
     }
 
     /* Logical router ingress table ND_RA_OPTIONS & ND_RA_RESPONSE: IPv6 Router
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index a2989e78e..d127152f5 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1288,3 +1288,41 @@ ovn-nbctl --wait=sb lb-del lb2
 OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor | wc -l`])
 
 AT_CLEANUP
+
+AT_SETUP([ovn -- Load balancer VIP in NAT entries])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+ovn-nbctl lr-add lr0
+ovn-nbctl lrp-add lr0 lr0-public 00:00:01:01:02:04 192.168.2.1/24
+ovn-nbctl lrp-add lr0 lr0-join 00:00:01:01:02:04 10.10.0.1/24
+
+ovn-nbctl set logical_router lr0 options:chassis=ch1
+
+ovn-nbctl lb-add lb1 "192.168.2.1:8080" "10.0.0.4:8080"
+ovn-nbctl lb-add lb2 "192.168.2.4:8080" "10.0.0.5:8080" udp
+ovn-nbctl lb-add lb3 "192.168.2.5:8080" "10.0.0.6:8080"
+ovn-nbctl lb-add lb4 "192.168.2.6:8080" "10.0.0.7:8080"
+
+ovn-nbctl lr-lb-add lr0 lb1
+ovn-nbctl lr-lb-add lr0 lb2
+ovn-nbctl lr-lb-add lr0 lb3
+ovn-nbctl lr-lb-add lr0 lb4
+
+ovn-nbctl lr-nat-add lr0 snat 192.168.2.1 10.0.0.0/24
+ovn-nbctl lr-nat-add lr0 dnat_and_snat 192.168.2.4 10.0.0.4
+ovn-nbctl lr-nat-add lr0 dnat 192.168.2.5 10.0.0.5
+
+OVS_WAIT_UNTIL([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
+grep "ip4 && ip4.dst == 192.168.2.1 && tcp && tcp.dst == 8080" -c) ])
+
+AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
+grep "ip4 && ip4.dst == 192.168.2.4 && udp && udp.dst == 8080" -c) ])
+
+AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
+grep "ip4 && ip4.dst == 192.168.2.5 && tcp && tcp.dst == 8080" -c) ])
+
+AT_CHECK([test 0 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
+grep "ip4 && ip4.dst == 192.168.2.6 && tcp && tcp.dst == 8080" -c) ])
+
+AT_CLEANUP
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 3b3379840..f1ae69b20 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -1635,7 +1635,6 @@ ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.16
 ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
 external_ip=30.0.0.2 -- add logical_router R2 nat @nat
 
-
 # Wait for ovn-controller to catch up.
 ovn-nbctl --wait=hv sync
 OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
@@ -1671,6 +1670,56 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(sr
 tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
 ])
 
+check_est_flows () {
+    n=$(ovs-ofctl dump-flows br-int table=14 | grep \
+"priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \
+| grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p')
+
+    echo "n_packets=$n"
+    test "$n" != 0
+}
+
+OVS_WAIT_UNTIL([check_est_flows], [check established flows])
+
+
+ovn-nbctl set logical_router R2 options:lb_force_snat_ip="20.0.0.2"
+
+# Destroy the load balancer and create it again.  ovn-controller will
+# clear the OF flows and re-add them, which resets the n_packets
+# counters for these flows.
+ovn-nbctl destroy load_balancer $uuid
+uuid=`ovn-nbctl create load_balancer vips:30.0.0.1="192.168.1.2,192.168.2.2"`
+ovn-nbctl set logical_router R2 load_balancer=$uuid
+
+# Config OVN load-balancer with another VIP (this time with ports).
+ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"'
+
+ovn-nbctl list load_balancer
+ovn-sbctl dump-flows R2
+OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-flows br-int table=41 | \
+grep 'nat(src=20.0.0.2)'])
+
+dnl Test load-balancing that includes L4 ports in NAT.
+for i in `seq 1 20`; do
+    echo Request $i
+    NS_CHECK_EXEC([alice1], [wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
+done
+
+dnl Each server should have at least one connection.
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) |
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(20.0.0.2) |
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+tcp,orig=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=192.168.2.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+])
+
+OVS_WAIT_UNTIL([check_est_flows], [check established flows])
+
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
 
 as ovn-sb