From patchwork Thu Mar 26 14:49:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1262057 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=fraxinus.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ovn.org Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48p7DX1gwGz9sSG for ; Fri, 27 Mar 2020 01:49:31 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id 6E503872D3; Thu, 26 Mar 2020 14:49:30 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cfJ820WgYEoS; Thu, 26 Mar 2020 14:49:29 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by fraxinus.osuosl.org (Postfix) with ESMTP id 228E08721A; Thu, 26 Mar 2020 14:49:29 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1379BC1D7C; Thu, 26 Mar 2020 14:49:29 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 747CCC0177 for ; Thu, 26 Mar 2020 14:49:28 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id 70B34878CA for ; Thu, 26 Mar 2020 14:49:28 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id RZMtArIf6q9r for ; Thu, 26 Mar 2020 14:49:26 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from relay9-d.mail.gandi.net (relay9-d.mail.gandi.net [217.70.183.199]) by whitealder.osuosl.org (Postfix) with ESMTPS id 6B8BF878BC for ; Thu, 26 Mar 2020 14:49:26 +0000 (UTC) X-Originating-IP: 115.99.134.187 Received: from nummac.local (unknown [115.99.134.187]) (Authenticated sender: numans@ovn.org) by relay9-d.mail.gandi.net (Postfix) with ESMTPSA id A02DCFF80F; Thu, 26 Mar 2020 14:49:21 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 26 Mar 2020 20:19:12 +0530 Message-Id: <20200326144912.804557-1-numans@ovn.org> X-Mailer: git-send-email 2.24.1 MIME-Version: 1.0 Cc: Tim Rozet Subject: [ovs-dev] [PATCH ovn v3] ovn-northd: Skip unsnat flows for load balancer vips in router ingress pipeline X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique Suppose there is below NAT entry with external_ip = 172.168.0.100 nat external ip: "172.168.0.100" logical ip: "10.0.0.0/24" type: "snat" And a load balancer with the VIP - 172.168.0.100 _uuid : external_ids : {} name : lb1 protocol : tcp vips : {"172.168.0.100:8080"="10.0.0.4:8080"} And if these are associated to a gateway logical router Then we will see the below lflows in the router pipeline ... table=5 (lr_in_unsnat ), priority=90 , match=(ip && ip4.dst == 172.168.0.100), action=(ct_snat;) ... table=6 (lr_in_dnat ), priority=120 , match=(ct.new && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_lb(10.0.0.4:8080);) table=6 (lr_in_dnat ), priority=120 , match=(ct.est && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_dnat;) When a new connection packet destinated for the lb vip 172.168.0.100 and tcp.dst = 8080 is received, the ct.new flow in the lr_in_dnat is hit and the packet's ip4.dst is dnatted to 10.0.0.4 in the dnat conntrack zone. But for the subsequent packet destined to the vip, the ct.est lflow in the lr_in_dnat stage doesn't get hit. In this case, the packet first hits the lr_in_unsnat pri 90 flow as mentioned above with the action ct_snat. Even though ct_snat should have no effect, looks like it is resetting the ct flags. In the case of tcp, the ct.new flow is hit instead of ct.est. In the the case of sctp, neither of the above lflows in lr_in_dnat stage hit. This needs to be investigated further. But we can avoid this scenario in OVN by adding the below lflow. table=5 (lr_in_unsnat ), priority=120 , match=(ip4 && ip4.dst == 172.168.0.100 && tcp.dst == 8080), action=(next;) This patch adds the above lflow if the lb vip also has an entry in the NAT table. This patch is also required to support sctp load balancers in OVN. Reported-by: Tim Rozet Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1815217 Signed-off-by: Numan Siddique Acked-by: Mark Michelson --- v2 -> v3 ----- * v2 was nothing but v1 as I forgot to commit the modified files. * Also updated the ovn-northd.8.xml. v1 -> v2 ------ * Addressed the review comments from Mark - Added the protocol in the match. * I'm not sure if this is an ovs issue. So updated the commit message accordingly. northd/ovn-northd.8.xml | 27 +++++++++++++++++++ northd/ovn-northd.c | 59 +++++++++++++++++++++++++++++++---------- tests/ovn-northd.at | 38 ++++++++++++++++++++++++++ tests/system-ovn.at | 51 ++++++++++++++++++++++++++++++++++- 4 files changed, 160 insertions(+), 15 deletions(-) diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 1e0993e07..b5e4d6d84 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -2075,6 +2075,33 @@ icmp6 { unSNATted here.

+

Ingress Table 5: UNSNAT on Gateway and Distributed Routers

+
    +
  • +

    + If the Router (Gateway or Distributed) is configured with + load balancers, then below lflows are added: +

    + +

    + For each IPv4 address A defined as load balancer + VIP with the protocol P (and the protocol port + T if defined) is also present as an + external_ip in the NAT table, + a priority-120 logical flow is added with the match + ip4 && ip4.dst == A && + P with the action next; to + advance the packet to the next table. If the load balancer + has protocol port B defined, then the match also has + P.dst == B. +

    + +

    + The above flows are also added for IPv6 load balancers. +

    +
  • +
+

Ingress Table 5: UNSNAT on Gateway Routers

    diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index b76df0554..7ef049310 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -7574,7 +7574,7 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od, struct ds *match, struct ds *actions, int priority, const char *lb_force_snat_ip, struct lb_vip *lb_vip, bool is_udp, struct nbrec_load_balancer *lb, - struct shash *meter_groups) + struct shash *meter_groups, struct sset *nat_entries) { build_empty_lb_event_flow(od, lflows, lb_vip, lb, S_ROUTER_IN_DNAT, meter_groups); @@ -7607,6 +7607,40 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od, free(new_match); free(est_match); + const char *ip_match = NULL; + if (lb_vip->addr_family == AF_INET) { + ip_match = "ip4"; + } else { + ip_match = "ip6"; + } + + if (sset_contains(nat_entries, lb_vip->vip)) { + /* The load balancer vip is also present in the NAT entries. + * So add a high priority lflow to advance the the packet + * destined to the vip (and the vip port if defined) + * in the S_ROUTER_IN_UNSNAT stage. + * There seems to be an issue with ovs-vswitchd. When the new + * connection packet destined for the lb vip is received, + * it is dnat'ed in the S_ROUTER_IN_DNAT stage in the dnat + * conntrack zone. For the next packet, if it goes through + * unsnat stage, the conntrack flags are not set properly, and + * it doesn't hit the established state flows in + * S_ROUTER_IN_DNAT stage. */ + struct ds unsnat_match = DS_EMPTY_INITIALIZER; + ds_put_format(&unsnat_match, "%s && %s.dst == %s && %s", + ip_match, ip_match, lb_vip->vip, + is_udp ? "udp" : "tcp"); + if (lb_vip->vip_port) { + ds_put_format(&unsnat_match, " && %s.dst == %d", + is_udp ? "udp" : "tcp", lb_vip->vip_port); + } + + ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, 120, + ds_cstr(&unsnat_match), "next;", &lb->header_); + + ds_destroy(&unsnat_match); + } + if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) { return; } @@ -7616,19 +7650,11 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od, * router has a gateway router port associated. */ struct ds undnat_match = DS_EMPTY_INITIALIZER; - if (lb_vip->addr_family == AF_INET) { - ds_put_cstr(&undnat_match, "ip4 && ("); - } else { - ds_put_cstr(&undnat_match, "ip6 && ("); - } + ds_put_format(&undnat_match, "%s && (", ip_match); for (size_t i = 0; i < lb_vip->n_backends; i++) { struct lb_vip_backend *backend = &lb_vip->backends[i]; - if (backend->addr_family == AF_INET) { - ds_put_format(&undnat_match, "(ip4.src == %s", backend->ip); - } else { - ds_put_format(&undnat_match, "(ip6.src == %s", backend->ip); - } + ds_put_format(&undnat_match, "(%s.src == %s", ip_match, backend->ip); if (backend->port) { ds_put_format(&undnat_match, " && %s.src == %d) || ", @@ -8879,6 +8905,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, &nat->header_); sset_add(&nat_entries, nat->external_ip); } + } else { + /* Add the NAT external_ip to the nat_entries even for + * gateway routers. This is required for adding load balancer + * flows.*/ + sset_add(&nat_entries, nat->external_ip); } /* Egress UNDNAT table: It is for already established connections' @@ -9059,8 +9090,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, } } - sset_destroy(&nat_entries); - /* Handle force SNAT options set in the gateway router. */ if (dnat_force_snat_ip && !od->l3dgw_port) { /* If a packet with destination IP address as that of the @@ -9121,6 +9150,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, /* Load balancing and packet defrag are only valid on * Gateway routers or router with gateway port. */ if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) { + sset_destroy(&nat_entries); continue; } @@ -9195,10 +9225,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, } add_router_lb_flow(lflows, od, &match, &actions, prio, lb_force_snat_ip, lb_vip, is_udp, - nb_lb, meter_groups); + nb_lb, meter_groups, &nat_entries); } } sset_destroy(&all_ips); + sset_destroy(&nat_entries); } /* Logical router ingress table ND_RA_OPTIONS & ND_RA_RESPONSE: IPv6 Router diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index a2989e78e..d127152f5 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -1288,3 +1288,41 @@ ovn-nbctl --wait=sb lb-del lb2 OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor | wc -l`]) AT_CLEANUP + +AT_SETUP([ovn -- Load balancer VIP in NAT entries]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +ovn_start + +ovn-nbctl lr-add lr0 +ovn-nbctl lrp-add lr0 lr0-public 00:00:01:01:02:04 192.168.2.1/24 +ovn-nbctl lrp-add lr0 lr0-join 00:00:01:01:02:04 10.10.0.1/24 + +ovn-nbctl set logical_router lr0 options:chassis=ch1 + +ovn-nbctl lb-add lb1 "192.168.2.1:8080" "10.0.0.4:8080" +ovn-nbctl lb-add lb2 "192.168.2.4:8080" "10.0.0.5:8080" udp +ovn-nbctl lb-add lb3 "192.168.2.5:8080" "10.0.0.6:8080" +ovn-nbctl lb-add lb4 "192.168.2.6:8080" "10.0.0.7:8080" + +ovn-nbctl lr-lb-add lr0 lb1 +ovn-nbctl lr-lb-add lr0 lb2 +ovn-nbctl lr-lb-add lr0 lb3 +ovn-nbctl lr-lb-add lr0 lb4 + +ovn-nbctl lr-nat-add lr0 snat 192.168.2.1 10.0.0.0/24 +ovn-nbctl lr-nat-add lr0 dnat_and_snat 192.168.2.4 10.0.0.4 +ovn-nbctl lr-nat-add lr0 dnat 192.168.2.5 10.0.0.5 + +OVS_WAIT_UNTIL([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ +grep "ip4 && ip4.dst == 192.168.2.1 && tcp && tcp.dst == 8080" -c) ]) + +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ +grep "ip4 && ip4.dst == 192.168.2.4 && udp && udp.dst == 8080" -c) ]) + +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ +grep "ip4 && ip4.dst == 192.168.2.5 && tcp && tcp.dst == 8080" -c) ]) + +AT_CHECK([test 0 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \ +grep "ip4 && ip4.dst == 192.168.2.6 && tcp && tcp.dst == 8080" -c) ]) + +AT_CLEANUP diff --git a/tests/system-ovn.at b/tests/system-ovn.at index 3b3379840..f1ae69b20 100644 --- a/tests/system-ovn.at +++ b/tests/system-ovn.at @@ -1635,7 +1635,6 @@ ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.16 ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \ external_ip=30.0.0.2 -- add logical_router R2 nat @nat - # Wait for ovn-controller to catch up. ovn-nbctl --wait=hv sync OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \ @@ -1671,6 +1670,56 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=,dport=),reply=(sr tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=,dport=),reply=(src=192.168.2.2,dst=172.16.1.2,sport=,dport=),zone=,protoinfo=(state=) ]) +check_est_flows () { + n=$(ovs-ofctl dump-flows br-int table=14 | grep \ +"priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \ +| grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p') + + echo "n_packets=$n" + test "$n" != 0 +} + +OVS_WAIT_UNTIL([check_est_flows], [check established flows]) + + +ovn-nbctl set logical_router R2 options:lb_force_snat_ip="20.0.0.2" + +# Destroy the load balancer and create again. ovn-controller will +# clear the OF flows and re add again and clears the n_packets +# for these flows. +ovn-nbctl destroy load_balancer $uuid +uuid=`ovn-nbctl create load_balancer vips:30.0.0.1="192.168.1.2,192.168.2.2"` +ovn-nbctl set logical_router R2 load_balancer=$uuid + +# Config OVN load-balancer with another VIP (this time with ports). +ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"' + +ovn-nbctl list load_balancer +ovn-sbctl dump-flows R2 +OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-flows br-int table=41 | \ +grep 'nat(src=20.0.0.2)']) + +dnl Test load-balancing that includes L4 ports in NAT. +for i in `seq 1 20`; do + echo Request $i + NS_CHECK_EXEC([alice1], [wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log]) +done + +dnl Each server should have at least one connection. +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) | +sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=,dport=),reply=(src=192.168.1.2,dst=172.16.1.2,sport=,dport=),zone=,protoinfo=(state=) +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=,dport=),reply=(src=192.168.2.2,dst=172.16.1.2,sport=,dport=),zone=,protoinfo=(state=) +]) + +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(20.0.0.2) | +sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl +tcp,orig=(src=172.16.1.2,dst=192.168.1.2,sport=,dport=),reply=(src=192.168.1.2,dst=20.0.0.2,sport=,dport=),zone=,protoinfo=(state=) +tcp,orig=(src=172.16.1.2,dst=192.168.2.2,sport=,dport=),reply=(src=192.168.2.2,dst=20.0.0.2,sport=,dport=),zone=,protoinfo=(state=) +]) + +OVS_WAIT_UNTIL([check_est_flows], [check established flows]) + OVS_APP_EXIT_AND_WAIT([ovn-controller]) as ovn-sb