
[ovs-dev,v4] northd: centralized reply lb traffic even if FIP is defined

Message ID 70e1d9a538aa79b28473880fc459dc4f783537ce.1685089052.git.lorenzo.bianconi@redhat.com
State Changes Requested
Series [ovs-dev,v4] northd: centralized reply lb traffic even if FIP is defined

Checks

Context                                 Check     Description
ovsrobot/apply-robot                    success   apply and check: success
ovsrobot/github-robot-_Build_and_Test   success   github build: passed
ovsrobot/github-robot-_ovn-kubernetes   fail      github build: failed

Commit Message

Lorenzo Bianconi May 26, 2023, 8:21 a.m. UTC
In the current codebase, for the distributed gw router port use case,
it is not possible to add a load balancer that redirects traffic
to a backend if that backend is also used as the internal IP of a FIP
NAT rule, since the reply traffic is never centralized. Fix the issue
by centralizing the traffic if it is a reply packet for the load balancer.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
---
Changes since v3:
- add ovn unit-test and get rid of system-ovn unit-test.
- minor changes in ovn-northd unit-test.
Changes since v2:
- rebase on top of ovn main branch
- fix spelling
Changes since v1:
- add new system-test and unit-test
---
 northd/northd.c         |  15 +++++
 northd/ovn-northd.8.xml |  16 +++++
 tests/ovn-northd.at     |  50 +++++++++++++++
 tests/ovn.at            | 137 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 218 insertions(+)

Comments

Simon Horman May 26, 2023, 11:27 a.m. UTC | #1
On Fri, May 26, 2023 at 10:21:58AM +0200, Lorenzo Bianconi wrote:
> In the current codebase, for the distributed gw router port use case,
> it is not possible to add a load balancer that redirects traffic
> to a backend if that backend is also used as the internal IP of a FIP
> NAT rule, since the reply traffic is never centralized. Fix the issue
> by centralizing the traffic if it is a reply packet for the load balancer.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Ilya Maximets May 29, 2023, 3:21 p.m. UTC | #2
On 5/26/23 10:21, Lorenzo Bianconi wrote:
> In the current codebase, for the distributed gw router port use case,
> it is not possible to add a load balancer that redirects traffic
> to a backend if that backend is also used as the internal IP of a FIP
> NAT rule, since the reply traffic is never centralized. Fix the issue
> by centralizing the traffic if it is a reply packet for the load balancer.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2023609
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> ---
> Changes since v3:
> - add ovn unit-test and get rid of system-ovn unit-test.
> - minor changes in ovn-northd unit-test.
> Changes since v2:
> - rebase on top of ovn main branch
> - fix spelling
> Changes since v1:
> - add new system-test and unit-test
> ---
>  northd/northd.c         |  15 +++++
>  northd/ovn-northd.8.xml |  16 +++++
>  tests/ovn-northd.at     |  50 +++++++++++++++
>  tests/ovn.at            | 137 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 218 insertions(+)
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index a6eca916b..6317c36fd 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -10740,6 +10740,8 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
>                                       struct ovn_datapath *od)
>  {
>      struct ovn_port *dgp = od->l3dgw_ports[0];
> +    struct ds gw_redir_match = DS_EMPTY_INITIALIZER;
> +    struct ds gw_redir_action = DS_EMPTY_INITIALIZER;
>  
>      const char *undnat_action;
>  
> @@ -10784,6 +10786,17 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
>          return;
>      }
>  
> +    /* We need to centralize the LB traffic to properly perform
> +     * the undnat stage.
> +     */
> +    ds_put_format(&gw_redir_match, "%s) && outport == %s",
> +                  ds_cstr(ctx->undnat_match), dgp->json_key);
> +    ds_put_format(&gw_redir_action, "outport = %s; next;",
> +                  dgp->cr_port->json_key);

Hi, Lorenzo.  Not a full review, but this part of the code is potentially
on a hot path in northd processing, so it's better to limit the number
of memory allocations.  It should be easy to use ctx->undnat_match
instead of gw_redir_match and truncate it back after the lflow creation.
And we could store gw_redir_action in the ctx structure and only ds_clear()
it before reusing it.  This should help avoid memory allocations
per datapath.

Best regards, Ilya Maximets.

> +    ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT,
> +                            200, ds_cstr(&gw_redir_match),
> +                            ds_cstr(&gw_redir_action), &ctx->lb->nlb->header_);
> +
>      ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
>                    " && is_chassis_resident(%s)", dgp->json_key, dgp->json_key,
>                    dgp->cr_port->json_key);
> @@ -10791,6 +10804,8 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
>                              ds_cstr(ctx->undnat_match), undnat_action,
>                              &ctx->lb->nlb->header_);
>      ds_truncate(ctx->undnat_match, undnat_match_len);
> +    ds_destroy(&gw_redir_match);
> +    ds_destroy(&gw_redir_action);
>  }
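
A minimal sketch of the allocation-avoiding variant Ilya suggests, reusing
ctx->undnat_match for the redirect match and keeping the action buffer in the
flow context. The 'gw_redir_action' member name and the exact placement are
assumptions for illustration, not the committed code; 'undnat_match_len' is
assumed to hold the length of ctx->undnat_match saved earlier in the function,
as in the existing code.

    /* Fragment of build_distr_lrouter_nat_flows_for_lb(), assuming
     * 'struct lrouter_nat_lb_flows_ctx' gains a 'struct ds gw_redir_action'
     * member that is allocated once by the caller. */

    /* Temporarily close the "(ip.src == B0 || ...)" match already held in
     * ctx->undnat_match and restrict it to the distributed gateway port. */
    ds_put_format(ctx->undnat_match, ") && outport == %s", dgp->json_key);

    /* Reuse the per-context action buffer instead of allocating a new
     * dynamic string for every datapath. */
    ds_clear(&ctx->gw_redir_action);
    ds_put_format(&ctx->gw_redir_action, "outport = %s; next;",
                  dgp->cr_port->json_key);

    ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT, 200,
                            ds_cstr(ctx->undnat_match),
                            ds_cstr(&ctx->gw_redir_action),
                            &ctx->lb->nlb->header_);

    /* Drop the appended suffix so the undnat flow below can add its own
     * inport/outport/is_chassis_resident() clause to the same string. */
    ds_truncate(ctx->undnat_match, undnat_match_len);

Truncating back keeps ctx->undnat_match usable for the centralized undnat flow
that follows, so the only per-datapath work is string formatting rather than
heap allocation.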

Patch

diff --git a/northd/northd.c b/northd/northd.c
index a6eca916b..6317c36fd 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10740,6 +10740,8 @@  build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
                                      struct ovn_datapath *od)
 {
     struct ovn_port *dgp = od->l3dgw_ports[0];
+    struct ds gw_redir_match = DS_EMPTY_INITIALIZER;
+    struct ds gw_redir_action = DS_EMPTY_INITIALIZER;
 
     const char *undnat_action;
 
@@ -10784,6 +10786,17 @@  build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
         return;
     }
 
+    /* We need to centralize the LB traffic to properly perform
+     * the undnat stage.
+     */
+    ds_put_format(&gw_redir_match, "%s) && outport == %s",
+                  ds_cstr(ctx->undnat_match), dgp->json_key);
+    ds_put_format(&gw_redir_action, "outport = %s; next;",
+                  dgp->cr_port->json_key);
+    ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_IN_GW_REDIRECT,
+                            200, ds_cstr(&gw_redir_match),
+                            ds_cstr(&gw_redir_action), &ctx->lb->nlb->header_);
+
     ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
                   " && is_chassis_resident(%s)", dgp->json_key, dgp->json_key,
                   dgp->cr_port->json_key);
@@ -10791,6 +10804,8 @@  build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
                             ds_cstr(ctx->undnat_match), undnat_action,
                             &ctx->lb->nlb->header_);
     ds_truncate(ctx->undnat_match, undnat_match_len);
+    ds_destroy(&gw_redir_match);
+    ds_destroy(&gw_redir_action);
 }
 
 static void
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 540fe03bd..e4f971f4b 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4578,6 +4578,22 @@  icmp6 {
     </p>
 
     <ul>
+      <li>
+        For all the configured load balancing rules that include an IPv4
+        address <var>VIP</var>, and a list of IPv4 backend addresses
+        <var>B0</var>, <var>B1</var> .. <var>Bn</var> defined for the
+        <var>VIP</var>, a priority-200 flow is added that matches <code>
+        ip4 &amp;&amp; (ip4.src == <var>B0</var> || ip4.src == <var>B1</var>
+        || ... || ip4.src == <var>Bn</var>)</code> with an action <code>
+        outport = <var>CR</var>; next;</code> where <var>CR</var> is the
+        <code>chassisredirect</code> port representing the instance of the
+        logical router distributed gateway port on the gateway chassis.
+        If the backend IPv4 address <var>Bx</var> is also configured with
+        L4 port <var>PORT</var> of protocol <var>P</var>, then the match
+        also includes <code>P.src</code> == <var>PORT</var>.
+        Similar flows are added for IPv6.
+      </li>
+
       <li>
         For each NAT rule in the OVN Northbound database that can
         be handled in a distributed manner, a priority-100 logical
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index e3669bdf5..aed6a9ea6 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9484,3 +9484,53 @@  AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep priority=110 | gre
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check fip and lb flows])
+AT_KEYWORDS([fip-lb-flows])
+ovn_start
+
+check ovn-nbctl lr-add R1
+check ovn-nbctl lrp-add R1 R1-S0 02:ac:10:01:00:01 10.0.0.1/24 1000::a/64
+check ovn-nbctl lrp-add R1 R1-PUB 02:ac:20:01:01:01 172.16.0.1/24 3000::a/64
+check ovn-nbctl lrp-set-gateway-chassis R1-PUB hv1 20
+
+check ovn-nbctl ls-add S0
+check ovn-nbctl lsp-add S0 S0-R1
+check ovn-nbctl lsp-set-type S0-R1 router
+check ovn-nbctl lsp-set-addresses S0-R1 02:ac:10:01:00:01
+check ovn-nbctl lsp-set-options S0-R1 router-port=R1-S0
+check ovn-nbctl lsp-add S0 S0-P0
+check ovn-nbctl lsp-set-addresses S0-P0 "50:54:00:00:00:03 10.0.0.3 1000::3"
+
+check ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.0.110 10.0.0.3 S0-P0 30:54:00:00:00:03
+check ovn-nbctl lr-nat-add R1 dnat_and_snat 3000::c 1000::3 S0-P0 40:54:00:00:00:03
+
+ovn-sbctl dump-flows R1 > R1flows
+AT_CAPTURE_FILE([R1flows])
+
+AT_CHECK([grep "lr_in_gw_redirect" R1flows | sed s'/table=../table=??/' |sort], [0], [dnl
+  table=??(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
+  table=??(lr_in_gw_redirect  ), priority=100  , match=(ip4.src == 10.0.0.3 && outport == "R1-PUB" && is_chassis_resident("S0-P0")), action=(eth.src = 30:54:00:00:00:03; reg1 = 172.16.0.110; next;)
+  table=??(lr_in_gw_redirect  ), priority=100  , match=(ip6.src == 1000::3 && outport == "R1-PUB" && is_chassis_resident("S0-P0")), action=(eth.src = 40:54:00:00:00:03; xxreg1 = 3000::c; next;)
+  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport == "R1-PUB"), action=(outport = "cr-R1-PUB"; next;)
+])
+
+check ovn-nbctl lb-add lb_tcp4 172.16.0.100:50001 10.0.0.2:50001,10.0.0.3:50001,10.0.0.4:50001 tcp
+check ovn-nbctl lb-add lb_tcp6 [[1000::1]]:50001 [[1000::3]]:8080
+check ovn-nbctl --wait=sb lr-lb-add R1 lb_tcp4
+check ovn-nbctl --wait=sb lr-lb-add R1 lb_tcp6
+
+ovn-sbctl dump-flows R1 > R1flows
+AT_CAPTURE_FILE([R1flows])
+AT_CHECK([grep "lr_in_gw_redirect" R1flows |sed s'/table=../table=??/' |sort], [0], [dnl
+  table=??(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
+  table=??(lr_in_gw_redirect  ), priority=100  , match=(ip4.src == 10.0.0.3 && outport == "R1-PUB" && is_chassis_resident("S0-P0")), action=(eth.src = 30:54:00:00:00:03; reg1 = 172.16.0.110; next;)
+  table=??(lr_in_gw_redirect  ), priority=100  , match=(ip6.src == 1000::3 && outport == "R1-PUB" && is_chassis_resident("S0-P0")), action=(eth.src = 40:54:00:00:00:03; xxreg1 = 3000::c; next;)
+  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip4 && ((ip4.src == 10.0.0.2 && tcp.src == 50001) || (ip4.src == 10.0.0.3 && tcp.src == 50001) || (ip4.src == 10.0.0.4 && tcp.src == 50001)) && outport == "R1-PUB"), action=(outport = "cr-R1-PUB"; next;)
+  table=??(lr_in_gw_redirect  ), priority=200  , match=(ip6 && ((ip6.src == 1000::3 && tcp.src == 8080)) && outport == "R1-PUB"), action=(outport = "cr-R1-PUB"; next;)
+  table=??(lr_in_gw_redirect  ), priority=50   , match=(outport == "R1-PUB"), action=(outport = "cr-R1-PUB"; next;)
+])
+
+AT_CLEANUP
+])
diff --git a/tests/ovn.at b/tests/ovn.at
index 6f9fbbfd2..3fddd98db 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -35819,3 +35819,140 @@  OVS_WAIT_UNTIL([test $(as hv-1 ovs-vsctl list queue | grep -c 'burst="8000000000
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([DNAT_SNAT and LB traffic])
+AT_KEYWORDS([dnat-snat-lb])
+ovn_start
+
+test_ip_req_packet() {
+    local src_mac="505400000088"
+    local dst_mac="$1"
+    local src_ip=$(ip_to_hex 172 168 0 200)
+    local dst_ip="$2"
+    local cksum="$3"
+
+    local packet=${dst_mac}${src_mac}080045000072000000004011${cksum}
+    local packet_l3=${src_ip}${dst_ip}0035111100080000
+
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}0000000000000000000000000000
+    packet=${packet}${packet_l3}
+
+    as hv1 reset_pcap_file hv1-vif1 hv1/vif1
+    as hv2 reset_pcap_file hv2-vif1 hv2/vif1
+    check as hv2 ovs-appctl netdev-dummy/receive hv2-vif1 $packet
+}
+
+test_ip_rep_packet() {
+    local src_mac="505400000001"
+    local dst_mac="00000000ff01"
+    local src_ip=$(ip_to_hex 10 0 0 2)
+    local dst_ip=$(ip_to_hex 172 168 0 200)
+
+    local packet=${dst_mac}${src_mac}080045000072000000004011c309
+    local packet_l3=${src_ip}${dst_ip}1111003500080000
+
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}000000000000000000000000000000000000
+    packet_l3=${packet_l3}0000000000000000000000000000
+    packet=${packet}${packet_l3}
+
+    check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+}
+
+test_arp_reply() {
+    local dst_mac=f00000010203
+    local spa=$(ip_to_hex 172 168 0 200)
+    local tpa=$(ip_to_hex 172 168 0 110)
+
+    local packet=${dst_mac}50540000008808060001080006040002505400000088${spa}${dst_mac}${tpa}
+
+    as hv1 ovs-appctl netdev-dummy/receive br-phys_n $packet
+}
+
+net_add n
+
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n br-phys 192.168.0.1
+check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+check ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=sw0-port1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap
+
+sim_add hv2
+as hv2
+check ovs-vsctl add-br br-phys
+ovn_attach n br-phys 192.168.0.2
+check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+check ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=public-port1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl lsp-add sw0 sw0-port1
+check ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 10.0.0.2"
+
+check ovn-nbctl ls-add public
+check ovn-nbctl lsp-add public ln-public
+check ovn-nbctl lsp-set-type ln-public localnet
+check ovn-nbctl lsp-set-addresses ln-public unknown
+check ovn-nbctl lsp-set-options ln-public network_name=phys
+check ovn-nbctl lsp-add public public-port1
+check ovn-nbctl lsp-set-addresses public-port1 "50:54:00:00:00:88 172.168.0.200"
+
+check ovn-nbctl lr-add lr0
+check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
+check ovn-nbctl lsp-add sw0 sw0-lr0
+check ovn-nbctl lsp-set-type sw0-lr0 router
+check ovn-nbctl lsp-set-addresses sw0-lr0 router
+check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
+check ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.1/24
+check ovn-nbctl lsp-add public public-lr0
+check ovn-nbctl lsp-set-type public-lr0 router
+check ovn-nbctl lsp-set-addresses public-lr0 router
+check ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
+
+check ovn-nbctl lrp-set-gateway-chassis lr0-public hv2 20
+
+check ovn-nbctl lr-nat-add lr0 snat 172.168.0.1 10.0.0.0/24
+check ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.168.0.110 10.0.0.2 sw0-port1 f0:00:00:01:02:03
+
+wait_for_ports_up
+OVN_POPULATE_ARP
+
+# send UDP request to the FIP associated with sw0-port1
+test_ip_req_packet "f00000010203" $(ip_to_hex 172 168 0 110) "1ff5"
+OVS_WAIT_UNTIL([test $($PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | wc -l) -ge 1])
+# send UDP reply from sw0-port1
+test_ip_rep_packet
+# generate arp reply
+test_arp_reply
+# packet sent by the FIP
+echo "505400000088f00000010203080045000072000000003f1120f5aca8006eaca800c811110035000800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" > expected
+
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+check ovn-nbctl lb-add lb0 172.168.0.1:4369 10.0.0.2:4369 udp
+check ovn-nbctl lr-lb-add lr0 lb0
+
+# send UDP request to the load-balancer VIP
+test_ip_req_packet "000020201213" $(ip_to_hex 172 168 0 1) "2062"
+OVS_WAIT_UNTIL([test $($PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | wc -l) -ge 1])
+# send UDP reply from sw0-port1
+test_ip_rep_packet
+# packet sent by the load balancer VIP
+echo "505400000088000020201213080045000072000000003f112162aca80001aca800c811110035000800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" > expected
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+AT_CLEANUP
+])