[ovs-dev,v4,0/8] northd: I-P for load balancer and lb groups

Message ID 20230802062103.3638403-1-numans@ovn.org

Numan Siddique Aug. 2, 2023, 6:21 a.m. UTC
From: Numan Siddique <numans@ovn.org>

This patch series adds support for handling load balancer and
load balancer group changes incrementally in the "northd" engine
node.  The "flow" engine node doesn't support incremental processing
(I-P) yet and falls back to a full recompute.  Changes to the load
balancer and load balancer group columns of logical switches and
logical routers are also handled incrementally, provided those are the
only changes to those rows.
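
The rule described above (handle a change incrementally only when the
load balancer columns are the only thing that changed, otherwise fall
back to a full recompute) can be illustrated with a small toy model.
This is only a sketch and is not OVN code: the names ToyNorthd,
RowChange and LB_COLUMNS are hypothetical, and the real engine nodes
are implemented in C with different data structures.

```python
from dataclasses import dataclass, field

# Assumption: only these two columns are eligible for incremental handling.
LB_COLUMNS = frozenset({"load_balancer", "load_balancer_group"})


@dataclass
class RowChange:
    """A tracked update to one logical switch or router row (hypothetical)."""
    datapath: str
    new_values: dict  # changed column name -> new list of references


@dataclass
class ToyNorthd:
    """Toy stand-in for the state kept by an engine node; not OVN code."""
    refs: dict = field(default_factory=dict)  # datapath -> {column: set}
    recomputes: int = 0

    def recompute(self, rows):
        """Fallback path: rebuild all state from the full set of rows."""
        self.recomputes += 1
        self.refs = {
            name: {col: set(row.get(col, ())) for col in LB_COLUMNS}
            for name, row in rows.items()
        }

    def handle(self, changes, rows):
        """Apply changes incrementally if possible; return True if so."""
        only_lb = all(set(c.new_values) <= LB_COLUMNS for c in changes)
        known = all(c.datapath in self.refs for c in changes)
        if not (only_lb and known):
            self.recompute(rows)
            return False
        for c in changes:
            for col, values in c.new_values.items():
                self.refs[c.datapath][col] = set(values)
        return True


# 'rows' plays the role of the database contents after each change.
rows = {"sw0": {"load_balancer": ["lb1"], "ports": ["p1"]}}
northd = ToyNorthd()
northd.recompute(rows)

# Only the load balancer column changed: handled incrementally.
rows["sw0"]["load_balancer"] = ["lb1", "lb2"]
incremental = northd.handle(
    [RowChange("sw0", {"load_balancer": ["lb1", "lb2"]})], rows)

# Another column changed: the toy model falls back to a full recompute.
rows["sw0"]["ports"] = ["p1", "p2"]
fallback = northd.handle([RowChange("sw0", {"ports": ["p1", "p2"]})], rows)

print(incremental, fallback, northd.recomputes)
```

The handler returns False when it could not process the changes
incrementally, which is the point at which the toy model rebuilds its
whole state.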

Below are the scale test results obtained with ovn-heater, running
the scenario ocp-500-density-heavy.yml [1].

With these patches applied (with load balancer I-P handling in the
northd engine node) the results are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
                        Min (s)         Median (s)      90%ile (s)      99%ile (s)      Max (s)         Mean (s)        Total (s)       Count   Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total         0.132929        2.157103        3.314847        3.331561        4.378626        1.581889        197.736147      125     0
Namespace.add_ports     0.005217        0.005760        0.006565        0.013348        0.021014        0.006106        0.763214        125     0
WorkerNode.bind_port    0.035205        0.045458        0.052278        0.059804        0.063941        0.045652        11.413122       250     0
WorkerNode.ping_port    0.005075        0.006814        3.088548        3.192577        4.242026        0.726453        181.613284      250     0
-------------------------------------------------------------------------------------------------------------------------------------------------------

The results with the current main branch are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
                        Min (s)         Median (s)      90%ile (s)      99%ile (s)      Max (s)         Mean (s)        Total (s)       Count   Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total         4.377260        6.486962        7.502040        8.322587        8.334701        6.559002        819.875306      125     0
Namespace.add_ports     0.005112        0.005484        0.005953        0.009153        0.011452        0.005662        0.707752        125     0
WorkerNode.bind_port    0.035360        0.042732        0.049152        0.053698        0.056635        0.043215        10.803700       250     0
WorkerNode.ping_port    0.005338        1.599904        7.229649        7.798039        8.206537        3.209860        802.464911      250     0
-------------------------------------------------------------------------------------------------------------------------------------------------------

A few observations:

 - The total time taken to complete the density heavy tests (excluding
   the base cluster bringup) has come down significantly, from about
   820 seconds to about 198 seconds.
 - The 99%ile iteration time with these patches is 3.3 seconds,
   compared to 8.3 seconds on main.
 - The 90%ile iteration time with these patches is 3.3 seconds,
   compared to 7.5 seconds on main.
 - CPU utilization of northd during the test with these patches
   is between 100% and 300%, which is almost the same as on main.
   The main difference is that the test finishes sooner with these
   patches, so the overall CPU time consumed is lower.
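
The ratios behind these observations can be recomputed from the
"Iteration Total" rows of the two tables.  The snippet below is only a
convenience sketch: the numbers are copied from the tables, and the
dictionary and function names are made up for illustration.

```python
# "Iteration Total" values in seconds, copied from the two tables.
with_patches = {
    "median": 2.157103,
    "p90": 3.314847,
    "p99": 3.331561,
    "max": 4.378626,
    "total": 197.736147,
}
main = {
    "median": 6.486962,
    "p90": 7.502040,
    "p99": 8.322587,
    "max": 8.334701,
    "total": 819.875306,
}


def speedup(metric):
    """Return how many times faster the patched run is for one metric."""
    return main[metric] / with_patches[metric]


for metric in with_patches:
    print(f"{metric:>6}: {main[metric]:10.3f}s -> "
          f"{with_patches[metric]:10.3f}s  ({speedup(metric):.2f}x)")
```

For the total time this works out to a little over a fourfold
reduction.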

[1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml


v3 -> v4
--------
  * Covered more test scenarios.
  * Found and fixed a few issues.  v3 was not handling the scenario of
    a vip getting added to or removed from a load balancer.

v2 -> v3
--------
  * v2 was very inefficient in handling load balancer group changes
    and in associating the load balancers of an lb group with the
    datapaths.  This was the main reason for the regression in the
    full recompute time.
    v3 addresses this by handling lb group changes incrementally in a
    more efficient way.

Numan Siddique (8):
  northd I-P: Sync SB load balancers in a separate engine node.
  northd: Add a new engine node - lb_data.
  northd: Add initial I-P for load balancer and load balancer groups
  northd: Refactor the 'northd' node code which handles logical switch
    changes.
  northd: Handle load balancer changes for a logical switch.
  northd: Handle load balancer group changes for a logical switch.
  northd: Sync SB Port bindings NAT column in a separate engine node.
  northd: Handle load balancer/group changes for a logical router.

 lib/lb.c                 |  318 ++++++--
 lib/lb.h                 |  102 ++-
 northd/automake.mk       |    2 +
 northd/en-lb-data.c      |  800 ++++++++++++++++++
 northd/en-lb-data.h      |  109 +++
 northd/en-lflow.c        |    9 +-
 northd/en-northd.c       |  115 ++-
 northd/en-northd.h       |    3 +
 northd/en-sync-sb.c      |   74 ++
 northd/en-sync-sb.h      |   10 +
 northd/inc-proc-northd.c |   32 +-
 northd/northd.c          | 1667 +++++++++++++++++++++++++-------------
 northd/northd.h          |   39 +-
 tests/ovn-northd.at      |  512 ++++++++++++
 14 files changed, 3142 insertions(+), 650 deletions(-)
 create mode 100644 northd/en-lb-data.c
 create mode 100644 northd/en-lb-data.h