[ovs-dev,v2,00/18] northd lflow incremental processing

Message ID 20231026181224.3366779-1-numans@ovn.org
Series northd lflow incremental processing

Message

Numan Siddique Oct. 26, 2023, 6:12 p.m. UTC
From: Numan Siddique <numans@ovn.org>

This patch series adds incremental processing in the lflow engine
node to handle changes to northd and other engine nodes.
Changes related to load balancers and NAT are mainly handled in
this patch series.

This patch series can also be found here - https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1

Prior to this patch series, most of the changes to northd engine
resulted in full recomputation of logical flows.  This series
aims to improve the performance of ovn-northd by adding the I-P
support.  In order to add this support, some of the northd engine
node data (from struct ovn_datapath) is split and moved over to
new engine nodes - mainly related to load balancers, NAT and ACLs.
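
As a rough illustration of the resulting wiring (the handler names here
are illustrative, not necessarily the ones used in the patches; the
real graph is built in northd/inc-proc-northd.c), each new node
computes its slice of the old ovn_datapath data and the lflow node
consumes it through a change handler, so an input delta no longer
forces a full recompute:

    /* Hedged sketch using the inc-proc-eng API; not the exact code. */
    engine_add_input(&en_lr_nat, &en_northd, lr_nat_northd_handler);
    engine_add_input(&en_lflow, &en_lr_nat, lflow_lr_nat_handler);
    engine_add_input(&en_lflow, &en_ls_lbacls, lflow_ls_lbacls_handler);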

Below are the scale testing results done with these patches applied
using ovn-heater.  The test ran the scenario
ocp-500-density-heavy.yml [1].

With all the lflow I-P patches applied, the results are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
                        Min (s)         Median (s)      90%ile (s)      99%ile (s)      Max (s)         Mean (s)        Total (s)       Count   Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total         0.136883        1.129016        1.192001        1.204167        1.212728        0.665017        83.127099       125     0
Namespace.add_ports     0.005216        0.005736        0.007034        0.015486        0.018978        0.006211        0.776373        125     0
WorkerNode.bind_port    0.035030        0.046082        0.052469        0.058293        0.060311        0.045973        11.493259       250     0
WorkerNode.ping_port    0.005057        0.006727        1.047692        1.069253        1.071336        0.266896        66.724094       250     0
-------------------------------------------------------------------------------------------------------------------------------------------------------

The results with the current main branch are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
                        Min (s)         Median (s)      90%ile (s)      99%ile (s)      Max (s)         Mean (s)        Total (s)       Count   Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total         0.135491        2.223805        3.311270        3.339078        3.345346        1.729172        216.146495      125     0
Namespace.add_ports     0.005380        0.005744        0.006819        0.018773        0.020800        0.006292        0.786532        125     0
WorkerNode.bind_port    0.034179        0.046055        0.053488        0.058801        0.071043        0.046117        11.529311       250     0
WorkerNode.ping_port    0.004956        0.006952        3.086952        3.191743        3.192807        0.791544        197.886026      250     0
-------------------------------------------------------------------------------------------------------------------------------------------------------

[1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml

v1 -> v2
--------
   * Now also maintaining array indexes for the ls_lbacls, lr_nat and
     lr_lb_nat_data tables (similar to ovn_datapaths->array) to
     make the lookups efficient.  The same ovn_datapath->index
     is reused.

   * Made some significant changes to 'struct lflow_ref' in lflow-mgr.c.
     In v2 we don't use objdep_mgr to maintain the resource-to-lflow
     references.  Instead we maintain the 'struct lflow' pointer
     directly.  With this we don't need to maintain an additional
     hmap of lflows.
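
For illustration, a minimal sketch of the v2 idea (field names are
illustrative; the real definitions are in northd/lflow-mgr.c): each
lflow_ref directly lists the lflows it produced, so clearing a
resource's flows becomes a plain list walk instead of a lookup in a
separate hmap:

    /* Hedged sketch, not the actual lflow-mgr.c code. */
    #include "openvswitch/list.h"

    struct lflow;                        /* The referenced logical flow. */

    struct lflow_ref_node {
        struct ovs_list ref_node;        /* In lflow_ref's 'lflows_ref_list'. */
        struct lflow *lflow;             /* Direct pointer, no extra hmap. */
    };

    struct lflow_ref {
        struct ovs_list lflows_ref_list; /* List of 'struct lflow_ref_node'. */
    };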


Numan Siddique (18):
  northd: Refactor the northd change tracking.
  northd: Track ovn_datapaths in northd engine track data.
  tests: Add a couple of tests in ovn-northd for I-P.
  northd: Move router ports SB PB options sync to sync_to_sb_pb node.
  northd: Add a new engine 'lr-nat' to manage lr NAT data.
  northd: Add a new engine 'lr-lb-nat-data' to manage routers' lb data.
  northd:  Generate logical router's LB and NAT flows using
    lr_lbnat_data.
  northd: Don't commit dhcp response flows in the conntrack.
  northd: Add a new node ls_lbacls.
  northd: Refactor lflow management into a separate module.
  northd: Use lflow_ref when adding all logical flows.
  northd:  Move ovn_lb_datapaths from lib to northd module.
  northd: Handle lb changes in lflow engine.
  northd: Add lr_lb_nat_data handler for lflow engine node.
  northd: Add ls_lbacls handler for lflow engine node.
  northd:  Add I-P for NB_Global and SB_Global.
  northd: Add a noop handler for northd SB mac binding.
  northd: Add northd change handler for sync_to_sb_lb node.

 lib/lb.c                   |   96 -
 lib/lb.h                   |   57 -
 lib/ovn-util.c             |   17 +-
 lib/ovn-util.h             |    2 +-
 lib/stopwatch-names.h      |    3 +
 northd/aging.c             |   21 +-
 northd/automake.mk         |   12 +-
 northd/en-global-config.c  |  588 ++++
 northd/en-global-config.h  |   65 +
 northd/en-lflow.c          |  123 +-
 northd/en-lflow.h          |    8 +
 northd/en-lr-lb-nat-data.c |  685 ++++
 northd/en-lr-lb-nat-data.h |  125 +
 northd/en-lr-nat.c         |  498 +++
 northd/en-lr-nat.h         |  137 +
 northd/en-ls-lb-acls.c     |  530 +++
 northd/en-ls-lb-acls.h     |  111 +
 northd/en-northd.c         |   67 +-
 northd/en-northd.h         |    2 +-
 northd/en-port-group.h     |    3 +
 northd/en-sync-sb.c        |  513 ++-
 northd/inc-proc-northd.c   |   79 +-
 northd/lflow-mgr.c         | 1078 ++++++
 northd/lflow-mgr.h         |  192 ++
 northd/northd.c            | 6485 ++++++++++++++++--------------------
 northd/northd.h            |  551 ++-
 northd/ovn-northd.c        |    4 +
 tests/ovn-northd.at        |  858 ++++-
 28 files changed, 8827 insertions(+), 4083 deletions(-)
 create mode 100644 northd/en-global-config.c
 create mode 100644 northd/en-global-config.h
 create mode 100644 northd/en-lr-lb-nat-data.c
 create mode 100644 northd/en-lr-lb-nat-data.h
 create mode 100644 northd/en-lr-nat.c
 create mode 100644 northd/en-lr-nat.h
 create mode 100644 northd/en-ls-lb-acls.c
 create mode 100644 northd/en-ls-lb-acls.h
 create mode 100644 northd/lflow-mgr.c
 create mode 100644 northd/lflow-mgr.h

Comments

Han Zhou Nov. 15, 2023, 7:59 a.m. UTC | #1
On Thu, Oct 26, 2023 at 11:12 AM <numans@ovn.org> wrote:
> [...]

Thanks Numan for the great improvement!

I spent some time these days reviewing the series and am now at patch
10.  I need to take a break, so I have sent the comments I had so far.

I also did scale testing for northd with
https://github.com/hzhou8/ovn-test-script for a 500-chassis,
50-LSP-per-chassis setup, and noticed that the recompute latency has
increased 20% after the series. I think in general it is expected to
have some performance penalty for the recompute case because of the new
indexes created for I-P. However, the patch 10 "northd: Refactor lflow
management into a separate module." alone introduces a 10% latency
increase, which necessitates more investigation.

>
> Prior to this patch series, most of the changes to northd engine
> resulted in full recomputation of logical flows.  This series
> aims to improve the performance of ovn-northd by adding the I-P
> support.

I'd like to clarify "most of the changes" a little. I think we should focus
on the most impactful changes that happen in a cloud environment. I don't
think it is realistic to achieve "most of the changes" in I-P because it is
too complex (even with this series we still handle a very small part of the
DB schema incrementally), but it may be realistic to handle the most
impactful changes, which are the most frequent changes in the cloud,
incrementally. Before this series, we could handle regular VIF changes and
related address-set and (some) port-group updates incrementally. Those
are the most common changes related to pod/VM changes in the cloud. I believe
the next goal is for LB changes related to pod/VMs (i.e. the LB backends),
which I believe is the main goal of this series. Is my understanding
correct?

While reviewing the patches, I'd say sometimes I feel a little lost because
it is hard to correlate some of the changes with the above goal. I believe
there is a reason for all the changes but I am not sure if it is directly
related to the goal of LB backend related I-P. I understand that there are
other aspects of LB that can be incrementally processed. But if the
changes could be largely reduced to those necessary for this goal, it
would help the review, and we might be able to merge them sooner and add
more, but less impactful, I-P later step by step. I guess I will have a
clearer picture
when I finish the rest of the patches, but it would be helpful if you could
add more guidance in this cover letter.

Thanks,
Han

Numan Siddique Nov. 16, 2023, 3:32 a.m. UTC | #2
On Wed, Nov 15, 2023 at 2:59 AM Han Zhou <hzhou@ovn.org> wrote:
> [...]
> Thanks Numan for the great improvement!

Hi Han,

Thanks for the review comments.  I understand it is hard to review so
many patches, especially ones related to I-P.  I appreciate the time
spent on it and the feedback.

>
> I spent some time these days reviewing the series and am now at patch
> 10.  I need to take a break, so I have sent the comments I had so far.
>
> I also did scale testing for northd with
> https://github.com/hzhou8/ovn-test-script for a 500-chassis,
> 50-LSP-per-chassis setup, and noticed that the recompute latency has
> increased 20% after the series. I think in general it is expected to
> have some performance penalty for the recompute case because of the new
> indexes created for I-P. However, the patch 10 "northd: Refactor lflow
> management into a separate module." alone introduces a 10% latency
> increase, which necessitates more investigation.

Before sending out the series I did some testing of recomputes with a
large OVN NB DB and SB DB (from a 500-node ovn-heater density-heavy
run).  I'm aware of the increase in recompute time, and I did some more
testing today as well.

In my testing, I can see that the increase in latency is due to the new
engine nodes added (lr_lbnat mainly) and to the housekeeping of the
lflow references.  I do not see any increase due to the new lflow-mgr.c
added in patch 10.

I compared patch 9 and patch 10 separately and there is no difference
in lflow engine node recompute time between patch 9 and 10.

Below are the recompute times (in msec) for the mentioned engine nodes,
for some of the individual patches in the series.


The sample OVN DBs have:

------------------------------
Resource               Total
------------------------------
Logical switches       1001
Logical routers        501
Router ports           1501
Switch ports           29501
Load balancers         35253
Load balancer groups   1
SB logical flows       268349
SB DB groups           509
------------------------------

There is one load balancer group which has all the load balancers and
it is associated with all the logical switches and routers.

Below is the time taken by each engine node, in msec:

-----------------------------------------------------------
Engine nodes     | lb_data | northd | lr_lbnat | lflow |
-----------------------------------------------------------
ovn-northd-main  |   358   |  2455  |    x     |  2082 |
ovn-northd-p1    |   373   |  2476  |    x     |  2170 |
ovn-northd-p5    |   367   |  2413  |    x     |  2195 |
ovn-northd-p6    |   354   |   688  |   1815   |  2442 |
ovn-northd-p7    |   357   |   683  |   1778   |  2806 |
ovn-northd-p9    |   352   |   682  |   1781   |  2781 |
ovn-northd-p10   |   365   |   838  |   1707   |  2812 |
ovn-northd-p13   |   362   |  1066  |   1882   |  2917 |
ovn-northd-p15   |   359   |  1046  |   1688   |  2907 |
ovn-northd-p18   |   379   |  1066  |   1729   |  2886 |
-----------------------------------------------------------

ovn-northd-p1 means ovn-northd with patch 1 of this series,
ovn-northd-p2 with patch 2 of this series, and so on.

Patch 6 adds a new engine node, lr_lbnat, and that's why the northd
engine node time drops from ~2413 msec to ~688 msec.

With the first 10 patches, the overall northd engine time increases by
~200 msec compared to main, and the lflow engine node time increases by
around 800 msec.  The point of this data is to show that the overall
increase is mainly due to bookkeeping and the new engine nodes.  I
tried to optimise this further but couldn't.


IMO this 20% overall increase should be fine for the following reasons:
   -  The scale tests with ovn-heater show a significant decrease in
      ovn-northd CPU usage and in the overall test duration.
   -  This means very few recomputes were triggered.
   -  With the OCP/K8s kube-burner tests, there were 0 recomputes in
      the northd and lflow engine nodes.



>
> >
> > Prior to this patch series, most of the changes to northd engine
> > resulted in full recomputation of logical flows.  This series
> > aims to improve the performance of ovn-northd by adding the I-P
> > support.
>
> I'd like to clarify "most of the changes" a little. I think we should focus
> on the most impactful changes that happen in a cloud environment. I don't
> think it is realistic to achieve "most of the changes" in I-P because it is
> too complex (even with this series we still handle a very small part of the
> DB schema incrementally), but it may be realistic to handle the most
> impactful changes, which are the most frequent changes in the cloud,
> incrementally. Before this series, we could handle regular VIF changes and
> related address-set and (some) port-group updates incrementally. Those
> are the most common changes related to pod/VM changes in the cloud. I believe
> the next goal is for LB changes related to pod/VMs (i.e. the LB backends),
> which I believe is the main goal of this series. Is my understanding
> correct?
>
That is correct.  LB change handling is very important in my opinion
when OVN is used with ovn-kubernetes, because all the k8s services are
mapped to OVN LBs.


> While reviewing the patches, I'd say sometimes I feel a little lost because
> it is hard to correlate some of the changes with the above goal.

I understand that.  Maybe I've done a bad job of conveying why the
changes are required for LBs.

Let me try to clarify a bit here.  I'll incorporate and add more
details in the patch commits and cover letter in the next series.

This series mainly targets handling lflow generation for load balancer
changes.  As you know, we use conntrack in OVN for
   - Load balancers (both in logical switches and routers)
   - ACLs in logical switches
   - NAT in logical routers.

And this makes it complicated to decouple logical flow generation for
LBs from ACLs and NATs.  That is the reason I had to split the northd
node data related to load balancers, ACLs and NATs into separate nodes.

To give an example, we generate certain logical flows for a logical
switch by checking "if (od->has_lb_vip && od->has_stateful_acl)" (see
build_acl_hints()).  For logical routers, a NAT address can also be an
LB VIP, which makes it difficult to decouple LB and NAT.  We could fall
back to a recompute for NAT changes, but we would still need to group
the logical flows related to LB and NAT, and that's why I added the
lr_lbnat_data engine node in patch 6.
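
To sketch that coupling (simplified; the real checks live in
build_acl_hints() and related functions in northd.c):

    /* Hedged sketch.  The conntrack "hint" flows depend on both the LB
     * and the ACL state of the datapath, so an LB change can invalidate
     * flows that conceptually belong to the ACL logic - one reason this
     * state is grouped into a dedicated engine node. */
    if (od->has_lb_vip && od->has_stateful_acl) {
        /* ... add the conntrack hint logical flows ... */
    }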

> I believe
> there is a reason for all the changes but I am not sure if it is directly
> related to the goal of LB backend related I-P. I understand that there are
> other aspects of LB that can be incrementally processed. But if the
> changes could be largely reduced to those necessary for this goal, it
> would help the review, and we might be able to merge them sooner and add
> more, but less impactful, I-P later step by step. I guess I will have a
> clearer picture when I finish the rest of the patches, but it would be
> helpful if you could add more guidance in this cover letter.

I'll definitely add more details in the cover letter.

Apart from the LB-related I-P, this series adds only two additional
I-P patches: one for NB_Global changes and the other for SB MAC binding
changes.  ovn-kubernetes periodically writes a timestamp into the
NB_Global.options column for its internal health checks, and this
results in unnecessary recomputes.  After running the kube-burner
density-cni tests, I identified the most common changes made to the NB
DB and added the I-P for those.  With all these patches applied, there
are no recomputes in either the northd or the lflow engine node during
the test duration.  Not having the I-P for NB_Global, for example, was
resulting in significant ovn-northd CPU usage.
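
The shape of that handler is roughly the following
(only_ignored_options_changed() is a hypothetical helper; the real
version is in the NB_Global and SB_Global I-P patch):

    /* Hedged sketch of an NB_Global change handler.  Returning true
     * tells the I-P engine the change was handled without a recompute. */
    static bool
    global_config_nb_global_handler(struct engine_node *node, void *data)
    {
        (void) data;

        /* If only ignorable NB_Global.options keys changed (e.g. the
         * timestamp ovn-kubernetes writes for its health checks), nothing
         * derived from this input needs to change. */
        if (only_ignored_options_changed(node)) {
            return true;
        }
        return false;   /* Fall back to a full recompute of this node. */
    }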

Thanks again for the reviews.

Numan

Han Zhou Nov. 16, 2023, 7:53 p.m. UTC | #3
On Wed, Nov 15, 2023 at 7:32 PM Numan Siddique <numans@ovn.org> wrote:
> [...]
>
> Before sending out the series I did some testing of recomputes with a
> large OVN NB DB and SB DB (from a 500-node ovn-heater density-heavy
> run).  I'm aware of the increase in recompute time, and I did some more
> testing today as well.
>
> In my testing, I can see that the increase in latency is due to the new
> engine nodes added (lr_lbnat mainly) and to the housekeeping of the
> lflow references.  I do not see any increase due to the new lflow-mgr.c
> added in patch 10.
>
> I compared patch 9 and patch 10 separately and there is no difference
> in lflow engine node recompute time between patch 9 and 10.
>

My results were different. My test profile simulates the ovn-k8s
topology (central mode, not IC) with 500 nodes and 50 LSPs/node, with
no LBs and a small number of ACLs and PGs:
https://github.com/hzhou8/ovn-test-script/blob/main/args.500ch_50lsp_10000pg

The test results for ovn-northd recompute time are:
main: 1118.3
p9: 1129.5
p10: 1243.4 ==> 10% more than p9
p18: 1357.6

I am not sure if the p10 increase is related to the hash change or
something else.

> [...]
> > While reviewing the patches, I'd say sometimes I feel a little lost
> > because it is hard to correlate some of the changes with the above goal.
>
> I understand that.  Maybe I've done a bad job of conveying why the
> changes are required for LBs.
>
> Let me try to clarify a bit here.  I'll incorporate and add more
> details in the patch commits and cover letter in the next series.
>
> This series mainly targets handling lflow generation for load balancer
> changes.  As you know, we use conntrack in OVN for
>    - Load balancers (both in logical switches and routers)
>    - ACLs in logical switches
>    - NAT in logical routers.
>
> And this makes it complicated to decouple logical flow generation for
> LBs from ACLs and NATs.  That is the reason I had to split the northd
> node data related to load balancers, ACLs and NATs into separate nodes.
>
> To give an example, we generate certain logical flows for a logical
> switch by checking "if (od->has_lb_vip && od->has_stateful_acl)" (see
> build_acl_hints()).  For logical routers, a NAT address can also be an
> LB VIP, which makes it difficult to decouple LB and NAT.  We could fall
> back to a recompute for NAT changes, but we would still need to group
> the logical flows related to LB and NAT, and that's why I added the
> lr_lbnat_data engine node in patch 6.
>

Thanks for explaining. However, I wonder if these are related to LB backend
updates. I agree that service creation/deletion may also be important to be
handled incrementally, but they are less critical (less frequent) than the
backend updates which are directly related to pod/vm creation/deletion. The
has_xxx and NAT related handling are more related to service
creation/deletion than to LB backend updates, right?

Thanks,
Han

Numan Siddique Nov. 16, 2023, 9:05 p.m. UTC | #4
On Thu, Nov 16, 2023 at 2:54 PM Han Zhou <hzhou@ovn.org> wrote:
>
> On Wed, Nov 15, 2023 at 7:32 PM Numan Siddique <numans@ovn.org> wrote:
> >
> > On Wed, Nov 15, 2023 at 2:59 AM Han Zhou <hzhou@ovn.org> wrote:
> > >
> > > On Thu, Oct 26, 2023 at 11:12 AM <numans@ovn.org> wrote:
> > > >
> > > > From: Numan Siddique <numans@ovn.org>
> > > >
> > > > This patch series adds incremental processing in the lflow engine
> > > > node to handle changes to northd and other engine nodes.
> > > > Changed related to load balancers and NAT are mainly handled in
> > > > this patch series.
> > > >
> > > > This patch series can also be found here -
> > > https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1
> > >
> > > Thanks Numan for the great improvement!
> >
> > Hi Han,
> >
> > Thanks for the review comments.  I understand it is hard to review
> > somany patches, especially related to I-P.
> > I appreciate the time spent on it and  the feedback.
> >
> > >
> > > I spent some time these days to review the series and now at patch 10. I
> > > need to take a break and so I just sent the comments I had so far.
> > >
> > > I also did scale testing for northd with
> > > https://github.com/hzhou8/ovn-test-script for a 500 chassis, 50 lsp /
> > > chassis setup, and noticed that the recompute latency has increased 20%
> > > after the series. I think in general it is expected to have some
> > > performance penalty for the recompute case because of the new indexes
> > > created for I-P. However, the patch 10 "northd: Refactor lflow
> management
> > > into a separate module." alone introduces a 10% latency increase, which
> > > necessitates more investigation.
> >
> > Before sending out the series I  did some testing on recomputes with a
> > large OVN NB DB and SB DB
> > (from a 500 node ovn-heater density heavy run).
> > I'm aware of the increase in recomputes.  And I did some more testing
> > today as well.
> >
> > In my testing,  I can see that the increase in latency is due to the
> > new engine nodes added (lr_lbnat mainly)
> > and due to housekeeping of the lflow references.  I do not see any
> > increase due to the new lflow-mgr.c added in patch 10.
> >
> > I compared patch 9 and patch 10 separately and there is no difference
> > in lflow engine node recompute time between patch 9 and 10.
> >
>
> My results were different. My test profile simulates the ovn-k8s topology
> (central mode, not IC) with 500 nodes, 50 LSP/node, with no LB, small
> amount of ACLs and PGs.
> (
> https://github.com/hzhou8/ovn-test-script/blob/main/args.500ch_50lsp_10000pg
> )
>
> The test results for ovn-northd recompute time are:
> main: 1118.3
> p9: 1129.5
> p10: 1243.4 ==> 10% more than p9
> p18: 1357.6
>
> I am not sure if the p10 increase is related to the hash change or
> something else.
>
> > Below are the results with the time taken for the mentioned engine
> > nodes in msecs for a recompute for some of the individual patches in
> > the series.
> >
> >
> > The sample OVN DBs have
> >
> > --------------------------------
> > Resource              Total
> > -------------------------------
> > Logical switches    1001
> > ----------------------------
> > Logical routers      501
> > ----------------------------
> > Router ports         1501
> > ----------------------------
> > Switch ports         29501
> > ----------------------------
> > Load balancers    35253
> > ----------------------------
> > Load balancer group 1
> > ----------------------------
> > SB Logical flows    268349
> > ----------------------------
> > SB DB groups          509
> > ----------------------------
> >
> > There is one load balancer group which has all the load balancers and
> > it is associated with all the logical switches and routers.
> >
> > Below is the time taken for each engine node in msec
> >
> > ---------------------------------------------------------------------
> > Engine nodes     | lb_data | northd  | lr_lbnat  | lflow  |
> > ---------------------------------------------------------------------
> > ovn-northd-main  | 358      | 2455    | x         | 2082     |
> > ---------------------------------------------------------------------
> > ovn-northd-p1    | 373       | 2476    | x         | 2170       |
> > ---------------------------------------------------------------------
> > ovn-northd-p5    | 367       | 2413    | x         | 2195       |
> > ---------------------------------------------------------------------
> > ovn-northd-p6    | 354       | 688     | 1815      | 2442    |
> > ---------------------------------------------------------------------
> > ovn-northd-p7    | 357       | 683     | 1778      | 2806    |
> > ---------------------------------------------------------------------
> > ovn-northd-p9    | 352       | 682     | 1781      | 2781    |
> > ---------------------------------------------------------------------
> > ovn-northd-p10   | 365      | 838     | 1707      | 2812    |
> > ---------------------------------------------------------------------
> > ovn-northd-p13   | 362      | 1066    | 1882      | 2917    |
> > ---------------------------------------------------------------------
> > ovn-northd-p15   | 359      | 1046    | 1688      | 2907    |
> > ---------------------------------------------------------------------
> > ovn-northd-p18   | 379      | 1066    | 1729      | 2886    |
> > ---------------------------------------------------------------------
> >
> > ovn-northd-p1 means ovn-northd with the patch 1 of this series,
> > ovn-northd-p2 - patch 2 of this series and so on.
> >
> > Patch 6 adds a new engine node lr_lbnat and that's why the northd
> > engine node time gets reduced from ~2413 msec to 688.
> >
> > With the first 10 patches,  northd engine over all time increases to
> > 200msec compared to "main '' and lflow engine node time increases to
> > around 800 msec.
> > The point of this data is to show that the overall increase is mainly
> > due to bookkeeping and new engine nodes.  I tried to optimise but I
> > couldn't.
> >
> >
> > IIMO this 20% overall increase should be fine for the following reasons
> >    -  The scale tests with ovn-heater show significant decrease in
> > ovn-northd CPU usage and the decrease in the overall test duration.
> >    -  This means there were very few recomputes triggered.
> >    -  With the OCP/K8s  kube-burner  tests,  there were 0 recomputes
> > in the northd and lflow engine nodes.
> >
> >
> >
> > >
> > > >
> > > > Prior to this patch series, most of the changes to northd engine
> > > > resulted in full recomputation of logical flows.  This series
> > > > aims to improve the performance of ovn-northd by adding the I-P
> > > > support.
> > >
> > > I'd like to clarify "most of the changes" a little. I think we
> > > should focus on the most impactful changes that happen in a cloud
> > > environment. I don't think it is realistic to achieve "most of the
> > > changes" in I-P because it is too complex (even with this series we
> > > still handle a very small part of the DB schema incrementally), but
> > > it may be realistic to handle the most impactful changes, which are
> > > the most frequent changes in the cloud, incrementally. Before this
> > > series, we could handle regular VIF changes and related
> > > address-sets, port-groups (some of) updates incrementally. Those
> > > are the most common changes related to pod/VM changes in cloud. I
> > > believe the next goal is for LB changes related to pod/VMs (i.e.
> > > the LB backends), which I believe is the main goal of this series.
> > > Is my understanding correct?
> > >
> > That is correct.  LB change handling is very important in my opinion
> > when used with ovn-kubernetes because all the k8s services
> > are mapped to OVN LBs.
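> >
> > For example, a Service with ClusterIP 10.96.0.10 exposing port 80 on
> > two pod backends conceptually becomes an NB Load_Balancer row along
> > these lines (addresses invented for illustration):
> >
> >     vips: {"10.96.0.10:80"="10.244.0.5:8080,10.244.1.7:8080"}
> >
> > Every pod add/delete then updates only the backends of that LB.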
> >
> >
> > > While reviewing the patches, I'd say sometimes I feel a little lost
> > > because it is hard to correlate some of the changes with the above
> > > goal.
> >
> > I understand that.  Maybe I've done a bad job in conveying why the
> > changes are required for LBs.
> >
> > Let me try to clarify a bit here.  I'll incorporate and add more
> > details in the patch commits and cover letter in the next series.
> >
> > This series mainly targets handling lflow generation for load
> > balancer changes.
> > As you know we use conntrack in OVN for
> >    - Load balancers (both in logical switch and router)
> >    - ACLs in logical switch
> >    - NAT in logical routers.
> >
> > And this makes it complicated to decouple logical flow generation for
> > LBs from ACLs and NATs.  That is the reason I had to split northd node
> > data related to load balancers, ACLs and NATs into separate nodes.
> >
> > To give an example, we generate certain logical flows for a logical
> > switch by checking "if (od->has_lb_vip && od->has_stateful_acl)" (see
> > build_acl_hints()).
> > For logical routers, a NAT can be a LB VIP too, which makes it
> > difficult to decouple LB and NAT.  We could fall back to recompute
> > for NAT changes, but we would still need to group logical flows
> > related to LB and NAT.  That's why I added the lr_lbnat_data engine
> > node in patch 6.
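> >
> > As a self-contained sketch of that coupling (the od fields and the
> > function name above are the real ones; everything else here is
> > invented for illustration):
> >
> >     #include <stdbool.h>
> >     #include <stdio.h>
> >
> >     /* Simplified stand-in for northd's per-datapath state. */
> >     struct ovn_datapath {
> >         bool has_lb_vip;       /* Derived from the LB config. */
> >         bool has_stateful_acl; /* Derived from the ACL config. */
> >     };
> >
> >     /* The LB- and ACL-derived bits jointly decide whether the ct
> >      * hint flows are emitted, so a change to either one invalidates
> >      * the same logical flows. */
> >     static void
> >     build_acl_hints(const struct ovn_datapath *od)
> >     {
> >         if (od->has_lb_vip && od->has_stateful_acl) {
> >             printf("emit conntrack hint flows\n");
> >         }
> >     }
> >
> >     int
> >     main(void)
> >     {
> >         struct ovn_datapath od = { .has_lb_vip = true,
> >                                    .has_stateful_acl = true };
> >         build_acl_hints(&od);
> >         return 0;
> >     }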
> >
>
> Thanks for explaining. However, I wonder if these are related to LB backend
> updates. I agree that service creation/deletion may also be important to be
> handled incrementally, but they are less critical (less frequent) than the
> backend updates which are directly related to pod/vm creation/deletion.

From what I understand, at least for OCP use cases they seem to be critical.
Also, I suppose it depends on the use case.

> The has_xxx and NAT-related handling are more related to service
> creation/deletion than to LB backend updates, right?

I think so.

Numan

>
> Thanks,
> Han
>
> > > I believe there is a reason for all the changes but I am not sure
> > > if it is directly related to the goal of LB backend related I-P. I
> > > understand that there are other aspects of LB that can be
> > > incrementally processed. But I'd recommend if changes necessary for
> > > this goal can be largely reduced it would help the review and we
> > > might be able to merge them sooner and add more but less impactful
> > > I-P later step by step. I guess I will have a clearer picture when
> > > I finish the rest of the patches, but it would be helpful if you
> > > could add more guidance in this cover letter.
> >
> > I'll definitely add more details in the cover letter.
> >
> > Apart from the LB related I-P, this series adds only two additional
> > I-P patches.
> > One is for the NB_Global changes and the other for SB mac binding
> > changes.
> > ovn-kubernetes periodically writes to the NB_Global.options column
> > with a timestamp for its internal health checks,
> > and this results in unnecessary recomputes.  After running the
> > kube-burner density cni tests, I figured out the most common
> > changes done in the NB DB and added the I-P for that.  With all these
> > patches applied, there are no recomputes in either the northd or
> > lflow engine nodes during the test duration.  Not having the I-P for
> > NB_Global, for example, resulted in significant ovn-northd CPU usage.
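> >
> > As a minimal sketch of that handler's core check (the option key
> > "hc_timestamp" and the helper name are hypothetical; the smap
> > helpers are the real ones from the OVS lib):
> >
> >     #include "smap.h"   /* From the OVS tree. */
> >
> >     /* Returns true if an NB_Global options change only touched the
> >      * periodically rewritten health-check timestamp; such a change
> >      * needs no recompute.  "hc_timestamp" is a hypothetical key. */
> >     static bool
> >     nb_global_options_change_is_noop(const struct smap *old_opts,
> >                                      const struct smap *new_opts)
> >     {
> >         struct smap a = SMAP_INITIALIZER(&a);
> >         struct smap b = SMAP_INITIALIZER(&b);
> >         bool noop;
> >
> >         smap_clone(&a, old_opts);
> >         smap_clone(&b, new_opts);
> >         smap_remove(&a, "hc_timestamp");
> >         smap_remove(&b, "hc_timestamp");
> >
> >         noop = smap_equal(&a, &b);
> >         smap_destroy(&a);
> >         smap_destroy(&b);
> >         return noop;
> >     }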
> >
> > Thanks again for the reviews.
> >
> > Numan
> >
> > >
> > > Thanks,
> > > Han
> > >
> > > >  In order to add this support, some of the northd engine
> > > > node data (from struct ovn_datapath) is split and moved over to
> > > > new engine nodes - mainly related to load balancers, NAT and ACLs.
> > > >
> > > > [snip]
> > > >
> > > > v1 -> v2
> > > > --------
> > > >    * Now also maintaining array indexes for ls_lbacls, lr_nat and
> > > >      lr_lb_nat_data tables (similar to ovn_datapaths->array) to
> > > >      make the lookup efficient.  The same ovn_datapath->index
> > > >      is reused.
> > > >
> > > >    * Made some significant changes to 'struct lflow_ref' in
> > > >      lflow-mgr.c.
> > > >      In v2 we don't use objdep_mgr to maintain the resource to lflow
> > > >      references.  Instead we maintain the 'struct lflow' pointer.
> > > >      With this we don't need to maintain additional hmap of lflows.
> > > >
> > > >
> > > > Numan Siddique (18):
> > > >   northd: Refactor the northd change tracking.
> > > >   northd: Track ovn_datapaths in northd engine track data.
> > > >   tests: Add a couple of tests in ovn-northd for I-P.
> > > >   northd: Move router ports SB PB options sync to sync_to_sb_pb node.
> > > >   northd: Add a new engine 'lr-nat' to manage lr NAT data.
> > > >   northd: Add a new engine 'lr-lb-nat-data' to manage routers' lb
> > > >     data.
> > > >   northd:  Generate logical router's LB and NAT flows using
> > > >     lr_lbnat_data.
> > > >   northd: Don't commit dhcp response flows in the conntrack.
> > > >   northd: Add a new node ls_lbacls.
> > > >   northd: Refactor lflow management into a separate module.
> > > >   northd: Use lflow_ref when adding all logical flows.
> > > >   northd:  Move ovn_lb_datapaths from lib to northd module.
> > > >   northd: Handle lb changes in lflow engine.
> > > >   northd: Add lr_lb_nat_data handler for lflow engine node.
> > > >   northd: Add ls_lbacls handler for lflow engine node.
> > > >   northd:  Add I-P for NB_Global and SB_Global.
> > > >   northd: Add a noop handler for northd SB mac binding.
> > > >   northd: Add northd change handler for sync_to_sb_lb node.
> > > >
> > > >  lib/lb.c                   |   96 -
> > > >  lib/lb.h                   |   57 -
> > > >  lib/ovn-util.c             |   17 +-
> > > >  lib/ovn-util.h             |    2 +-
> > > >  lib/stopwatch-names.h      |    3 +
> > > >  northd/aging.c             |   21 +-
> > > >  northd/automake.mk         |   12 +-
> > > >  northd/en-global-config.c  |  588 ++++
> > > >  northd/en-global-config.h  |   65 +
> > > >  northd/en-lflow.c          |  123 +-
> > > >  northd/en-lflow.h          |    8 +
> > > >  northd/en-lr-lb-nat-data.c |  685 ++++
> > > >  northd/en-lr-lb-nat-data.h |  125 +
> > > >  northd/en-lr-nat.c         |  498 +++
> > > >  northd/en-lr-nat.h         |  137 +
> > > >  northd/en-ls-lb-acls.c     |  530 +++
> > > >  northd/en-ls-lb-acls.h     |  111 +
> > > >  northd/en-northd.c         |   67 +-
> > > >  northd/en-northd.h         |    2 +-
> > > >  northd/en-port-group.h     |    3 +
> > > >  northd/en-sync-sb.c        |  513 ++-
> > > >  northd/inc-proc-northd.c   |   79 +-
> > > >  northd/lflow-mgr.c         | 1078 ++++++
> > > >  northd/lflow-mgr.h         |  192 ++
> > > >  northd/northd.c            | 6485 ++++++++++++++++--------------------
> > > >  northd/northd.h            |  551 ++-
> > > >  northd/ovn-northd.c        |    4 +
> > > >  tests/ovn-northd.at        |  858 ++++-
> > > >  28 files changed, 8827 insertions(+), 4083 deletions(-)
> > > >  create mode 100644 northd/en-global-config.c
> > > >  create mode 100644 northd/en-global-config.h
> > > >  create mode 100644 northd/en-lr-lb-nat-data.c
> > > >  create mode 100644 northd/en-lr-lb-nat-data.h
> > > >  create mode 100644 northd/en-lr-nat.c
> > > >  create mode 100644 northd/en-lr-nat.h
> > > >  create mode 100644 northd/en-ls-lb-acls.c
> > > >  create mode 100644 northd/en-ls-lb-acls.h
> > > >  create mode 100644 northd/lflow-mgr.c
> > > >  create mode 100644 northd/lflow-mgr.h
> > > >
Dumitru Ceara Nov. 24, 2023, 3:38 p.m. UTC | #5
On 11/16/23 22:05, Numan Siddique wrote:
> On Thu, Nov 16, 2023 at 2:54 PM Han Zhou <hzhou@ovn.org> wrote:
>>
>> On Wed, Nov 15, 2023 at 7:32 PM Numan Siddique <numans@ovn.org> wrote:
>>>
>>> On Wed, Nov 15, 2023 at 2:59 AM Han Zhou <hzhou@ovn.org> wrote:
>>>>
>>>> On Thu, Oct 26, 2023 at 11:12 AM <numans@ovn.org> wrote:
>>>>>
>>>>> From: Numan Siddique <numans@ovn.org>
>>>>>
>>>>> This patch series adds incremental processing in the lflow engine
>>>>> node to handle changes to northd and other engine nodes.
>>>>> Changed related to load balancers and NAT are mainly handled in
>>>>> this patch series.
>>>>>
>>>>> This patch series can also be found here -
>>>>> https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1
>>>>
>>>> Thanks Numan for the great improvement!
>>>
>>> Hi Han,
>>>
>>> Thanks for the review comments.  I understand it is hard to review
>>> so many patches, especially related to I-P.
>>> I appreciate the time spent on it and the feedback.
>>>
>>>>
>>>> I spent some time these days to review the series and now at patch 10. I
>>>> need to take a break and so I just sent the comments I had so far.
>>>>
>>>> I also did scale testing for northd with
>>>> https://github.com/hzhou8/ovn-test-script for a 500 chassis, 50 lsp /
>>>> chassis setup, and noticed that the recompute latency has increased 20%
>>>> after the series. I think in general it is expected to have some
>>>> performance penalty for the recompute case because of the new indexes
>>>> created for I-P. However, the patch 10 "northd: Refactor lflow
>>>> management into a separate module." alone introduces a 10% latency
>>>> increase, which necessitates more investigation.
>>>
>>> Before sending out the series I did some testing on recomputes with a
>>> large OVN NB DB and SB DB
>>> (from a 500 node ovn-heater density heavy run).
>>> I'm aware of the increase in recompute time.  And I did some more
>>> testing today as well.
>>>
>>> In my testing,  I can see that the increase in latency is due to the
>>> new engine nodes added (lr_lbnat mainly)
>>> and due to housekeeping of the lflow references.  I do not see any
>>> increase due to the new lflow-mgr.c added in patch 10.
>>>
>>> I compared patch 9 and patch 10 separately and there is no difference
>>> in lflow engine node recompute time between patch 9 and 10.
>>>
>>
>> My results were different. My test profile simulates the ovn-k8s topology
>> (central mode, not IC) with 500 nodes, 50 LSP/node, with no LB, small
>> amount of ACLs and PGs.
>> (https://github.com/hzhou8/ovn-test-script/blob/main/args.500ch_50lsp_10000pg)
>>
>> The test results for ovn-northd recompute time are:
>> main: 1118.3
>> p9: 1129.5
>> p10: 1243.4 ==> 10% more than p9
>> p18: 1357.6
>>
>> I am not sure if the p10 increase is related to the hash change or
>> something else.
>>

I didn't review p10 in detail yet but I did run some tests and (with gcc
and CFLAGS="-O2 -fno-omit-frame-pointer -fno-common -g") I see no
significant difference between p9 and p10 when running Han's scale test
profile.

Then I also tried the same thing using the 500 nodes + 50 LSP/node perf
test we already have in the repo:

https://github.com/ovn-org/ovn/blob/5ef2a08ade01d698f84e197987ea679d8978d2b9/tests/perf-northd.at#L185

Again, I didn't see any significant change between p9 and p10 on my
laptop.  I wonder what's different in our setups.

Kind of related, I posted a series to improve the in-tree perf testing
and allow us to also gather recompute stats (build the DB, reset
stopwatches and trigger a recompute, then collect stopwatches):

https://patchwork.ozlabs.org/project/ovn/list/?series=383727&state=*
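
With that series, measuring a recompute boils down to something like
this (assuming ovn-northd's usual unixctl commands; the target name may
differ on your setup):

  ovn-appctl -t ovn-northd stopwatch/reset
  ovn-appctl -t ovn-northd inc-engine/recompute
  ovn-appctl -t ovn-northd stopwatch/show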

Would it be an idea to try to merge both scenarios you guys used into
something that's defined as a new test in-tree?  That way it's easier
for anyone to just run the same test.

Thanks,
Dumitru

>>> [snip]
Numan Siddique Nov. 24, 2023, 3:41 p.m. UTC | #6
On Fri, Nov 24, 2023 at 10:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 11/16/23 22:05, Numan Siddique wrote:
> > [snip]
>
> I didn't review p10 in detail yet but I did run some tests and (with gcc
> and CFLAGS="-O2 -fno-omit-frame-pointer -fno-common -g") I see no
> significant difference between p9 and p10 when running Han's scale test
> profile.
>
> Then I also tried the same thing using the 500 nodes + 50 LSP/node perf
> test we already have in the repo:
>
> https://github.com/ovn-org/ovn/blob/5ef2a08ade01d698f84e197987ea679d8978d2b9/tests/perf-northd.at#L185
>
> Again, I didn't see any significant change between p9 and p10 on my
> laptop.  I wonder what's different in our setups.
>
> Kind of related, I posted a series to improve the in-tree perf testing
> and allow us to also gather recompute stats (build the DB, reset
> stopwatches and trigger a recompute, then collect stopwatches):
>
> https://patchwork.ozlabs.org/project/ovn/list/?series=383727&state=*
>
> Would it be an idea to try to merge both scenarios you guys used into
> something that's defined as a new test in-tree?  That way it's easier
> for anyone to just run the same test.

That sounds good to me. I'll take a look at your patches soon.

Thanks for testing them out.  Can you also please compare OVN main vs
p10 of this series?

Thanks
Numan


> [snip]
Dumitru Ceara Nov. 28, 2023, 4:12 p.m. UTC | #7
On 11/24/23 16:41, Numan Siddique wrote:
> On Fri, Nov 24, 2023 at 10:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
>> [snip]
> 
> That sounds good to me. I'll take a look at your patches soon.
> 
> Thanks for testing them out.  Can you also please compare OVN main vs
> p10 of this series?
> 

Sorry for the delay in replying; I ran some more tests, both with Han's
benchmark and with the in-tree check-perf 500 node + 50 LSP/node test.
I did this with debug logs disabled and ran them multiple times to
collect more meaningful averages (the results seemed more stable).

This is what I measured (where "pX" == "patch X of this series" and
values are averages in milliseconds):

-------------------------------------------------------------------------------
| Benchmark | Recompute "main" | Recompute p9 | Recompute p10 | Recompute p16 |
-------------------------------------------------------------------------------
|   Han's   |       1552       |     1560     |      1676     |      1704     |
-------------------------------------------------------------------------------
|  in-tree  |       1146       |     1250     |      1281     |      1337     |
-------------------------------------------------------------------------------

So, in both cases, I see some increase in the time it takes to run full
recomputes.  However, IMO, a hit of 10-15% in recompute time might be
acceptable as long as the most common cases that used to trigger a
recompute are now handled incrementally.

Regards,
Dumitru

> Thanks
> Numan
> 
> 
>>
>> Thanks,
>> Dumitru
>>
>>>>> Below are the recompute times, in msec, for the relevant engine
>>>>> nodes, measured with some of the individual patches in the series
>>>>> applied.
>>>>>
>>>>>
>>>>> The sample OVN DBs have:
>>>>>
>>>>> -----------------------------------
>>>>> Resource                 Total
>>>>> -----------------------------------
>>>>> Logical switches         1001
>>>>> Logical routers          501
>>>>> Router ports             1501
>>>>> Switch ports             29501
>>>>> Load balancers           35253
>>>>> Load balancer groups     1
>>>>> SB logical flows         268349
>>>>> SB DP groups             509
>>>>> -----------------------------------
>>>>>
>>>>> There is one load balancer group which has all the load balancers and
>>>>> it is associated with all the logical switches and routers.
>>>>>
>>>>> Below is the time taken for each engine node in msec
>>>>>
>>>>> ---------------------------------------------------------
>>>>> Engine node      | lb_data | northd | lr_lbnat | lflow |
>>>>> ---------------------------------------------------------
>>>>> ovn-northd-main  | 358     | 2455   | x        | 2082  |
>>>>> ovn-northd-p1    | 373     | 2476   | x        | 2170  |
>>>>> ovn-northd-p5    | 367     | 2413   | x        | 2195  |
>>>>> ovn-northd-p6    | 354     | 688    | 1815     | 2442  |
>>>>> ovn-northd-p7    | 357     | 683    | 1778     | 2806  |
>>>>> ovn-northd-p9    | 352     | 682    | 1781     | 2781  |
>>>>> ovn-northd-p10   | 365     | 838    | 1707     | 2812  |
>>>>> ovn-northd-p13   | 362     | 1066   | 1882     | 2917  |
>>>>> ovn-northd-p15   | 359     | 1046   | 1688     | 2907  |
>>>>> ovn-northd-p18   | 379     | 1066   | 1729     | 2886  |
>>>>> ---------------------------------------------------------
>>>>>
>>>>> ovn-northd-p1 means ovn-northd with patch 1 of this series applied,
>>>>> ovn-northd-p2 with patch 2, and so on.
>>>>>
>>>>> Patch 6 adds the new engine node lr_lbnat, which is why the northd
>>>>> engine node time drops from ~2413 msec to ~688 msec.
>>>>>
>>>>> With the first 10 patches, the northd engine node's overall time
>>>>> increases by ~200 msec compared to "main", and the lflow engine
>>>>> node's time increases by around 800 msec.
>>>>> The point of this data is to show that the overall increase is
>>>>> mainly due to bookkeeping and the new engine nodes.  I tried to
>>>>> optimise further but couldn't.
>>>>>
>>>>>
>>>>> IMO this 20% overall increase should be fine, for the following reasons:
>>>>>    -  The scale tests with ovn-heater show a significant decrease in
>>>>> ovn-northd CPU usage and in the overall test duration.
>>>>>    -  This means very few recomputes were triggered.
>>>>>    -  With the OCP/K8s kube-burner tests, there were 0 recomputes
>>>>> in the northd and lflow engine nodes.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Prior to this patch series, most of the changes to northd engine
>>>>>>> resulted in full recomputation of logical flows.  This series
>>>>>>> aims to improve the performance of ovn-northd by adding the I-P
>>>>>>> support.
>>>>>>
>>>>>> I'd like to clarify "most of the changes" a little. I think we should
>>>>>> focus on the most impactful changes that happen in a cloud environment.
>>>>>> I don't think it is realistic to achieve "most of the changes" in I-P
>>>>>> because it is too complex (even with this series we still handle a very
>>>>>> small part of the DB schema incrementally), but it may be realistic to
>>>>>> handle the most impactful changes, which are the most frequent changes
>>>>>> in the cloud, incrementally. Before this series, we could handle regular
>>>>>> VIF changes and (some of the) related address-set and port-group updates
>>>>>> incrementally. Those are the most common changes related to pod/VM
>>>>>> changes in the cloud. I believe the next goal is LB changes related to
>>>>>> pods/VMs (i.e. the LB backends), which I believe is the main goal of
>>>>>> this series. Is my understanding correct?
>>>>>>
>>>>> That is correct.  LB change handling is very important in my opinion
>>>>> when used with ovn-kubernetes because all the k8s services
>>>>> are mapped to OVN LBs.
>>>>>
>>>>>
>>>>>> While reviewing the patches, I'd say sometimes I feel a little lost
>>>>>> because it is hard to correlate some of the changes with the above goal.
>>>>>
>>>>> I understand that.  Maybe I've done a bad job in conveying why the
>>>>> changes are required for LBs.
>>>>>
>>>>> Let me try to clarify a bit here.  I'll incorporate and add more
>>>>> details in the patch commits and cover letter in the next series.
>>>>>
>>>>> This series mainly targets handling lflow generation for load
>>>>> balancer changes.
>>>>> As you know, we use conntrack in OVN for
>>>>>    - load balancers (in both logical switches and routers),
>>>>>    - ACLs in logical switches, and
>>>>>    - NAT in logical routers.
>>>>>
>>>>> And this makes it complicated to decouple logical flow generation
>>>>> for LBs from ACLs and NATs.  That is the reason I had to split the
>>>>> northd node data related to load balancers, ACLs and NATs into
>>>>> separate engine nodes.
>>>>>
>>>>> To give an example, we generate certain logical flows for a logical
>>>>> switch by checking "if (od->has_lb_vip && od->has_stateful_acl)"
>>>>> (see build_acl_hints()).
>>>>> For logical routers, a NAT can be an LB VIP too, which makes it
>>>>> difficult to decouple LB and NAT.  We could fall back to a recompute
>>>>> for NAT changes, but we would still need to group the logical flows
>>>>> related to LB and NAT.  That's why I added the lr_lbnat_data engine
>>>>> node in patch 6.
>>>>>
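>>>>> To make that coupling concrete, here is a minimal sketch (the two
>>>>> fields come from the condition quoted above; the function itself is
>>>>> illustrative and is not the actual build_acl_hints()):
>>>>>
>>>>>     #include "northd.h"   /* struct ovn_datapath. */
>>>>>
>>>>>     /* Flow generation reads both the LB and the ACL state of the
>>>>>      * datapath, so a change to either input has to reach this same
>>>>>      * code path.  This is why the LB and ACL data had to be split
>>>>>      * into engine nodes that the lflow node can track separately. */
>>>>>     static void
>>>>>     build_hints_sketch(const struct ovn_datapath *od)
>>>>>     {
>>>>>         if (od->has_lb_vip && od->has_stateful_acl) {
>>>>>             /* Emit the hint flows that are only correct when both
>>>>>              * conntrack users (LB and stateful ACL) are present. */
>>>>>         }
>>>>>     }
>>>>>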
>>>>
>>>> Thanks for explaining. However, I wonder if these are related to LB
>>>> backend updates. I agree that service creation/deletion may also be
>>>> important to handle incrementally, but it is less critical (less
>>>> frequent) than the backend updates, which are directly related to
>>>> pod/VM creation/deletion.
>>>
>>> From what I understand, at least for OCP use cases they seem to be
>>> critical.  Also, it depends on the use case, I suppose.
>>>
>>>> The has_xxx and NAT-related handling are more related to service
>>>> creation/deletion than to LB backend updates, right?
>>>
>>> I think so.
>>>
>>> Numan
>>>
>>>>
>>>> Thanks,
>>>> Han
>>>>
>>>>>> I believe there is a reason for all the changes, but I am not sure
>>>>>> they are all directly related to the goal of LB backend related I-P.
>>>>>> I understand that there are other aspects of LB that can be
>>>>>> incrementally processed.  But I'd recommend that, if the changes
>>>>>> necessary for this goal can be largely reduced, it would help the
>>>>>> review, and we might be able to merge them sooner and add more, but
>>>>>> less impactful, I-P later step by step.  I guess I will have a
>>>>>> clearer picture when I finish the rest of the patches, but it would
>>>>>> be helpful if you could add more guidance in this cover letter.
>>>>>
>>>>> I'll definitely add more details in the cover letter.
>>>>>
>>>>> Apart from the LB related I-P, this series adds only 2 more I-P
>>>>> patches.
>>>>> One is for NB_Global changes and the other for SB mac binding
>>>>> changes.
>>>>> ovn-kubernetes periodically writes a timestamp into the
>>>>> NB_Global.options column for its internal health checks,
>>>>> and this results in unnecessary recomputes.  After running the
>>>>> kube-burner density-cni tests, I figured out the most common
>>>>> changes done in the NB DB and added the I-P for those.  With all
>>>>> these patches applied, there are no recomputes in either the northd
>>>>> or the lflow engine node
>>>>> during the test duration.  Not having the I-P for NB_Global, for
>>>>> example, was resulting in significant ovn-northd CPU usage.
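>>>>>
>>>>> Roughly, the NB_Global handler has the following shape (a hedged
>>>>> sketch; the function and helper names here are hypothetical, the
>>>>> real code lives in en-global-config.c):
>>>>>
>>>>>     #include "lib/ovn-nb-idl.h"   /* struct nbrec_nb_global. */
>>>>>
>>>>>     /* Input handlers in the incremental engine return true when a
>>>>>      * change was handled in place, and false to request a full
>>>>>      * recompute of the node. */
>>>>>     static bool
>>>>>     nb_global_sketch_handler(const struct nbrec_nb_global *nb)
>>>>>     {
>>>>>         /* only_ignored_options_changed() is a hypothetical helper
>>>>>          * that checks whether the update touched nothing but
>>>>>          * "uninteresting" options, e.g. the timestamp that
>>>>>          * ovn-kubernetes writes for its health checks. */
>>>>>         if (only_ignored_options_changed(nb)) {
>>>>>             return true;   /* Handled; no recompute needed. */
>>>>>         }
>>>>>         return false;      /* Fall back to a full recompute. */
>>>>>     }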
>>>>>
>>>>> Thanks again for the reviews.
>>>>>
>>>>> Numan
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Han
>>>>>>
>>>>>>>  In order to add this support, some of the northd engine
>>>>>>> node data (from struct ovn_datapath) is split and moved over to
>>>>>>> new engine nodes - mainly related to load balancers, NAT and ACLs.
>>>>>>>
>>>>>>> Below are the scale testing results done with these patches applied
>>>>>>> using ovn-heater.  The test ran the scenario  -
>>>>>>> ocp-500-density-heavy.yml [1].
>>>>>>>
>>>>>>> With all the lflow I-P patches applied, the results are:
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>>                       Min (s)     Median (s)  90%ile (s)  99%ile (s)  Max (s)     Mean (s)    Total (s)   Count  Failed
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>> Iteration Total       0.136883    1.129016    1.192001    1.204167    1.212728    0.665017    83.127099   125    0
>>>>>>> Namespace.add_ports   0.005216    0.005736    0.007034    0.015486    0.018978    0.006211    0.776373    125    0
>>>>>>> WorkerNode.bind_port  0.035030    0.046082    0.052469    0.058293    0.060311    0.045973    11.493259   250    0
>>>>>>> WorkerNode.ping_port  0.005057    0.006727    1.047692    1.069253    1.071336    0.266896    66.724094   250    0
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>>
>>>>>>> The results with the present main are:
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>>                       Min (s)     Median (s)  90%ile (s)  99%ile (s)  Max (s)     Mean (s)    Total (s)   Count  Failed
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>> Iteration Total       0.135491    2.223805    3.311270    3.339078    3.345346    1.729172    216.146495  125    0
>>>>>>> Namespace.add_ports   0.005380    0.005744    0.006819    0.018773    0.020800    0.006292    0.786532    125    0
>>>>>>> WorkerNode.bind_port  0.034179    0.046055    0.053488    0.058801    0.071043    0.046117    11.529311   250    0
>>>>>>> WorkerNode.ping_port  0.004956    0.006952    3.086952    3.191743    3.192807    0.791544    197.886026  250    0
>>>>>>> -----------------------------------------------------------------------------------------------------------------------
>>>>>>>
>>>>>>> [1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
>>>>>>>
>>>>>>> v1 -> v2
>>>>>>> --------
>>>>>>>    * Now also maintaining array indexes for the ls_lbacls, lr_nat
>>>>>>>      and lr_lb_nat_data tables (similar to ovn_datapaths->array)
>>>>>>>      to make the lookups efficient.  The same ovn_datapath->index
>>>>>>>      is reused.
>>>>>>>
>>>>>>>    * Made some significant changes to 'struct lflow_ref' in
>>>>>>>      lflow-mgr.c.
>>>>>>>      In v2 we don't use objdep_mgr to maintain the resource-to-lflow
>>>>>>>      references.  Instead we maintain the 'struct lflow' pointer
>>>>>>>      directly (see the sketch below).
>>>>>>>      With this we don't need to maintain an additional hmap of
>>>>>>>      lflows.
>>>>>>>
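>>>>>>> An illustrative sketch of the v2 bookkeeping mentioned above
>>>>>>> (simplified; these exact field names are not taken from the
>>>>>>> patch):
>>>>>>>
>>>>>>>     #include "openvswitch/list.h"   /* struct ovs_list. */
>>>>>>>
>>>>>>>     struct lflow;   /* Defined in lflow-mgr.c. */
>>>>>>>
>>>>>>>     /* Each resource (e.g. a load balancer) owns an lflow_ref that
>>>>>>>      * points directly at the logical flows it generated, so its
>>>>>>>      * flows can be cleared and re-added without an objdep_mgr
>>>>>>>      * lookup or a separate hmap of lflows. */
>>>>>>>     struct lflow_ref_node {
>>>>>>>         struct ovs_list list_node;  /* In struct lflow_ref's 'lflows'. */
>>>>>>>         struct lflow *lflow;        /* Direct pointer to the lflow. */
>>>>>>>     };
>>>>>>>
>>>>>>>     struct lflow_ref {
>>>>>>>         struct ovs_list lflows;     /* List of struct lflow_ref_node. */
>>>>>>>     };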
>>>>>>>
>>>>>>> Numan Siddique (18):
>>>>>>>   northd: Refactor the northd change tracking.
>>>>>>>   northd: Track ovn_datapaths in northd engine track data.
>>>>>>>   tests: Add a couple of tests in ovn-northd for I-P.
>>>>>>>   northd: Move router ports SB PB options sync to sync_to_sb_pb node.
>>>>>>>   northd: Add a new engine 'lr-nat' to manage lr NAT data.
>>>>>>>   northd: Add a new engine 'lr-lb-nat-data' to manage routers' lb
>>>> data.
>>>>>>>   northd:  Generate logical router's LB and NAT flows using
>>>>>>>     lr_lbnat_data.
>>>>>>>   northd: Don't commit dhcp response flows in the conntrack.
>>>>>>>   northd: Add a new node ls_lbacls.
>>>>>>>   northd: Refactor lflow management into a separate module.
>>>>>>>   northd: Use lflow_ref when adding all logical flows.
>>>>>>>   northd:  Move ovn_lb_datapaths from lib to northd module.
>>>>>>>   northd: Handle lb changes in lflow engine.
>>>>>>>   northd: Add lr_lb_nat_data handler for lflow engine node.
>>>>>>>   northd: Add ls_lbacls handler for lflow engine node.
>>>>>>>   northd:  Add I-P for NB_Global and SB_Global.
>>>>>>>   northd: Add a noop handler for northd SB mac binding.
>>>>>>>   northd: Add northd change handler for sync_to_sb_lb node.
>>>>>>>
>>>>>>>  lib/lb.c                   |   96 -
>>>>>>>  lib/lb.h                   |   57 -
>>>>>>>  lib/ovn-util.c             |   17 +-
>>>>>>>  lib/ovn-util.h             |    2 +-
>>>>>>>  lib/stopwatch-names.h      |    3 +
>>>>>>>  northd/aging.c             |   21 +-
>>>>>>>  northd/automake.mk         |   12 +-
>>>>>>>  northd/en-global-config.c  |  588 ++++
>>>>>>>  northd/en-global-config.h  |   65 +
>>>>>>>  northd/en-lflow.c          |  123 +-
>>>>>>>  northd/en-lflow.h          |    8 +
>>>>>>>  northd/en-lr-lb-nat-data.c |  685 ++++
>>>>>>>  northd/en-lr-lb-nat-data.h |  125 +
>>>>>>>  northd/en-lr-nat.c         |  498 +++
>>>>>>>  northd/en-lr-nat.h         |  137 +
>>>>>>>  northd/en-ls-lb-acls.c     |  530 +++
>>>>>>>  northd/en-ls-lb-acls.h     |  111 +
>>>>>>>  northd/en-northd.c         |   67 +-
>>>>>>>  northd/en-northd.h         |    2 +-
>>>>>>>  northd/en-port-group.h     |    3 +
>>>>>>>  northd/en-sync-sb.c        |  513 ++-
>>>>>>>  northd/inc-proc-northd.c   |   79 +-
>>>>>>>  northd/lflow-mgr.c         | 1078 ++++++
>>>>>>>  northd/lflow-mgr.h         |  192 ++
>>>>>>>  northd/northd.c            | 6485
>>>> ++++++++++++++++--------------------
>>>>>>>  northd/northd.h            |  551 ++-
>>>>>>>  northd/ovn-northd.c        |    4 +
>>>>>>>  tests/ovn-northd.at        |  858 ++++-
>>>>>>>  28 files changed, 8827 insertions(+), 4083 deletions(-)
>>>>>>>  create mode 100644 northd/en-global-config.c
>>>>>>>  create mode 100644 northd/en-global-config.h
>>>>>>>  create mode 100644 northd/en-lr-lb-nat-data.c
>>>>>>>  create mode 100644 northd/en-lr-lb-nat-data.h
>>>>>>>  create mode 100644 northd/en-lr-nat.c
>>>>>>>  create mode 100644 northd/en-lr-nat.h
>>>>>>>  create mode 100644 northd/en-ls-lb-acls.c
>>>>>>>  create mode 100644 northd/en-ls-lb-acls.h
>>>>>>>  create mode 100644 northd/lflow-mgr.c
>>>>>>>  create mode 100644 northd/lflow-mgr.h
>>>>>>>
>>>>>>> --
>>>>>>> 2.41.0
>>>>>>>
>