diff mbox series

[ovs-dev,v2,07/14] northd: Incremental processing of VIF changes in 'northd' node.

Message ID 20230602041150.3019311-8-hzhou@ovn.org
State Accepted
Headers show
Series ovn-northd incremental processing for VIF changes | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Han Zhou June 2, 2023, 4:11 a.m. UTC
This patch introduces a change handler for NB logical_switch within the
'northd' node. It specifically handles cases where logical switch ports
in the tracked logical switches are changed (added/updated/deleted).
Only regular logical switch ports - which are common for VMs/Pods - are
addressed. For other scenarios, it reverts to recompute.

This update avoids recompute in the northd node (especially the
resource-intensive build_ports()) for NB changes of VIF
add/update/delete.  However, it does not eliminate the need for lflow
recompute.

Below are the performance test results simulating an ovn-k8s topology of
500 nodes x 50 lsp per node:

Before:
ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
    ovn-northd completion:          955ms
ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
    ovn-northd completion:          919ms

After:
ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
    ovn-northd completion:          776ms
ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
    ovn-northd completion:          773ms

Both addition and deletion show ~20% reduction of ovn-northd completion
time.  Note: the test uses only 1 thread of ovn-northd for flow
recompute. Using multithread should show a larger percentage of
improvement.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Reviewed-by: Ales Musil <amusil@redhat.com>
---
 lib/ovn-util.c           |  15 ++
 lib/ovn-util.h           |   1 +
 northd/en-northd.c       | 130 ++++++---
 northd/en-northd.h       |   2 +
 northd/inc-proc-northd.c |  12 +-
 northd/northd.c          | 567 +++++++++++++++++++++++++++++++++------
 northd/northd.h          |  27 +-
 tests/ovn-northd.at      |  58 ++++
 8 files changed, 685 insertions(+), 127 deletions(-)

Comments

0-day Robot June 2, 2023, 4:19 a.m. UTC | #1
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#761 FILE: northd/northd.c:4687:
    /* XXX: Add tag_alloc_table and queue_id_bitmap as part of northd_data

WARNING: Comment with 'xxx' marker
#822 FILE: northd/northd.c:4847:
        /* Currently not support "unknown" address handling.  XXX: Need to

WARNING: Comment with 'xxx' marker
#840 FILE: northd/northd.c:4865:
    /* XXX: Need a better OVSDB IDL interface for this check. */

WARNING: Comment with 'xxx' marker
#883 FILE: northd/northd.c:4908:
        /* XXX: the new SB port_binding will change in IDL, so need to handle

Lines checked: 1285, Warnings: 4, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Numan Siddique June 7, 2023, 7:38 p.m. UTC | #2
On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
>
> This patch introduces a change handler for NB logical_switch within the
> 'northd' node. It specifically handles cases where logical switch ports
> in the tracked logical switches are changed (added/updated/deleted).
> Only regular logical switch ports - which are common for VMs/Pods - are
> addressed. For other scenarios, it reverts to recompute.
>
> This update avoids recompute in the northd node (especially the
> resource-intensive build_ports()) for NB changes of VIF
> add/update/delete.  However, it does not eliminate the need for lflow
> recompute.
>
> Below are the performance test results simulating an ovn-k8s topology of
> 500 nodes x 50 lsp per node:
>
> Before:
> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>     ovn-northd completion:          955ms
> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>     ovn-northd completion:          919ms
>
> After:
> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>     ovn-northd completion:          776ms
> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>     ovn-northd completion:          773ms
>
> Both addition and deletion show ~20% reduction of ovn-northd completion
> time.  Note: the test uses only 1 thread of ovn-northd for flow
> recompute. Using multithread should show a larger percentage of
> improvement.
>
> Signed-off-by: Han Zhou <hzhou@ovn.org>
> Reviewed-by: Ales Musil <amusil@redhat.com>

Few nits below.

Acked-by: Numan Siddique <numans@ovn.org>

Numan


> ---
>  lib/ovn-util.c           |  15 ++
>  lib/ovn-util.h           |   1 +
>  northd/en-northd.c       | 130 ++++++---
>  northd/en-northd.h       |   2 +
>  northd/inc-proc-northd.c |  12 +-
>  northd/northd.c          | 567 +++++++++++++++++++++++++++++++++------
>  northd/northd.h          |  27 +-
>  tests/ovn-northd.at      |  58 ++++
>  8 files changed, 685 insertions(+), 127 deletions(-)
>
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index bffb521cfd9d..da2a9d45ac64 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -720,6 +720,21 @@ ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
>      return 0;
>  }
>
> +bool
> +ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid)
> +{
> +    uint32_t hash = hash_int(tnlid, 0);
> +    struct tnlid_node *node;
> +    HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash, tnlids) {
> +        if (node->tnlid == tnlid) {
> +            hmap_remove(tnlids, &node->hmap_node);
> +            free(node);
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
>  char *
>  ovn_chassis_redirect_name(const char *port_name)
>  {
> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> index b17b0e2364c5..9681a12219a9 100644
> --- a/lib/ovn-util.h
> +++ b/lib/ovn-util.h
> @@ -167,6 +167,7 @@ bool ovn_add_tnlid(struct hmap *set, uint32_t tnlid);
>  bool ovn_tnlid_present(struct hmap *tnlids, uint32_t tnlid);
>  uint32_t ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
>                              uint32_t max, uint32_t *hint);
> +bool ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid);
>
>  static inline void
>  get_unique_lport_key(uint64_t dp_tunnel_key, uint64_t lport_tunnel_key,
> diff --git a/northd/en-northd.c b/northd/en-northd.c
> index a3dc37e198e3..f2bf98f774b1 100644
> --- a/northd/en-northd.c
> +++ b/northd/en-northd.c
> @@ -31,92 +31,103 @@
>
>  VLOG_DEFINE_THIS_MODULE(en_northd);
>
> -void en_northd_run(struct engine_node *node, void *data)
> +static void
> +northd_get_input_data(struct engine_node *node,
> +                      struct northd_input *input_data)
>  {
> -    const struct engine_context *eng_ctx = engine_get_context();
> -
> -    struct northd_input input_data;
> -
> -    northd_destroy(data);
> -    northd_init(data);
> -
> -    input_data.sbrec_chassis_by_name =
> +    input_data->sbrec_chassis_by_name =
>          engine_ovsdb_node_get_index(
>              engine_get_input("SB_chassis", node),
>              "sbrec_chassis_by_name");
> -    input_data.sbrec_chassis_by_hostname =
> +    input_data->sbrec_chassis_by_hostname =
>          engine_ovsdb_node_get_index(
>              engine_get_input("SB_chassis", node),
>              "sbrec_chassis_by_hostname");
> -    input_data.sbrec_ha_chassis_grp_by_name =
> +    input_data->sbrec_ha_chassis_grp_by_name =
>          engine_ovsdb_node_get_index(
>              engine_get_input("SB_ha_chassis_group", node),
>              "sbrec_ha_chassis_grp_by_name");
> -    input_data.sbrec_ip_mcast_by_dp =
> +    input_data->sbrec_ip_mcast_by_dp =
>          engine_ovsdb_node_get_index(
>              engine_get_input("SB_ip_multicast", node),
>              "sbrec_ip_mcast_by_dp");
> -    input_data.sbrec_static_mac_binding_by_lport_ip =
> +    input_data->sbrec_static_mac_binding_by_lport_ip =
>          engine_ovsdb_node_get_index(
>              engine_get_input("SB_static_mac_binding", node),
>              "sbrec_static_mac_binding_by_lport_ip");
> +    input_data->sbrec_fdb_by_dp_and_port =
> +        engine_ovsdb_node_get_index(
> +            engine_get_input("SB_fdb", node),
> +            "sbrec_fdb_by_dp_and_port");
>
> -    input_data.nbrec_nb_global_table =
> +    input_data->nbrec_nb_global_table =
>          EN_OVSDB_GET(engine_get_input("NB_nb_global", node));
> -    input_data.nbrec_logical_switch_table =
> +    input_data->nbrec_logical_switch_table =
>          EN_OVSDB_GET(engine_get_input("NB_logical_switch", node));
> -    input_data.nbrec_logical_router_table =
> +    input_data->nbrec_logical_router_table =
>          EN_OVSDB_GET(engine_get_input("NB_logical_router", node));
> -    input_data.nbrec_load_balancer_table =
> +    input_data->nbrec_load_balancer_table =
>          EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
> -    input_data.nbrec_load_balancer_group_table =
> +    input_data->nbrec_load_balancer_group_table =
>          EN_OVSDB_GET(engine_get_input("NB_load_balancer_group", node));
> -    input_data.nbrec_port_group_table =
> +    input_data->nbrec_port_group_table =
>          EN_OVSDB_GET(engine_get_input("NB_port_group", node));
> -    input_data.nbrec_meter_table =
> +    input_data->nbrec_meter_table =
>          EN_OVSDB_GET(engine_get_input("NB_meter", node));
> -    input_data.nbrec_acl_table =
> +    input_data->nbrec_acl_table =
>          EN_OVSDB_GET(engine_get_input("NB_acl", node));
> -    input_data.nbrec_static_mac_binding_table =
> +    input_data->nbrec_static_mac_binding_table =
>          EN_OVSDB_GET(engine_get_input("NB_static_mac_binding", node));
> -    input_data.nbrec_chassis_template_var_table =
> +    input_data->nbrec_chassis_template_var_table =
>          EN_OVSDB_GET(engine_get_input("NB_chassis_template_var", node));
> -    input_data.nbrec_mirror_table =
> +    input_data->nbrec_mirror_table =
>          EN_OVSDB_GET(engine_get_input("NB_mirror", node));
>
> -    input_data.sbrec_sb_global_table =
> +    input_data->sbrec_sb_global_table =
>          EN_OVSDB_GET(engine_get_input("SB_sb_global", node));
> -    input_data.sbrec_datapath_binding_table =
> +    input_data->sbrec_datapath_binding_table =
>          EN_OVSDB_GET(engine_get_input("SB_datapath_binding", node));
> -    input_data.sbrec_port_binding_table =
> +    input_data->sbrec_port_binding_table =
>          EN_OVSDB_GET(engine_get_input("SB_port_binding", node));
> -    input_data.sbrec_mac_binding_table =
> +    input_data->sbrec_mac_binding_table =
>          EN_OVSDB_GET(engine_get_input("SB_mac_binding", node));
> -    input_data.sbrec_ha_chassis_group_table =
> +    input_data->sbrec_ha_chassis_group_table =
>          EN_OVSDB_GET(engine_get_input("SB_ha_chassis_group", node));
> -    input_data.sbrec_chassis_table =
> +    input_data->sbrec_chassis_table =
>          EN_OVSDB_GET(engine_get_input("SB_chassis", node));
> -    input_data.sbrec_fdb_table =
> +    input_data->sbrec_fdb_table =
>          EN_OVSDB_GET(engine_get_input("SB_fdb", node));
> -    input_data.sbrec_load_balancer_table =
> +    input_data->sbrec_load_balancer_table =
>          EN_OVSDB_GET(engine_get_input("SB_load_balancer", node));
> -    input_data.sbrec_service_monitor_table =
> +    input_data->sbrec_service_monitor_table =
>          EN_OVSDB_GET(engine_get_input("SB_service_monitor", node));
> -    input_data.sbrec_port_group_table =
> +    input_data->sbrec_port_group_table =
>          EN_OVSDB_GET(engine_get_input("SB_port_group", node));
> -    input_data.sbrec_meter_table =
> +    input_data->sbrec_meter_table =
>          EN_OVSDB_GET(engine_get_input("SB_meter", node));
> -    input_data.sbrec_dns_table =
> +    input_data->sbrec_dns_table =
>          EN_OVSDB_GET(engine_get_input("SB_dns", node));
> -    input_data.sbrec_ip_multicast_table =
> +    input_data->sbrec_ip_multicast_table =
>          EN_OVSDB_GET(engine_get_input("SB_ip_multicast", node));
> -    input_data.sbrec_static_mac_binding_table =
> +    input_data->sbrec_static_mac_binding_table =
>          EN_OVSDB_GET(engine_get_input("SB_static_mac_binding", node));
> -    input_data.sbrec_chassis_template_var_table =
> +    input_data->sbrec_chassis_template_var_table =
>          EN_OVSDB_GET(engine_get_input("SB_chassis_template_var", node));
> -    input_data.sbrec_mirror_table =
> +    input_data->sbrec_mirror_table =
>          EN_OVSDB_GET(engine_get_input("SB_mirror", node));
> +}
>
> +void
> +en_northd_run(struct engine_node *node, void *data)
> +{
> +    const struct engine_context *eng_ctx = engine_get_context();
> +
> +    struct northd_input input_data;
> +
> +    northd_destroy(data);
> +    northd_init(data);
> +
> +    northd_get_input_data(node, &input_data);
>      northd_run(&input_data, data,
>                 eng_ctx->ovnnb_idl_txn,
>                 eng_ctx->ovnsb_idl_txn);
> @@ -148,17 +159,48 @@ northd_nb_nb_global_handler(struct engine_node *node,
>      return true;
>  }
>
> -void *en_northd_init(struct engine_node *node OVS_UNUSED,
> -                     struct engine_arg *arg OVS_UNUSED)
> +bool
> +northd_nb_logical_switch_handler(struct engine_node *node,
> +                                 void *data)
>  {
> -    struct northd_data *data = xmalloc(sizeof *data);
> +    const struct engine_context *eng_ctx = engine_get_context();
> +    struct northd_data *nd = data;
> +
> +    struct northd_input input_data;
> +
> +    northd_get_input_data(node, &input_data);
> +
> +    if (!northd_handle_ls_changes(eng_ctx->ovnsb_idl_txn, &input_data, nd)) {
> +        return false;
> +    }
> +
> +    if (nd->change_tracked) {
> +        engine_set_node_state(node, EN_UPDATED);
> +    }
> +
> +    return true;
> +}
> +
> +void
> +*en_northd_init(struct engine_node *node OVS_UNUSED,
> +                struct engine_arg *arg OVS_UNUSED)
> +{
> +    struct northd_data *data = xzalloc(sizeof *data);
>
>      northd_init(data);
>
>      return data;
>  }
>
> -void en_northd_cleanup(void *data)
> +void
> +en_northd_cleanup(void *data)
>  {
>      northd_destroy(data);
>  }
> +
> +void
> +en_northd_clear_tracked_data(void *data_)
> +{
> +    struct northd_data *data = data_;
> +    destroy_northd_data_tracked_changes(data);
> +}
> diff --git a/northd/en-northd.h b/northd/en-northd.h
> index 8d8343b459a6..a53a162bda48 100644
> --- a/northd/en-northd.h
> +++ b/northd/en-northd.h
> @@ -13,6 +13,8 @@ void en_northd_run(struct engine_node *node OVS_UNUSED, void *data OVS_UNUSED);
>  void *en_northd_init(struct engine_node *node OVS_UNUSED,
>                       struct engine_arg *arg);
>  void en_northd_cleanup(void *data);
> +void en_northd_clear_tracked_data(void *data);
>  bool northd_nb_nb_global_handler(struct engine_node *, void *data OVS_UNUSED);
> +bool northd_nb_logical_switch_handler(struct engine_node *, void *data);
>
>  #endif /* EN_NORTHD_H */
> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> index 863c9323c444..f992a9ec8420 100644
> --- a/northd/inc-proc-northd.c
> +++ b/northd/inc-proc-northd.c
> @@ -128,7 +128,7 @@ enum sb_engine_node {
>
>  /* Define engine nodes for other nodes. They should be defined as static to
>   * avoid sparse errors. */
> -static ENGINE_NODE(northd, "northd");
> +static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(northd, "northd");
>  static ENGINE_NODE(lflow, "lflow");
>  static ENGINE_NODE(mac_binding_aging, "mac_binding_aging");
>  static ENGINE_NODE(mac_binding_aging_waker, "mac_binding_aging_waker");
> @@ -143,7 +143,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>       * on the second argument */
>      engine_add_input(&en_northd, &en_nb_nb_global,
>                       northd_nb_nb_global_handler);
> -    engine_add_input(&en_northd, &en_nb_logical_switch, NULL);
> +    engine_add_input(&en_northd, &en_nb_logical_switch,
> +                     northd_nb_logical_switch_handler);
>      engine_add_input(&en_northd, &en_nb_port_group, NULL);
>      engine_add_input(&en_northd, &en_nb_load_balancer, NULL);
>      engine_add_input(&en_northd, &en_nb_load_balancer_group, NULL);
> @@ -252,6 +253,13 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>                                  "sbrec_address_set_by_name",
>                                  sbrec_address_set_by_name);
>
> +    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port
> +        = ovsdb_idl_index_create2(sb->idl, &sbrec_fdb_col_dp_key,
> +                                  &sbrec_fdb_col_port_key);
> +    engine_ovsdb_node_add_index(&en_sb_fdb,
> +                                "sbrec_fdb_by_dp_and_port",
> +                                sbrec_fdb_by_dp_and_port);
> +
>      struct northd_data *northd_data =
>          engine_get_internal_data(&en_northd);
>      unixctl_command_register("debug/chassis-features-list", "", 0, 0,
> diff --git a/northd/northd.c b/northd/northd.c
> index d79f075681d9..1a4e24978642 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -19,6 +19,7 @@
>
>  #include "debug.h"
>  #include "bitmap.h"
> +#include "coverage.h"
>  #include "dirs.h"
>  #include "ipam.h"
>  #include "openvswitch/dynamic-string.h"
> @@ -59,6 +60,8 @@
>
>  VLOG_DEFINE_THIS_MODULE(northd);
>
> +COVERAGE_DEFINE(northd_run);
> +
>  static bool controller_event_en;
>  static bool lflow_hash_lock_initialized = false;
>
> @@ -807,13 +810,20 @@ ovn_datapath_create(struct hmap *datapaths, const struct uuid *key,
>      od->port_key_hint = 0;
>      hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
>      od->lr_group = NULL;
> -    ovs_list_init(&od->port_list);
> +    hmap_init(&od->ports);
>      return od;
>  }
>
>  static void ovn_ls_port_group_destroy(struct hmap *nb_pgs);
>  static void destroy_mcast_info_for_datapath(struct ovn_datapath *od);
>
> +static void
> +destroy_ports_for_datapath(struct ovn_datapath *od)
> +{
> +    ovs_assert(hmap_is_empty(&od->ports));
> +    hmap_destroy(&od->ports);
> +}
> +
>  static void
>  ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
>  {
> @@ -834,6 +844,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
>          free(od->l3dgw_ports);
>          ovn_ls_port_group_destroy(&od->nb_pgs);
>          destroy_mcast_info_for_datapath(od);
> +        destroy_ports_for_datapath(od);
>
>          free(od);
>      }
> @@ -1480,6 +1491,9 @@ struct ovn_port {
>      struct lport_addresses *ps_addrs;   /* Port security addresses. */
>      unsigned int n_ps_addrs;
>
> +    bool lsp_can_be_inc_processed; /* If it can be incrementally processed when
> +                                      the port changes. */
> +
>      /* Logical router port data. */
>      const struct nbrec_logical_router_port *nbrp; /* May be NULL. */
>
> @@ -1521,11 +1535,16 @@ struct ovn_port {
>
>      struct ovs_list list;       /* In list of similar records. */
>
> -    struct ovs_list dp_node;
> +    struct hmap_node dp_node;   /* Node in od->ports. */
>
>      struct lport_addresses proxy_arp_addrs;
> +
> +    /* Temporarily used for traversing a list (or hmap) of ports. */
> +    bool visited;
>  };
>
> +static bool lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *);
> +
>  static bool
>  is_l3dgw_port(const struct ovn_port *op)
>  {
> @@ -1585,6 +1604,9 @@ ovn_port_set_nb(struct ovn_port *op,
>                  const struct nbrec_logical_router_port *nbrp)
>  {
>      op->nbsp = nbsp;
> +    if (nbsp) {
> +        op->lsp_can_be_inc_processed = lsp_can_be_inc_processed(nbsp);
> +    }
>      op->nbrp = nbrp;
>      init_mcast_port_info(&op->mcast_info, op->nbsp, op->nbrp);
>  }
> @@ -1610,35 +1632,46 @@ ovn_port_create(struct hmap *ports, const char *key,
>  }
>
>  static void
> -ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
> +ovn_port_destroy_orphan(struct ovn_port *port)
>  {
> -    if (port) {
> -        /* Don't remove port->list.  It is used within build_ports() as a
> -         * private list and once we've exited that function it is not safe to
> -         * use it. */
> -        hmap_remove(ports, &port->key_node);
> +    if (port->tunnel_key) {
> +        ovs_assert(port->od);
> +        ovn_free_tnlid(&port->od->port_tnlids, port->tunnel_key);
> +    }
> +    for (int i = 0; i < port->n_lsp_addrs; i++) {
> +        destroy_lport_addresses(&port->lsp_addrs[i]);
> +    }
> +    free(port->lsp_addrs);
>
> -        if (port->peer) {
> -            port->peer->peer = NULL;
> -        }
> +    if (port->peer) {
> +        port->peer->peer = NULL;
> +    }
>
> -        for (int i = 0; i < port->n_lsp_addrs; i++) {
> -            destroy_lport_addresses(&port->lsp_addrs[i]);
> -        }
> -        free(port->lsp_addrs);
> +    for (int i = 0; i < port->n_ps_addrs; i++) {
> +        destroy_lport_addresses(&port->ps_addrs[i]);
> +    }
> +    free(port->ps_addrs);
>
> -        for (int i = 0; i < port->n_ps_addrs; i++) {
> -            destroy_lport_addresses(&port->ps_addrs[i]);
> -        }
> -        free(port->ps_addrs);
> +    destroy_routable_addresses(&port->routables);
>
> -        destroy_routable_addresses(&port->routables);
> +    destroy_lport_addresses(&port->lrp_networks);
> +    destroy_lport_addresses(&port->proxy_arp_addrs);
> +    free(port->json_key);
> +    free(port->key);
> +    free(port);
> +}
>
> -        destroy_lport_addresses(&port->lrp_networks);
> -        destroy_lport_addresses(&port->proxy_arp_addrs);
> -        free(port->json_key);
> -        free(port->key);
> -        free(port);
> +static void
> +ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
> +{
> +    if (port) {
> +        /* Don't remove port->list. The node should be removed from such lists
> +         * before calling this function. */

nit:  Please update the comment accordingly (from list to hmap)



> +        hmap_remove(ports, &port->key_node);
> +        if (port->od && !port->l3dgw_port) {
> +            hmap_remove(&port->od->ports, &port->dp_node);
> +        }
> +        ovn_port_destroy_orphan(port);
>      }
>  }
>
> @@ -2408,6 +2441,53 @@ tag_alloc_create_new_tag(struct hmap *tag_alloc_table,
>  }
>
>
> +static void
> +parse_lsp_addrs(struct ovn_port *op)
> +{
> +    const struct nbrec_logical_switch_port *nbsp = op->nbsp;
> +    ovs_assert(nbsp);
> +    op->lsp_addrs
> +        = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
> +    for (size_t j = 0; j < nbsp->n_addresses; j++) {
> +        if (!strcmp(nbsp->addresses[j], "unknown")) {
> +            op->has_unknown = true;
> +            continue;
> +        }
> +        if (!strcmp(nbsp->addresses[j], "router")) {
> +            continue;
> +        }
> +        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> +            continue;
> +        } else if (!extract_lsp_addresses(nbsp->addresses[j],
> +                               &op->lsp_addrs[op->n_lsp_addrs])) {
> +            static struct vlog_rate_limit rl
> +                = VLOG_RATE_LIMIT_INIT(1, 1);
> +            VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
> +                              "switch port addresses. No MAC "
> +                              "address found",
> +                              op->nbsp->addresses[j]);
> +            continue;
> +        }
> +        op->n_lsp_addrs++;
> +    }
> +    op->n_lsp_non_router_addrs = op->n_lsp_addrs;
> +
> +    op->ps_addrs
> +        = xmalloc(sizeof *op->ps_addrs * nbsp->n_port_security);
> +    for (size_t j = 0; j < nbsp->n_port_security; j++) {
> +        if (!extract_lsp_addresses(nbsp->port_security[j],
> +                                   &op->ps_addrs[op->n_ps_addrs])) {
> +            static struct vlog_rate_limit rl
> +                = VLOG_RATE_LIMIT_INIT(1, 1);
> +            VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
> +                              "security. No MAC address found",
> +                              op->nbsp->port_security[j]);
> +            continue;
> +        }
> +        op->n_ps_addrs++;
> +    }
> +}
> +
>  static void
>  join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
>                     struct hmap *ls_datapaths, struct hmap *lr_datapaths,
> @@ -2504,49 +2584,11 @@ join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
>                  od->has_vtep_lports = true;
>              }
>
> -            op->lsp_addrs
> -                = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
> -            for (size_t j = 0; j < nbsp->n_addresses; j++) {
> -                if (!strcmp(nbsp->addresses[j], "unknown")) {
> -                    op->has_unknown = true;
> -                    continue;
> -                }
> -                if (!strcmp(nbsp->addresses[j], "router")) {
> -                    continue;
> -                }
> -                if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> -                    continue;
> -                } else if (!extract_lsp_addresses(nbsp->addresses[j],
> -                                       &op->lsp_addrs[op->n_lsp_addrs])) {
> -                    static struct vlog_rate_limit rl
> -                        = VLOG_RATE_LIMIT_INIT(1, 1);
> -                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
> -                                      "switch port addresses. No MAC "
> -                                      "address found",
> -                                      op->nbsp->addresses[j]);
> -                    continue;
> -                }
> -                op->n_lsp_addrs++;
> -            }
> -            op->n_lsp_non_router_addrs = op->n_lsp_addrs;
> -
> -            op->ps_addrs
> -                = xmalloc(sizeof *op->ps_addrs * nbsp->n_port_security);
> -            for (size_t j = 0; j < nbsp->n_port_security; j++) {
> -                if (!extract_lsp_addresses(nbsp->port_security[j],
> -                                           &op->ps_addrs[op->n_ps_addrs])) {
> -                    static struct vlog_rate_limit rl
> -                        = VLOG_RATE_LIMIT_INIT(1, 1);
> -                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
> -                                      "security. No MAC address found",
> -                                      op->nbsp->port_security[j]);
> -                    continue;
> -                }
> -                op->n_ps_addrs++;
> -            }
> +            parse_lsp_addrs(op);
>
>              op->od = od;
> -            ovs_list_push_back(&od->port_list, &op->dp_node);
> +            hmap_insert(&od->ports, &op->dp_node,
> +                        hmap_node_hash(&op->key_node));
>              tag_alloc_add_existing_tags(tag_alloc_table, nbsp);
>          }
>      }
> @@ -2593,7 +2635,8 @@ join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
>
>              op->lrp_networks = lrp_networks;
>              op->od = od;
> -            ovs_list_push_back(&od->port_list, &op->dp_node);
> +            hmap_insert(&od->ports, &op->dp_node,
> +                        hmap_node_hash(&op->key_node));
>
>              if (!od->redirect_bridged) {
>                  const char *redirect_type =
> @@ -3314,6 +3357,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
>          struct smap new;
>          smap_init(&new);
>          if (is_cr_port(op)) {
> +            ovs_assert(sbrec_chassis_by_name);
> +            ovs_assert(sbrec_chassis_by_hostname);
> +            ovs_assert(sbrec_ha_chassis_grp_by_name);
> +            ovs_assert(active_ha_chassis_grps);
>              const char *redirect_type = smap_get(&op->nbrp->options,
>                                                   "redirect-type");
>
> @@ -3423,8 +3470,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
>              struct smap options;
>
>              if (has_qos && !queue_id) {
> +                ovs_assert(queue_id_bitmap);
>                  queue_id = allocate_queueid(queue_id_bitmap);
>              } else if (!has_qos && queue_id) {
> +                ovs_assert(queue_id_bitmap);
>                  bitmap_set0(queue_id_bitmap, queue_id);
>                  queue_id = 0;
>              }
> @@ -3473,6 +3522,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
>              sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0);
>
>              if (!strcmp(op->nbsp->type, "external")) {
> +                ovs_assert(sbrec_chassis_by_name);
> +                ovs_assert(sbrec_chassis_by_hostname);
> +                ovs_assert(sbrec_ha_chassis_grp_by_name);
> +                ovs_assert(active_ha_chassis_grps);
>                  if (op->nbsp->ha_chassis_group) {
>                      sync_ha_chassis_group_for_sbpb(
>                          ovnsb_txn, sbrec_chassis_by_name,
> @@ -3729,6 +3782,24 @@ cleanup_stale_fdb_entries(const struct sbrec_fdb_table *sbrec_fdb_table,
>      }
>  }
>
> +static void
> +delete_fdb_entry(struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port,
> +                 uint32_t dp_key, uint32_t port_key)
> +{
> +    struct sbrec_fdb *target =
> +        sbrec_fdb_index_init_row(sbrec_fdb_by_dp_and_port);
> +    sbrec_fdb_index_set_dp_key(target, dp_key);
> +    sbrec_fdb_index_set_port_key(target, port_key);
> +
> +    struct sbrec_fdb *fdb_e = sbrec_fdb_index_find(sbrec_fdb_by_dp_and_port,
> +                                                   target);
> +    sbrec_fdb_index_destroy_row(target);
> +
> +    if (fdb_e) {
> +        sbrec_fdb_delete(fdb_e);
> +    }
> +}
> +
>  struct service_monitor_info {
>      struct hmap_node hmap_node;
>      const struct sbrec_service_monitor *sbrec_mon;
> @@ -3820,14 +3891,15 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn, struct ovn_northd_lb *lb,
>              }
>              ds_destroy(&key);
>
> -            backend_nb->op = op;
> -            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> -
>              if (!lb_vip_nb->lb_health_check || !op || !svc_mon_src_ip ||
>                  !lsp_is_enabled(op->nbsp)) {
> +                free(svc_mon_src_ip);
>                  continue;
>              }
>
> +            backend_nb->op = op;
> +            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> +
>              const char *protocol = lb->nlb->protocol;
>              if (!protocol || !protocol[0]) {
>                  protocol = "tcp";
> @@ -4155,7 +4227,7 @@ build_lrouter_lb_reachable_ips(struct ovn_datapath *od,
>              ovs_be32 vip_ip4 = in6_addr_get_mapped_ipv4(&lb->vips[i].vip);
>              struct ovn_port *op;
>
> -            LIST_FOR_EACH (op, dp_node, &od->port_list) {
> +            HMAP_FOR_EACH (op, dp_node, &od->ports) {
>                  if (lrouter_port_ipv4_reachable(op, vip_ip4)) {
>                      sset_add(&od->lb_ips->ips_v4_reachable,
>                               lb->vips[i].vip_str);
> @@ -4165,7 +4237,7 @@ build_lrouter_lb_reachable_ips(struct ovn_datapath *od,
>          } else {
>              struct ovn_port *op;
>
> -            LIST_FOR_EACH (op, dp_node, &od->port_list) {
> +            HMAP_FOR_EACH (op, dp_node, &od->ports) {
>                  if (lrouter_port_ipv6_reachable(op, &lb->vips[i].vip)) {
>                      sset_add(&od->lb_ips->ips_v6_reachable,
>                               lb->vips[i].vip_str);
> @@ -4538,7 +4610,10 @@ ovn_port_add_tnlid(struct ovn_port *op, uint32_t tunnel_key)
>      return added;
>  }
>
> -static void
> +/* Returns false if the requested key is confict with another allocated key, so
> + * that the I-P engine can fallback to recompute if needed; otherwise return
> + * true (even if the key is not allocated). */
> +static bool
>  ovn_port_assign_requested_tnl_id(
>      const struct sbrec_chassis_table *sbrec_chassis_table, struct ovn_port *op)
>  {
> @@ -4553,7 +4628,7 @@ ovn_port_assign_requested_tnl_id(
>              VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for port %s "
>                           "is incompatible with VXLAN",
>                           tunnel_key, op_get_name(op));
> -            return;
> +            return true;
>          }
>          if (!ovn_port_add_tnlid(op, tunnel_key)) {
>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> @@ -4561,11 +4636,13 @@ ovn_port_assign_requested_tnl_id(
>                           "%"PRIu32" as another LSP or LRP",
>                           op->nbsp ? "switch" : "router",
>                           op_get_name(op), tunnel_key);
> +            return false;
>          }
>      }
> +    return true;
>  }
>
> -static void
> +static bool
>  ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
>                        struct hmap *ports,
>                        struct ovn_port *op)
> @@ -4581,8 +4658,10 @@ ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
>              }
>              ovs_list_remove(&op->list);
>              ovn_port_destroy(ports, op);
> +            return false;
>          }
>      }
> +    return true;
>  }
>
>  /* Updates the southbound Port_Binding table so that it contains the logical
> @@ -4605,6 +4684,8 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
>      struct hmap *ls_ports, struct hmap *lr_ports)
>  {
>      struct ovs_list sb_only, nb_only, both;
> +    /* XXX: Add tag_alloc_table and queue_id_bitmap as part of northd_data
> +     * to improve I-P. */
>      struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
>      unsigned long *queue_id_bitmap = bitmap_allocate(QDISC_MAX_QUEUE_ID + 1);
>      bitmap_set1(queue_id_bitmap, 0);
> @@ -4711,6 +4792,329 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
>      sset_destroy(&active_ha_chassis_grps);
>  }
>
> +void
> +destroy_northd_data_tracked_changes(struct northd_data *nd)
> +{
> +    struct ls_change *ls_change;
> +    LIST_FOR_EACH_SAFE (ls_change, list_node,
> +                        &nd->tracked_ls_changes.updated) {
> +        struct ovn_port *op;
> +        LIST_FOR_EACH (op, list, &ls_change->added_ports) {
> +            ovs_list_remove(&op->list);
> +        }
> +        LIST_FOR_EACH (op, list, &ls_change->updated_ports) {
> +            ovs_list_remove(&op->list);
> +        }
> +        LIST_FOR_EACH_SAFE (op, list, &ls_change->deleted_ports) {
> +            ovs_list_remove(&op->list);
> +            ovn_port_destroy_orphan(op);
> +        }
> +        ovs_list_remove(&ls_change->list_node);
> +        free(ls_change);
> +    }
> +
> +    nd->change_tracked = false;
> +}
> +
> +/* Check if a changed LSP can be handled incrementally within the I-P engine
> + * node en_northd.
> + */
> +static bool
> +lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *nbsp)
> +{
> +    /* Currently only support normal VIF. */
nit: s/Currently only support normal VIF/ Support only normal VIF for now


> +    if (nbsp->type[0]) {
> +        return false;
> +    }
> +
> +    /* Currently not support tag allocation. */
nit: s/Currently not support tag allocation/ Do not support tag
allocation for now

Same small nits for below comments.

I'm fine if you want to ignore these.


> +    if ((nbsp->parent_name && nbsp->parent_name[0]) || nbsp->tag ||
> +        nbsp->tag_request) {
> +        return false;
> +    }
> +
> +    /* Currently not support port with qos settings (need special handling for
> +     * qdisc_queue_id sync. */
> +    if (port_has_qos_params(&nbsp->options)) {
> +        return false;
> +    }
> +
> +    for (size_t j = 0; j < nbsp->n_addresses; j++) {
> +        /* Currently not support dynamic address handling. */
> +        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> +            return false;
> +        }
> +        /* Currently not support "unknown" address handling.  XXX: Need to
> +         * handle od->has_unknown change and track it when the first LSP with
> +         * 'unknown' is added or when the last one is removed. */
> +        if (!strcmp(nbsp->addresses[j], "unknown")) {
> +            return false;
> +        }
> +    }
> +
> +    return true;
> +}
> +
> +static bool
> +ls_port_has_changed(const struct nbrec_logical_switch_port *old,
> +                    const struct nbrec_logical_switch_port *new)
> +{
> +    if (old != new) {
> +        return true;
> +    }
> +    /* XXX: Need a better OVSDB IDL interface for this check. */
> +    return (nbrec_logical_switch_port_row_get_seqno(new,
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0);
> +}
> +
> +static struct ovn_port *
> +ovn_port_find_in_datapath(struct ovn_datapath *od, const char *name)
> +{
> +    struct ovn_port *op;
> +    HMAP_FOR_EACH_WITH_HASH (op, dp_node, hash_string(name, 0), &od->ports) {
> +        if (!strcmp(op->key, name)) {
> +            return op;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +static struct ovn_port *
> +ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports,
> +               const char *key, const struct nbrec_logical_switch_port *nbsp,
> +               struct ovn_datapath *od, const struct sbrec_port_binding *sb,
> +               const struct sbrec_mirror_table *sbrec_mirror_table,
> +               const struct sbrec_chassis_table *sbrec_chassis_table,
> +               struct ovsdb_idl_index *sbrec_chassis_by_name,
> +               struct ovsdb_idl_index *sbrec_chassis_by_hostname)
> +{
> +    struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
> +                                          NULL);
> +    parse_lsp_addrs(op);
> +    op->od = od;
> +    hmap_insert(&od->ports, &op->dp_node, hmap_node_hash(&op->key_node));
> +
> +    /* Assign explicitly requested tunnel ids first. */
> +    if (!ovn_port_assign_requested_tnl_id(sbrec_chassis_table, op)) {
> +        return NULL;
> +    }
> +    if (sb) {
> +        op->sb = sb;
> +        /* Keep nonconflicting tunnel IDs that are already assigned. */
> +        if (!op->tunnel_key) {
> +            ovn_port_add_tnlid(op, op->sb->tunnel_key);
> +        }
> +    } else {
> +        /* XXX: the new SB port_binding will change in IDL, so need to handle
> +         * SB port_binding updates incrementally to achieve end-to-end
> +         * incremental processing. */
> +        op->sb = sbrec_port_binding_insert(ovnsb_txn);
> +        sbrec_port_binding_set_logical_port(op->sb, op->key);
> +    }
> +    /* Assign new tunnel ids where needed. */
> +    if (!ovn_port_allocate_key(sbrec_chassis_table, ls_ports, op)) {
> +        return NULL;
> +    }
> +    ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
> +                          sbrec_chassis_by_hostname, NULL, sbrec_mirror_table,
> +                          op, NULL, NULL);
> +    return op;
> +}
> +
> +static bool
> +check_ls_changes_other_than_lsp(const struct nbrec_logical_switch *ls)
> +{
> +    /* Check if the columns are changed in this row. */
> +    enum nbrec_logical_switch_column_id col;
> +    for (col = 0; col < NBREC_LOGICAL_SWITCH_N_COLUMNS; col++) {
> +        if (nbrec_logical_switch_is_updated(ls, col) &&
> +            col != NBREC_LOGICAL_SWITCH_COL_PORTS) {
> +            return true;
> +        }
> +    }
> +
> +    /* Check if the referenced rows are changed.
> +       XXX: Need a better OVSDB IDL interface for this check. */
> +    for (size_t i = 0; i < ls->n_acls; i++) {
> +        if (nbrec_acl_row_get_seqno(ls->acls[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    if (ls->copp && nbrec_copp_row_get_seqno(ls->copp,
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +        return true;
> +    }
> +    for (size_t i = 0; i < ls->n_dns_records; i++) {
> +        if (nbrec_dns_row_get_seqno(ls->dns_records[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    for (size_t i = 0; i < ls->n_forwarding_groups; i++) {
> +        if (nbrec_forwarding_group_row_get_seqno(ls->forwarding_groups[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    for (size_t i = 0; i < ls->n_load_balancer; i++) {
> +        if (nbrec_load_balancer_row_get_seqno(ls->load_balancer[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    for (size_t i = 0; i < ls->n_load_balancer_group; i++) {
> +        if (nbrec_load_balancer_group_row_get_seqno(ls->load_balancer_group[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    for (size_t i = 0; i < ls->n_qos_rules; i++) {
> +        if (nbrec_qos_row_get_seqno(ls->qos_rules[i],
> +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
> +/* Return true if changes are handled incrementally, false otherwise.
> + * When there are any changes, try to track what's exactly changed and set
> + * northd_data->change_tracked accordingly: change tracked - true, otherwise,
> + * false. */
> +bool
> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> +                         const struct northd_input *ni,
> +                         struct northd_data *nd)
> +{
> +    const struct nbrec_logical_switch *changed_ls;
> +    struct ls_change *ls_change = NULL;
> +
> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> +                                             ni->nbrec_logical_switch_table) {
> +        ls_change = NULL;
> +        if (nbrec_logical_switch_is_new(changed_ls) ||
> +            nbrec_logical_switch_is_deleted(changed_ls)) {
> +            goto fail;
> +        }
> +        struct ovn_datapath *od = ovn_datapath_find(
> +                                    &nd->ls_datapaths.datapaths,
> +                                    &changed_ls->header_.uuid);
> +        if (!od) {
> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS doesn't "
> +                         "exist in ls_datapaths: "UUID_FMT,
> +                         UUID_ARGS(&changed_ls->header_.uuid));
> +            goto fail;
> +        }
> +
> +        /* Now only able to handle lsp changes. */
> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> +            goto fail;
> +        }
> +
> +        ls_change = xzalloc(sizeof *ls_change);
> +        ls_change->od = od;
> +        ovs_list_init(&ls_change->added_ports);
> +        ovs_list_init(&ls_change->deleted_ports);
> +        ovs_list_init(&ls_change->updated_ports);
> +
> +        struct ovn_port *op;
> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> +            op->visited = false;
> +        }
> +
> +        /* Compare the individual ports in the old and new Logical Switches */
> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> +            struct nbrec_logical_switch_port *new_nbsp = changed_ls->ports[j];
> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> +
> +            if (!op) {
> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> +                    goto fail;
> +                }
> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> +                                    new_nbsp->name, new_nbsp, od, NULL,
> +                                    ni->sbrec_mirror_table,
> +                                    ni->sbrec_chassis_table,
> +                                    ni->sbrec_chassis_by_name,
> +                                    ni->sbrec_chassis_by_hostname);
> +                if (!op) {
> +                    goto fail;
> +                }
> +                ovs_list_push_back(&ls_change->added_ports,
> +                                   &op->list);
> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> +                /* Existing port updated */
> +                bool temp = false;
> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
> +                    !op->lsp_can_be_inc_processed ||
> +                    !lsp_can_be_inc_processed(new_nbsp)) {
> +                    goto fail;
> +                }
> +                const struct sbrec_port_binding *sb = op->sb;
> +                if (sset_contains(&nd->svc_monitor_lsps, new_nbsp->name)) {
> +                    /* This port is used for svc monitor, which may be impacted
> +                     * by this change. Fallback to recompute. */
> +                    goto fail;
> +                }
> +                ovn_port_destroy(&nd->ls_ports, op);
> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> +                                    new_nbsp->name, new_nbsp, od, sb,
> +                                    ni->sbrec_mirror_table,
> +                                    ni->sbrec_chassis_table,
> +                                    ni->sbrec_chassis_by_name,
> +                                    ni->sbrec_chassis_by_hostname);
> +                if (!op) {
> +                    goto fail;
> +                }
> +                ovs_list_push_back(&ls_change->updated_ports, &op->list);
> +            }
> +            op->visited = true;
> +        }
> +
> +        /* Check for deleted ports */
> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> +            if (!op->visited) {
> +                if (!op->lsp_can_be_inc_processed) {
> +                    goto fail;
> +                }
> +                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> +                    /* This port was used for svc monitor, which may be
> +                     * impacted by this deletion. Fallback to recompute. */
> +                    goto fail;
> +                }
> +                ovs_list_push_back(&ls_change->deleted_ports,
> +                                   &op->list);
> +                hmap_remove(&nd->ls_ports, &op->key_node);
> +                hmap_remove(&od->ports, &op->dp_node);
> +                sbrec_port_binding_delete(op->sb);
> +                delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port, od->tunnel_key,
> +                                 op->tunnel_key);
> +            }
> +        }
> +
> +        if (!ovs_list_is_empty(&ls_change->added_ports) ||
> +            !ovs_list_is_empty(&ls_change->updated_ports) ||
> +            !ovs_list_is_empty(&ls_change->deleted_ports)) {
> +            ovs_list_push_back(&nd->tracked_ls_changes.updated,
> +                               &ls_change->list_node);
> +        } else {
> +            free(ls_change);
> +        }
> +    }
> +
> +    if (!ovs_list_is_empty(&nd->tracked_ls_changes.updated)) {
> +        nd->change_tracked = true;
> +    }
> +    return true;
> +
> +fail:
> +    free(ls_change);
> +    destroy_northd_data_tracked_changes(nd);
> +    return false;
> +}
> +
>  struct multicast_group {
>      const char *name;
>      uint16_t key;               /* OVN_MIN_MULTICAST...OVN_MAX_MULTICAST. */
> @@ -6072,7 +6476,7 @@ build_interconn_mcast_snoop_flows(struct ovn_datapath *od,
>
>      struct ovn_port *op;
>
> -    LIST_FOR_EACH (op, dp_node, &od->port_list) {
> +    HMAP_FOR_EACH (op, dp_node, &od->ports) {
>          if (!lsp_is_remote(op->nbsp)) {
>              continue;
>          }
> @@ -12394,7 +12798,7 @@ build_mcast_lookup_flows_for_lrouter(
>           * own mac addresses.
>           */
>          struct ovn_port *op;
> -        LIST_FOR_EACH (op, dp_node, &od->port_list) {
> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
>              ds_clear(match);
>              ds_put_format(match, "eth.src == %s && igmp",
>                            op->lrp_networks.ea_s);
> @@ -16476,9 +16880,6 @@ destroy_datapaths_and_ports(struct ovn_datapaths *ls_datapaths,
>          }
>      }
>
> -    ovn_datapaths_destroy(ls_datapaths);
> -    ovn_datapaths_destroy(lr_datapaths);
> -
>      struct ovn_port *port;
>      HMAP_FOR_EACH_SAFE (port, key_node, ls_ports) {
>          ovn_port_destroy(ls_ports, port);
> @@ -16489,6 +16890,9 @@ destroy_datapaths_and_ports(struct ovn_datapaths *ls_datapaths,
>          ovn_port_destroy(lr_ports, port);
>      }
>      hmap_destroy(lr_ports);
> +
> +    ovn_datapaths_destroy(ls_datapaths);
> +    ovn_datapaths_destroy(lr_datapaths);
>  }
>
>  void
> @@ -16510,6 +16914,8 @@ northd_init(struct northd_data *data)
>      };
>      data->ovn_internal_version_changed = false;
>      sset_init(&data->svc_monitor_lsps);
> +    data->change_tracked = false;
> +    ovs_list_init(&data->tracked_ls_changes.updated);
>  }
>
>  void
> @@ -16976,6 +17382,7 @@ void northd_run(struct northd_input *input_data,
>                  struct ovsdb_idl_txn *ovnnb_txn,
>                  struct ovsdb_idl_txn *ovnsb_txn)
>  {
> +    COVERAGE_INC(northd_run);
>      stopwatch_start(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
>      ovnnb_db_run(input_data, data, ovnnb_txn, ovnsb_txn);
>      stopwatch_stop(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
> diff --git a/northd/northd.h b/northd/northd.h
> index b26828bb7111..1fd9bed477e7 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -65,6 +65,7 @@ struct northd_input {
>      struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
>      struct ovsdb_idl_index *sbrec_ip_mcast_by_dp;
>      struct ovsdb_idl_index *sbrec_static_mac_binding_by_lport_ip;
> +    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port;
>  };
>
>  struct chassis_features {
> @@ -83,6 +84,22 @@ struct ovn_datapaths {
>      struct ovn_datapath **array;
>  };
>
> +/* Track what's changed for a single LS.
> + * Now only track port changes. */
> +struct ls_change {
> +    struct ovs_list list_node;
> +    struct ovn_datapath *od;
> +    struct ovs_list added_ports;
> +    struct ovs_list deleted_ports;
> +    struct ovs_list updated_ports;
> +};
> +
> +/* Track what's changed for logical switches.
> + * Now only track updated ones (added or deleted may be supported in the
> + * future). */
> +struct tracked_ls_changes {
> +    struct ovs_list updated; /* Contains struct ls_change */
> +};
>
>  struct northd_data {
>      /* Global state for 'en-northd'. */
> @@ -98,6 +115,8 @@ struct northd_data {
>      bool ovn_internal_version_changed;
>      struct chassis_features features;
>      struct sset svc_monitor_lsps;
> +    bool change_tracked;
> +    struct tracked_ls_changes tracked_ls_changes;
>  };
>
>  struct lflow_data {
> @@ -292,13 +311,19 @@ struct ovn_datapath {
>      /* Port groups related to the datapath, used only when nbs is NOT NULL. */
>      struct hmap nb_pgs;
>
> -    struct ovs_list port_list;
> +    /* Map of ovn_port objects belonging to this datapath.
> +     * This map doesn't include derived ports. */
> +    struct hmap ports;
>  };
>
>  void northd_run(struct northd_input *input_data,
>                  struct northd_data *data,
>                  struct ovsdb_idl_txn *ovnnb_txn,
>                  struct ovsdb_idl_txn *ovnsb_txn);
> +bool northd_handle_ls_changes(struct ovsdb_idl_txn *,
> +                              const struct northd_input *,
> +                              struct northd_data *);
> +void destroy_northd_data_tracked_changes(struct northd_data *);
>  void northd_destroy(struct northd_data *data);
>  void northd_init(struct northd_data *data);
>  void northd_indices_create(struct northd_data *data,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 6736429ae201..74ae84530112 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -9481,3 +9481,61 @@ AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep priority=110 | gre
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([LSP incremental processing])
> +ovn_start
> +
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.11
> +ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 external_ids:iface-id=lsp0-0
> +ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1 external_ids:iface-id=lsp0-1
> +ovs-vsctl add-port br-int lsp0-2 -- set interface lsp0-2 external_ids:iface-id=lsp0-2
> +
> +check ovn-nbctl --wait=hv ls-add ls0
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 "unknown"
> +OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute` = 5])
> +OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute` = 5])
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1 "aa:aa:aa:00:00:01 192.168.0.11"
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [3
> +])
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [6
> +])
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-2 -- lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12"
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [3
> +])
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [6
> +])
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=hv lsp-del lsp0-1
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [1
> +])
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [2
> +])
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=hv lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:88 192.168.0.88"
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [1
> +])
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [2
> +])
> +
> +# No change, no recompute
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl --wait=sb sync
> +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [0
> +])
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +])
> --
> 2.30.2
>
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou June 8, 2023, 5:55 a.m. UTC | #3
On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
>
> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
> >
> > This patch introduces a change handler for NB logical_switch within the
> > 'northd' node. It specifically handles cases where logical switch ports
> > in the tracked logical switches are changed (added/updated/deleted).
> > Only regular logical switch ports - which are common for VMs/Pods - are
> > addressed. For other scenarios, it reverts to recompute.
> >
> > This update avoids recompute in the northd node (especially the
> > resource-intensive build_ports()) for NB changes of VIF
> > add/update/delete.  However, it does not eliminate the need for lflow
> > recompute.
> >
> > Below are the performance test results simulating an ovn-k8s topology of
> > 500 nodes x 50 lsp per node:
> >
> > Before:
> > ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >     ovn-northd completion:          955ms
> > ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
-- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >     ovn-northd completion:          919ms
> >
> > After:
> > ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >     ovn-northd completion:          776ms
> > ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
-- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >     ovn-northd completion:          773ms
> >
> > Both addition and deletion show ~20% reduction of ovn-northd completion
> > time.  Note: the test uses only 1 thread of ovn-northd for flow
> > recompute. Using multithread should show a larger percentage of
> > improvement.
> >
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > Reviewed-by: Ales Musil <amusil@redhat.com>
>
> Few nits below.
>
> Acked-by: Numan Siddique <numans@ovn.org>
>
> Numan
>
>
> > ---
> >  lib/ovn-util.c           |  15 ++
> >  lib/ovn-util.h           |   1 +
> >  northd/en-northd.c       | 130 ++++++---
> >  northd/en-northd.h       |   2 +
> >  northd/inc-proc-northd.c |  12 +-
> >  northd/northd.c          | 567 +++++++++++++++++++++++++++++++++------
> >  northd/northd.h          |  27 +-
> >  tests/ovn-northd.at      |  58 ++++
> >  8 files changed, 685 insertions(+), 127 deletions(-)
> >
> > diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> > index bffb521cfd9d..da2a9d45ac64 100644
> > --- a/lib/ovn-util.c
> > +++ b/lib/ovn-util.c
> > @@ -720,6 +720,21 @@ ovn_allocate_tnlid(struct hmap *set, const char
*name, uint32_t min,
> >      return 0;
> >  }
> >
> > +bool
> > +ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid)
> > +{
> > +    uint32_t hash = hash_int(tnlid, 0);
> > +    struct tnlid_node *node;
> > +    HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash, tnlids) {
> > +        if (node->tnlid == tnlid) {
> > +            hmap_remove(tnlids, &node->hmap_node);
> > +            free(node);
> > +            return true;
> > +        }
> > +    }
> > +    return false;
> > +}
> > +
> >  char *
> >  ovn_chassis_redirect_name(const char *port_name)
> >  {
> > diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> > index b17b0e2364c5..9681a12219a9 100644
> > --- a/lib/ovn-util.h
> > +++ b/lib/ovn-util.h
> > @@ -167,6 +167,7 @@ bool ovn_add_tnlid(struct hmap *set, uint32_t
tnlid);
> >  bool ovn_tnlid_present(struct hmap *tnlids, uint32_t tnlid);
> >  uint32_t ovn_allocate_tnlid(struct hmap *set, const char *name,
uint32_t min,
> >                              uint32_t max, uint32_t *hint);
> > +bool ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid);
> >
> >  static inline void
> >  get_unique_lport_key(uint64_t dp_tunnel_key, uint64_t lport_tunnel_key,
> > diff --git a/northd/en-northd.c b/northd/en-northd.c
> > index a3dc37e198e3..f2bf98f774b1 100644
> > --- a/northd/en-northd.c
> > +++ b/northd/en-northd.c
> > @@ -31,92 +31,103 @@
> >
> >  VLOG_DEFINE_THIS_MODULE(en_northd);
> >
> > -void en_northd_run(struct engine_node *node, void *data)
> > +static void
> > +northd_get_input_data(struct engine_node *node,
> > +                      struct northd_input *input_data)
> >  {
> > -    const struct engine_context *eng_ctx = engine_get_context();
> > -
> > -    struct northd_input input_data;
> > -
> > -    northd_destroy(data);
> > -    northd_init(data);
> > -
> > -    input_data.sbrec_chassis_by_name =
> > +    input_data->sbrec_chassis_by_name =
> >          engine_ovsdb_node_get_index(
> >              engine_get_input("SB_chassis", node),
> >              "sbrec_chassis_by_name");
> > -    input_data.sbrec_chassis_by_hostname =
> > +    input_data->sbrec_chassis_by_hostname =
> >          engine_ovsdb_node_get_index(
> >              engine_get_input("SB_chassis", node),
> >              "sbrec_chassis_by_hostname");
> > -    input_data.sbrec_ha_chassis_grp_by_name =
> > +    input_data->sbrec_ha_chassis_grp_by_name =
> >          engine_ovsdb_node_get_index(
> >              engine_get_input("SB_ha_chassis_group", node),
> >              "sbrec_ha_chassis_grp_by_name");
> > -    input_data.sbrec_ip_mcast_by_dp =
> > +    input_data->sbrec_ip_mcast_by_dp =
> >          engine_ovsdb_node_get_index(
> >              engine_get_input("SB_ip_multicast", node),
> >              "sbrec_ip_mcast_by_dp");
> > -    input_data.sbrec_static_mac_binding_by_lport_ip =
> > +    input_data->sbrec_static_mac_binding_by_lport_ip =
> >          engine_ovsdb_node_get_index(
> >              engine_get_input("SB_static_mac_binding", node),
> >              "sbrec_static_mac_binding_by_lport_ip");
> > +    input_data->sbrec_fdb_by_dp_and_port =
> > +        engine_ovsdb_node_get_index(
> > +            engine_get_input("SB_fdb", node),
> > +            "sbrec_fdb_by_dp_and_port");
> >
> > -    input_data.nbrec_nb_global_table =
> > +    input_data->nbrec_nb_global_table =
> >          EN_OVSDB_GET(engine_get_input("NB_nb_global", node));
> > -    input_data.nbrec_logical_switch_table =
> > +    input_data->nbrec_logical_switch_table =
> >          EN_OVSDB_GET(engine_get_input("NB_logical_switch", node));
> > -    input_data.nbrec_logical_router_table =
> > +    input_data->nbrec_logical_router_table =
> >          EN_OVSDB_GET(engine_get_input("NB_logical_router", node));
> > -    input_data.nbrec_load_balancer_table =
> > +    input_data->nbrec_load_balancer_table =
> >          EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
> > -    input_data.nbrec_load_balancer_group_table =
> > +    input_data->nbrec_load_balancer_group_table =
> >          EN_OVSDB_GET(engine_get_input("NB_load_balancer_group", node));
> > -    input_data.nbrec_port_group_table =
> > +    input_data->nbrec_port_group_table =
> >          EN_OVSDB_GET(engine_get_input("NB_port_group", node));
> > -    input_data.nbrec_meter_table =
> > +    input_data->nbrec_meter_table =
> >          EN_OVSDB_GET(engine_get_input("NB_meter", node));
> > -    input_data.nbrec_acl_table =
> > +    input_data->nbrec_acl_table =
> >          EN_OVSDB_GET(engine_get_input("NB_acl", node));
> > -    input_data.nbrec_static_mac_binding_table =
> > +    input_data->nbrec_static_mac_binding_table =
> >          EN_OVSDB_GET(engine_get_input("NB_static_mac_binding", node));
> > -    input_data.nbrec_chassis_template_var_table =
> > +    input_data->nbrec_chassis_template_var_table =
> >          EN_OVSDB_GET(engine_get_input("NB_chassis_template_var",
node));
> > -    input_data.nbrec_mirror_table =
> > +    input_data->nbrec_mirror_table =
> >          EN_OVSDB_GET(engine_get_input("NB_mirror", node));
> >
> > -    input_data.sbrec_sb_global_table =
> > +    input_data->sbrec_sb_global_table =
> >          EN_OVSDB_GET(engine_get_input("SB_sb_global", node));
> > -    input_data.sbrec_datapath_binding_table =
> > +    input_data->sbrec_datapath_binding_table =
> >          EN_OVSDB_GET(engine_get_input("SB_datapath_binding", node));
> > -    input_data.sbrec_port_binding_table =
> > +    input_data->sbrec_port_binding_table =
> >          EN_OVSDB_GET(engine_get_input("SB_port_binding", node));
> > -    input_data.sbrec_mac_binding_table =
> > +    input_data->sbrec_mac_binding_table =
> >          EN_OVSDB_GET(engine_get_input("SB_mac_binding", node));
> > -    input_data.sbrec_ha_chassis_group_table =
> > +    input_data->sbrec_ha_chassis_group_table =
> >          EN_OVSDB_GET(engine_get_input("SB_ha_chassis_group", node));
> > -    input_data.sbrec_chassis_table =
> > +    input_data->sbrec_chassis_table =
> >          EN_OVSDB_GET(engine_get_input("SB_chassis", node));
> > -    input_data.sbrec_fdb_table =
> > +    input_data->sbrec_fdb_table =
> >          EN_OVSDB_GET(engine_get_input("SB_fdb", node));
> > -    input_data.sbrec_load_balancer_table =
> > +    input_data->sbrec_load_balancer_table =
> >          EN_OVSDB_GET(engine_get_input("SB_load_balancer", node));
> > -    input_data.sbrec_service_monitor_table =
> > +    input_data->sbrec_service_monitor_table =
> >          EN_OVSDB_GET(engine_get_input("SB_service_monitor", node));
> > -    input_data.sbrec_port_group_table =
> > +    input_data->sbrec_port_group_table =
> >          EN_OVSDB_GET(engine_get_input("SB_port_group", node));
> > -    input_data.sbrec_meter_table =
> > +    input_data->sbrec_meter_table =
> >          EN_OVSDB_GET(engine_get_input("SB_meter", node));
> > -    input_data.sbrec_dns_table =
> > +    input_data->sbrec_dns_table =
> >          EN_OVSDB_GET(engine_get_input("SB_dns", node));
> > -    input_data.sbrec_ip_multicast_table =
> > +    input_data->sbrec_ip_multicast_table =
> >          EN_OVSDB_GET(engine_get_input("SB_ip_multicast", node));
> > -    input_data.sbrec_static_mac_binding_table =
> > +    input_data->sbrec_static_mac_binding_table =
> >          EN_OVSDB_GET(engine_get_input("SB_static_mac_binding", node));
> > -    input_data.sbrec_chassis_template_var_table =
> > +    input_data->sbrec_chassis_template_var_table =
> >          EN_OVSDB_GET(engine_get_input("SB_chassis_template_var",
node));
> > -    input_data.sbrec_mirror_table =
> > +    input_data->sbrec_mirror_table =
> >          EN_OVSDB_GET(engine_get_input("SB_mirror", node));
> > +}
> >
> > +void
> > +en_northd_run(struct engine_node *node, void *data)
> > +{
> > +    const struct engine_context *eng_ctx = engine_get_context();
> > +
> > +    struct northd_input input_data;
> > +
> > +    northd_destroy(data);
> > +    northd_init(data);
> > +
> > +    northd_get_input_data(node, &input_data);
> >      northd_run(&input_data, data,
> >                 eng_ctx->ovnnb_idl_txn,
> >                 eng_ctx->ovnsb_idl_txn);
> > @@ -148,17 +159,48 @@ northd_nb_nb_global_handler(struct engine_node
*node,
> >      return true;
> >  }
> >
> > -void *en_northd_init(struct engine_node *node OVS_UNUSED,
> > -                     struct engine_arg *arg OVS_UNUSED)
> > +bool
> > +northd_nb_logical_switch_handler(struct engine_node *node,
> > +                                 void *data)
> >  {
> > -    struct northd_data *data = xmalloc(sizeof *data);
> > +    const struct engine_context *eng_ctx = engine_get_context();
> > +    struct northd_data *nd = data;
> > +
> > +    struct northd_input input_data;
> > +
> > +    northd_get_input_data(node, &input_data);
> > +
> > +    if (!northd_handle_ls_changes(eng_ctx->ovnsb_idl_txn, &input_data,
nd)) {
> > +        return false;
> > +    }
> > +
> > +    if (nd->change_tracked) {
> > +        engine_set_node_state(node, EN_UPDATED);
> > +    }
> > +
> > +    return true;
> > +}
> > +
> > +void
> > +*en_northd_init(struct engine_node *node OVS_UNUSED,
> > +                struct engine_arg *arg OVS_UNUSED)
> > +{
> > +    struct northd_data *data = xzalloc(sizeof *data);
> >
> >      northd_init(data);
> >
> >      return data;
> >  }
> >
> > -void en_northd_cleanup(void *data)
> > +void
> > +en_northd_cleanup(void *data)
> >  {
> >      northd_destroy(data);
> >  }
> > +
> > +void
> > +en_northd_clear_tracked_data(void *data_)
> > +{
> > +    struct northd_data *data = data_;
> > +    destroy_northd_data_tracked_changes(data);
> > +}
> > diff --git a/northd/en-northd.h b/northd/en-northd.h
> > index 8d8343b459a6..a53a162bda48 100644
> > --- a/northd/en-northd.h
> > +++ b/northd/en-northd.h
> > @@ -13,6 +13,8 @@ void en_northd_run(struct engine_node *node
OVS_UNUSED, void *data OVS_UNUSED);
> >  void *en_northd_init(struct engine_node *node OVS_UNUSED,
> >                       struct engine_arg *arg);
> >  void en_northd_cleanup(void *data);
> > +void en_northd_clear_tracked_data(void *data);
> >  bool northd_nb_nb_global_handler(struct engine_node *, void *data
OVS_UNUSED);
> > +bool northd_nb_logical_switch_handler(struct engine_node *, void
*data);
> >
> >  #endif /* EN_NORTHD_H */
> > diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> > index 863c9323c444..f992a9ec8420 100644
> > --- a/northd/inc-proc-northd.c
> > +++ b/northd/inc-proc-northd.c
> > @@ -128,7 +128,7 @@ enum sb_engine_node {
> >
> >  /* Define engine nodes for other nodes. They should be defined as
static to
> >   * avoid sparse errors. */
> > -static ENGINE_NODE(northd, "northd");
> > +static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(northd, "northd");
> >  static ENGINE_NODE(lflow, "lflow");
> >  static ENGINE_NODE(mac_binding_aging, "mac_binding_aging");
> >  static ENGINE_NODE(mac_binding_aging_waker, "mac_binding_aging_waker");
> > @@ -143,7 +143,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
> >       * on the second argument */
> >      engine_add_input(&en_northd, &en_nb_nb_global,
> >                       northd_nb_nb_global_handler);
> > -    engine_add_input(&en_northd, &en_nb_logical_switch, NULL);
> > +    engine_add_input(&en_northd, &en_nb_logical_switch,
> > +                     northd_nb_logical_switch_handler);
> >      engine_add_input(&en_northd, &en_nb_port_group, NULL);
> >      engine_add_input(&en_northd, &en_nb_load_balancer, NULL);
> >      engine_add_input(&en_northd, &en_nb_load_balancer_group, NULL);
> > @@ -252,6 +253,13 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
*nb,
> >                                  "sbrec_address_set_by_name",
> >                                  sbrec_address_set_by_name);
> >
> > +    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port
> > +        = ovsdb_idl_index_create2(sb->idl, &sbrec_fdb_col_dp_key,
> > +                                  &sbrec_fdb_col_port_key);
> > +    engine_ovsdb_node_add_index(&en_sb_fdb,
> > +                                "sbrec_fdb_by_dp_and_port",
> > +                                sbrec_fdb_by_dp_and_port);
> > +
> >      struct northd_data *northd_data =
> >          engine_get_internal_data(&en_northd);
> >      unixctl_command_register("debug/chassis-features-list", "", 0, 0,
> > diff --git a/northd/northd.c b/northd/northd.c
> > index d79f075681d9..1a4e24978642 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "debug.h"
> >  #include "bitmap.h"
> > +#include "coverage.h"
> >  #include "dirs.h"
> >  #include "ipam.h"
> >  #include "openvswitch/dynamic-string.h"
> > @@ -59,6 +60,8 @@
> >
> >  VLOG_DEFINE_THIS_MODULE(northd);
> >
> > +COVERAGE_DEFINE(northd_run);
> > +
> >  static bool controller_event_en;
> >  static bool lflow_hash_lock_initialized = false;
> >
> > @@ -807,13 +810,20 @@ ovn_datapath_create(struct hmap *datapaths, const
struct uuid *key,
> >      od->port_key_hint = 0;
> >      hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
> >      od->lr_group = NULL;
> > -    ovs_list_init(&od->port_list);
> > +    hmap_init(&od->ports);
> >      return od;
> >  }
> >
> >  static void ovn_ls_port_group_destroy(struct hmap *nb_pgs);
> >  static void destroy_mcast_info_for_datapath(struct ovn_datapath *od);
> >
> > +static void
> > +destroy_ports_for_datapath(struct ovn_datapath *od)
> > +{
> > +    ovs_assert(hmap_is_empty(&od->ports));
> > +    hmap_destroy(&od->ports);
> > +}
> > +
> >  static void
> >  ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
> >  {
> > @@ -834,6 +844,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct
ovn_datapath *od)
> >          free(od->l3dgw_ports);
> >          ovn_ls_port_group_destroy(&od->nb_pgs);
> >          destroy_mcast_info_for_datapath(od);
> > +        destroy_ports_for_datapath(od);
> >
> >          free(od);
> >      }
> > @@ -1480,6 +1491,9 @@ struct ovn_port {
> >      struct lport_addresses *ps_addrs;   /* Port security addresses. */
> >      unsigned int n_ps_addrs;
> >
> > +    bool lsp_can_be_inc_processed; /* If it can be incrementally
processed when
> > +                                      the port changes. */
> > +
> >      /* Logical router port data. */
> >      const struct nbrec_logical_router_port *nbrp; /* May be NULL. */
> >
> > @@ -1521,11 +1535,16 @@ struct ovn_port {
> >
> >      struct ovs_list list;       /* In list of similar records. */
> >
> > -    struct ovs_list dp_node;
> > +    struct hmap_node dp_node;   /* Node in od->ports. */
> >
> >      struct lport_addresses proxy_arp_addrs;
> > +
> > +    /* Temporarily used for traversing a list (or hmap) of ports. */
> > +    bool visited;
> >  };
> >
> > +static bool lsp_can_be_inc_processed(const struct
nbrec_logical_switch_port *);
> > +
> >  static bool
> >  is_l3dgw_port(const struct ovn_port *op)
> >  {
> > @@ -1585,6 +1604,9 @@ ovn_port_set_nb(struct ovn_port *op,
> >                  const struct nbrec_logical_router_port *nbrp)
> >  {
> >      op->nbsp = nbsp;
> > +    if (nbsp) {
> > +        op->lsp_can_be_inc_processed = lsp_can_be_inc_processed(nbsp);
> > +    }
> >      op->nbrp = nbrp;
> >      init_mcast_port_info(&op->mcast_info, op->nbsp, op->nbrp);
> >  }
> > @@ -1610,35 +1632,46 @@ ovn_port_create(struct hmap *ports, const char
*key,
> >  }
> >
> >  static void
> > -ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
> > +ovn_port_destroy_orphan(struct ovn_port *port)
> >  {
> > -    if (port) {
> > -        /* Don't remove port->list.  It is used within build_ports()
as a
> > -         * private list and once we've exited that function it is not
safe to
> > -         * use it. */
> > -        hmap_remove(ports, &port->key_node);
> > +    if (port->tunnel_key) {
> > +        ovs_assert(port->od);
> > +        ovn_free_tnlid(&port->od->port_tnlids, port->tunnel_key);
> > +    }
> > +    for (int i = 0; i < port->n_lsp_addrs; i++) {
> > +        destroy_lport_addresses(&port->lsp_addrs[i]);
> > +    }
> > +    free(port->lsp_addrs);
> >
> > -        if (port->peer) {
> > -            port->peer->peer = NULL;
> > -        }
> > +    if (port->peer) {
> > +        port->peer->peer = NULL;
> > +    }
> >
> > -        for (int i = 0; i < port->n_lsp_addrs; i++) {
> > -            destroy_lport_addresses(&port->lsp_addrs[i]);
> > -        }
> > -        free(port->lsp_addrs);
> > +    for (int i = 0; i < port->n_ps_addrs; i++) {
> > +        destroy_lport_addresses(&port->ps_addrs[i]);
> > +    }
> > +    free(port->ps_addrs);
> >
> > -        for (int i = 0; i < port->n_ps_addrs; i++) {
> > -            destroy_lport_addresses(&port->ps_addrs[i]);
> > -        }
> > -        free(port->ps_addrs);
> > +    destroy_routable_addresses(&port->routables);
> >
> > -        destroy_routable_addresses(&port->routables);
> > +    destroy_lport_addresses(&port->lrp_networks);
> > +    destroy_lport_addresses(&port->proxy_arp_addrs);
> > +    free(port->json_key);
> > +    free(port->key);
> > +    free(port);
> > +}
> >
> > -        destroy_lport_addresses(&port->lrp_networks);
> > -        destroy_lport_addresses(&port->proxy_arp_addrs);
> > -        free(port->json_key);
> > -        free(port->key);
> > -        free(port);
> > +static void
> > +ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
> > +{
> > +    if (port) {
> > +        /* Don't remove port->list. The node should be removed from
such lists
> > +         * before calling this function. */
>
> nit:  Please update the comment accordingly (from list to hmap)

This comment is indeed for the port->list. So I kept it as is.
>
>
>
> > +        hmap_remove(ports, &port->key_node);
> > +        if (port->od && !port->l3dgw_port) {
> > +            hmap_remove(&port->od->ports, &port->dp_node);
> > +        }
> > +        ovn_port_destroy_orphan(port);
> >      }
> >  }
> >
> > @@ -2408,6 +2441,53 @@ tag_alloc_create_new_tag(struct hmap
*tag_alloc_table,
> >  }
> >
> >
> > +static void
> > +parse_lsp_addrs(struct ovn_port *op)
> > +{
> > +    const struct nbrec_logical_switch_port *nbsp = op->nbsp;
> > +    ovs_assert(nbsp);
> > +    op->lsp_addrs
> > +        = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
> > +    for (size_t j = 0; j < nbsp->n_addresses; j++) {
> > +        if (!strcmp(nbsp->addresses[j], "unknown")) {
> > +            op->has_unknown = true;
> > +            continue;
> > +        }
> > +        if (!strcmp(nbsp->addresses[j], "router")) {
> > +            continue;
> > +        }
> > +        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> > +            continue;
> > +        } else if (!extract_lsp_addresses(nbsp->addresses[j],
> > +                               &op->lsp_addrs[op->n_lsp_addrs])) {
> > +            static struct vlog_rate_limit rl
> > +                = VLOG_RATE_LIMIT_INIT(1, 1);
> > +            VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
> > +                              "switch port addresses. No MAC "
> > +                              "address found",
> > +                              op->nbsp->addresses[j]);
> > +            continue;
> > +        }
> > +        op->n_lsp_addrs++;
> > +    }
> > +    op->n_lsp_non_router_addrs = op->n_lsp_addrs;
> > +
> > +    op->ps_addrs
> > +        = xmalloc(sizeof *op->ps_addrs * nbsp->n_port_security);
> > +    for (size_t j = 0; j < nbsp->n_port_security; j++) {
> > +        if (!extract_lsp_addresses(nbsp->port_security[j],
> > +                                   &op->ps_addrs[op->n_ps_addrs])) {
> > +            static struct vlog_rate_limit rl
> > +                = VLOG_RATE_LIMIT_INIT(1, 1);
> > +            VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
> > +                              "security. No MAC address found",
> > +                              op->nbsp->port_security[j]);
> > +            continue;
> > +        }
> > +        op->n_ps_addrs++;
> > +    }
> > +}
> > +
> >  static void
> >  join_logical_ports(const struct sbrec_port_binding_table
*sbrec_pb_table,
> >                     struct hmap *ls_datapaths, struct hmap
*lr_datapaths,
> > @@ -2504,49 +2584,11 @@ join_logical_ports(const struct
sbrec_port_binding_table *sbrec_pb_table,
> >                  od->has_vtep_lports = true;
> >              }
> >
> > -            op->lsp_addrs
> > -                = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
> > -            for (size_t j = 0; j < nbsp->n_addresses; j++) {
> > -                if (!strcmp(nbsp->addresses[j], "unknown")) {
> > -                    op->has_unknown = true;
> > -                    continue;
> > -                }
> > -                if (!strcmp(nbsp->addresses[j], "router")) {
> > -                    continue;
> > -                }
> > -                if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> > -                    continue;
> > -                } else if (!extract_lsp_addresses(nbsp->addresses[j],
> > -
&op->lsp_addrs[op->n_lsp_addrs])) {
> > -                    static struct vlog_rate_limit rl
> > -                        = VLOG_RATE_LIMIT_INIT(1, 1);
> > -                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
> > -                                      "switch port addresses. No MAC "
> > -                                      "address found",
> > -                                      op->nbsp->addresses[j]);
> > -                    continue;
> > -                }
> > -                op->n_lsp_addrs++;
> > -            }
> > -            op->n_lsp_non_router_addrs = op->n_lsp_addrs;
> > -
> > -            op->ps_addrs
> > -                = xmalloc(sizeof *op->ps_addrs *
nbsp->n_port_security);
> > -            for (size_t j = 0; j < nbsp->n_port_security; j++) {
> > -                if (!extract_lsp_addresses(nbsp->port_security[j],
> > -
&op->ps_addrs[op->n_ps_addrs])) {
> > -                    static struct vlog_rate_limit rl
> > -                        = VLOG_RATE_LIMIT_INIT(1, 1);
> > -                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
> > -                                      "security. No MAC address found",
> > -                                      op->nbsp->port_security[j]);
> > -                    continue;
> > -                }
> > -                op->n_ps_addrs++;
> > -            }
> > +            parse_lsp_addrs(op);
> >
> >              op->od = od;
> > -            ovs_list_push_back(&od->port_list, &op->dp_node);
> > +            hmap_insert(&od->ports, &op->dp_node,
> > +                        hmap_node_hash(&op->key_node));
> >              tag_alloc_add_existing_tags(tag_alloc_table, nbsp);
> >          }
> >      }
> > @@ -2593,7 +2635,8 @@ join_logical_ports(const struct
sbrec_port_binding_table *sbrec_pb_table,
> >
> >              op->lrp_networks = lrp_networks;
> >              op->od = od;
> > -            ovs_list_push_back(&od->port_list, &op->dp_node);
> > +            hmap_insert(&od->ports, &op->dp_node,
> > +                        hmap_node_hash(&op->key_node));
> >
> >              if (!od->redirect_bridged) {
> >                  const char *redirect_type =
> > @@ -3314,6 +3357,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn
*ovnsb_txn,
> >          struct smap new;
> >          smap_init(&new);
> >          if (is_cr_port(op)) {
> > +            ovs_assert(sbrec_chassis_by_name);
> > +            ovs_assert(sbrec_chassis_by_hostname);
> > +            ovs_assert(sbrec_ha_chassis_grp_by_name);
> > +            ovs_assert(active_ha_chassis_grps);
> >              const char *redirect_type = smap_get(&op->nbrp->options,
> >                                                   "redirect-type");
> >
> > @@ -3423,8 +3470,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn
*ovnsb_txn,
> >              struct smap options;
> >
> >              if (has_qos && !queue_id) {
> > +                ovs_assert(queue_id_bitmap);
> >                  queue_id = allocate_queueid(queue_id_bitmap);
> >              } else if (!has_qos && queue_id) {
> > +                ovs_assert(queue_id_bitmap);
> >                  bitmap_set0(queue_id_bitmap, queue_id);
> >                  queue_id = 0;
> >              }
> > @@ -3473,6 +3522,10 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn
*ovnsb_txn,
> >              sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0);
> >
> >              if (!strcmp(op->nbsp->type, "external")) {
> > +                ovs_assert(sbrec_chassis_by_name);
> > +                ovs_assert(sbrec_chassis_by_hostname);
> > +                ovs_assert(sbrec_ha_chassis_grp_by_name);
> > +                ovs_assert(active_ha_chassis_grps);
> >                  if (op->nbsp->ha_chassis_group) {
> >                      sync_ha_chassis_group_for_sbpb(
> >                          ovnsb_txn, sbrec_chassis_by_name,
> > @@ -3729,6 +3782,24 @@ cleanup_stale_fdb_entries(const struct
sbrec_fdb_table *sbrec_fdb_table,
> >      }
> >  }
> >
> > +static void
> > +delete_fdb_entry(struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port,
> > +                 uint32_t dp_key, uint32_t port_key)
> > +{
> > +    struct sbrec_fdb *target =
> > +        sbrec_fdb_index_init_row(sbrec_fdb_by_dp_and_port);
> > +    sbrec_fdb_index_set_dp_key(target, dp_key);
> > +    sbrec_fdb_index_set_port_key(target, port_key);
> > +
> > +    struct sbrec_fdb *fdb_e =
sbrec_fdb_index_find(sbrec_fdb_by_dp_and_port,
> > +                                                   target);
> > +    sbrec_fdb_index_destroy_row(target);
> > +
> > +    if (fdb_e) {
> > +        sbrec_fdb_delete(fdb_e);
> > +    }
> > +}
> > +
> >  struct service_monitor_info {
> >      struct hmap_node hmap_node;
> >      const struct sbrec_service_monitor *sbrec_mon;
> > @@ -3820,14 +3891,15 @@ ovn_lb_svc_create(struct ovsdb_idl_txn
*ovnsb_txn, struct ovn_northd_lb *lb,
> >              }
> >              ds_destroy(&key);
> >
> > -            backend_nb->op = op;
> > -            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> > -
> >              if (!lb_vip_nb->lb_health_check || !op || !svc_mon_src_ip
||
> >                  !lsp_is_enabled(op->nbsp)) {
> > +                free(svc_mon_src_ip);
> >                  continue;
> >              }
> >
> > +            backend_nb->op = op;
> > +            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
> > +
> >              const char *protocol = lb->nlb->protocol;
> >              if (!protocol || !protocol[0]) {
> >                  protocol = "tcp";
> > @@ -4155,7 +4227,7 @@ build_lrouter_lb_reachable_ips(struct
ovn_datapath *od,
> >              ovs_be32 vip_ip4 =
in6_addr_get_mapped_ipv4(&lb->vips[i].vip);
> >              struct ovn_port *op;
> >
> > -            LIST_FOR_EACH (op, dp_node, &od->port_list) {
> > +            HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >                  if (lrouter_port_ipv4_reachable(op, vip_ip4)) {
> >                      sset_add(&od->lb_ips->ips_v4_reachable,
> >                               lb->vips[i].vip_str);
> > @@ -4165,7 +4237,7 @@ build_lrouter_lb_reachable_ips(struct
ovn_datapath *od,
> >          } else {
> >              struct ovn_port *op;
> >
> > -            LIST_FOR_EACH (op, dp_node, &od->port_list) {
> > +            HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >                  if (lrouter_port_ipv6_reachable(op, &lb->vips[i].vip))
{
> >                      sset_add(&od->lb_ips->ips_v6_reachable,
> >                               lb->vips[i].vip_str);
> > @@ -4538,7 +4610,10 @@ ovn_port_add_tnlid(struct ovn_port *op, uint32_t
tunnel_key)
> >      return added;
> >  }
> >
> > -static void
> > +/* Returns false if the requested key is confict with another
allocated key, so
> > + * that the I-P engine can fallback to recompute if needed; otherwise
return
> > + * true (even if the key is not allocated). */
> > +static bool
> >  ovn_port_assign_requested_tnl_id(
> >      const struct sbrec_chassis_table *sbrec_chassis_table, struct
ovn_port *op)
> >  {
> > @@ -4553,7 +4628,7 @@ ovn_port_assign_requested_tnl_id(
> >              VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for port %s "
> >                           "is incompatible with VXLAN",
> >                           tunnel_key, op_get_name(op));
> > -            return;
> > +            return true;
> >          }
> >          if (!ovn_port_add_tnlid(op, tunnel_key)) {
> >              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
1);
> > @@ -4561,11 +4636,13 @@ ovn_port_assign_requested_tnl_id(
> >                           "%"PRIu32" as another LSP or LRP",
> >                           op->nbsp ? "switch" : "router",
> >                           op_get_name(op), tunnel_key);
> > +            return false;
> >          }
> >      }
> > +    return true;
> >  }
> >
> > -static void
> > +static bool
> >  ovn_port_allocate_key(const struct sbrec_chassis_table
*sbrec_chassis_table,
> >                        struct hmap *ports,
> >                        struct ovn_port *op)
> > @@ -4581,8 +4658,10 @@ ovn_port_allocate_key(const struct
sbrec_chassis_table *sbrec_chassis_table,
> >              }
> >              ovs_list_remove(&op->list);
> >              ovn_port_destroy(ports, op);
> > +            return false;
> >          }
> >      }
> > +    return true;
> >  }
> >
> >  /* Updates the southbound Port_Binding table so that it contains the
logical
> > @@ -4605,6 +4684,8 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
> >      struct hmap *ls_ports, struct hmap *lr_ports)
> >  {
> >      struct ovs_list sb_only, nb_only, both;
> > +    /* XXX: Add tag_alloc_table and queue_id_bitmap as part of
northd_data
> > +     * to improve I-P. */
> >      struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
> >      unsigned long *queue_id_bitmap =
bitmap_allocate(QDISC_MAX_QUEUE_ID + 1);
> >      bitmap_set1(queue_id_bitmap, 0);
> > @@ -4711,6 +4792,329 @@ build_ports(struct ovsdb_idl_txn *ovnsb_txn,
> >      sset_destroy(&active_ha_chassis_grps);
> >  }
> >
> > +void
> > +destroy_northd_data_tracked_changes(struct northd_data *nd)
> > +{
> > +    struct ls_change *ls_change;
> > +    LIST_FOR_EACH_SAFE (ls_change, list_node,
> > +                        &nd->tracked_ls_changes.updated) {
> > +        struct ovn_port *op;
> > +        LIST_FOR_EACH (op, list, &ls_change->added_ports) {
> > +            ovs_list_remove(&op->list);
> > +        }
> > +        LIST_FOR_EACH (op, list, &ls_change->updated_ports) {
> > +            ovs_list_remove(&op->list);
> > +        }
> > +        LIST_FOR_EACH_SAFE (op, list, &ls_change->deleted_ports) {
> > +            ovs_list_remove(&op->list);
> > +            ovn_port_destroy_orphan(op);
> > +        }
> > +        ovs_list_remove(&ls_change->list_node);
> > +        free(ls_change);
> > +    }
> > +
> > +    nd->change_tracked = false;
> > +}
> > +
> > +/* Check if a changed LSP can be handled incrementally within the I-P
engine
> > + * node en_northd.
> > + */
> > +static bool
> > +lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *nbsp)
> > +{
> > +    /* Currently only support normal VIF. */
> nit: s/Currently only support normal VIF/ Support only normal VIF for now
>
>
> > +    if (nbsp->type[0]) {
> > +        return false;
> > +    }
> > +
> > +    /* Currently not support tag allocation. */
> nit: s/Currently not support tag allocation/ Do not support tag
> allocation for now
>
> Same small nits for below comments.
>
> I'm fine if you want to ignore these.
>

Thanks for the suggestion! Updated.

Regards,
Han
>
> > +    if ((nbsp->parent_name && nbsp->parent_name[0]) || nbsp->tag ||
> > +        nbsp->tag_request) {
> > +        return false;
> > +    }
> > +
> > +    /* Currently not support port with qos settings (need special
handling for
> > +     * qdisc_queue_id sync. */
> > +    if (port_has_qos_params(&nbsp->options)) {
> > +        return false;
> > +    }
> > +
> > +    for (size_t j = 0; j < nbsp->n_addresses; j++) {
> > +        /* Currently not support dynamic address handling. */
> > +        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
> > +            return false;
> > +        }
> > +        /* Currently not support "unknown" address handling.  XXX:
Need to
> > +         * handle od->has_unknown change and track it when the first
LSP with
> > +         * 'unknown' is added or when the last one is removed. */
> > +        if (!strcmp(nbsp->addresses[j], "unknown")) {
> > +            return false;
> > +        }
> > +    }
> > +
> > +    return true;
> > +}
> > +
> > +static bool
> > +ls_port_has_changed(const struct nbrec_logical_switch_port *old,
> > +                    const struct nbrec_logical_switch_port *new)
> > +{
> > +    if (old != new) {
> > +        return true;
> > +    }
> > +    /* XXX: Need a better OVSDB IDL interface for this check. */
> > +    return (nbrec_logical_switch_port_row_get_seqno(new,
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0);
> > +}
> > +
> > +static struct ovn_port *
> > +ovn_port_find_in_datapath(struct ovn_datapath *od, const char *name)
> > +{
> > +    struct ovn_port *op;
> > +    HMAP_FOR_EACH_WITH_HASH (op, dp_node, hash_string(name, 0),
&od->ports) {
> > +        if (!strcmp(op->key, name)) {
> > +            return op;
> > +        }
> > +    }
> > +    return NULL;
> > +}
> > +
> > +static struct ovn_port *
> > +ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports,
> > +               const char *key, const struct nbrec_logical_switch_port
*nbsp,
> > +               struct ovn_datapath *od, const struct
sbrec_port_binding *sb,
> > +               const struct sbrec_mirror_table *sbrec_mirror_table,
> > +               const struct sbrec_chassis_table *sbrec_chassis_table,
> > +               struct ovsdb_idl_index *sbrec_chassis_by_name,
> > +               struct ovsdb_idl_index *sbrec_chassis_by_hostname)
> > +{
> > +    struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
> > +                                          NULL);
> > +    parse_lsp_addrs(op);
> > +    op->od = od;
> > +    hmap_insert(&od->ports, &op->dp_node,
hmap_node_hash(&op->key_node));
> > +
> > +    /* Assign explicitly requested tunnel ids first. */
> > +    if (!ovn_port_assign_requested_tnl_id(sbrec_chassis_table, op)) {
> > +        return NULL;
> > +    }
> > +    if (sb) {
> > +        op->sb = sb;
> > +        /* Keep nonconflicting tunnel IDs that are already assigned. */
> > +        if (!op->tunnel_key) {
> > +            ovn_port_add_tnlid(op, op->sb->tunnel_key);
> > +        }
> > +    } else {
> > +        /* XXX: the new SB port_binding will change in IDL, so need to
handle
> > +         * SB port_binding updates incrementally to achieve end-to-end
> > +         * incremental processing. */
> > +        op->sb = sbrec_port_binding_insert(ovnsb_txn);
> > +        sbrec_port_binding_set_logical_port(op->sb, op->key);
> > +    }
> > +    /* Assign new tunnel ids where needed. */
> > +    if (!ovn_port_allocate_key(sbrec_chassis_table, ls_ports, op)) {
> > +        return NULL;
> > +    }
> > +    ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
> > +                          sbrec_chassis_by_hostname, NULL,
sbrec_mirror_table,
> > +                          op, NULL, NULL);
> > +    return op;
> > +}
> > +
> > +static bool
> > +check_ls_changes_other_than_lsp(const struct nbrec_logical_switch *ls)
> > +{
> > +    /* Check if the columns are changed in this row. */
> > +    enum nbrec_logical_switch_column_id col;
> > +    for (col = 0; col < NBREC_LOGICAL_SWITCH_N_COLUMNS; col++) {
> > +        if (nbrec_logical_switch_is_updated(ls, col) &&
> > +            col != NBREC_LOGICAL_SWITCH_COL_PORTS) {
> > +            return true;
> > +        }
> > +    }
> > +
> > +    /* Check if the referenced rows are changed.
> > +       XXX: Need a better OVSDB IDL interface for this check. */
> > +    for (size_t i = 0; i < ls->n_acls; i++) {
> > +        if (nbrec_acl_row_get_seqno(ls->acls[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    if (ls->copp && nbrec_copp_row_get_seqno(ls->copp,
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +        return true;
> > +    }
> > +    for (size_t i = 0; i < ls->n_dns_records; i++) {
> > +        if (nbrec_dns_row_get_seqno(ls->dns_records[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    for (size_t i = 0; i < ls->n_forwarding_groups; i++) {
> > +        if
(nbrec_forwarding_group_row_get_seqno(ls->forwarding_groups[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    for (size_t i = 0; i < ls->n_load_balancer; i++) {
> > +        if (nbrec_load_balancer_row_get_seqno(ls->load_balancer[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    for (size_t i = 0; i < ls->n_load_balancer_group; i++) {
> > +        if
(nbrec_load_balancer_group_row_get_seqno(ls->load_balancer_group[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    for (size_t i = 0; i < ls->n_qos_rules; i++) {
> > +        if (nbrec_qos_row_get_seqno(ls->qos_rules[i],
> > +                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
> > +            return true;
> > +        }
> > +    }
> > +    return false;
> > +}
> > +
> > +/* Return true if changes are handled incrementally, false otherwise.
> > + * When there are any changes, try to track what's exactly changed and
set
> > + * northd_data->change_tracked accordingly: change tracked - true,
otherwise,
> > + * false. */
> > +bool
> > +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> > +                         const struct northd_input *ni,
> > +                         struct northd_data *nd)
> > +{
> > +    const struct nbrec_logical_switch *changed_ls;
> > +    struct ls_change *ls_change = NULL;
> > +
> > +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> > +
ni->nbrec_logical_switch_table) {
> > +        ls_change = NULL;
> > +        if (nbrec_logical_switch_is_new(changed_ls) ||
> > +            nbrec_logical_switch_is_deleted(changed_ls)) {
> > +            goto fail;
> > +        }
> > +        struct ovn_datapath *od = ovn_datapath_find(
> > +                                    &nd->ls_datapaths.datapaths,
> > +                                    &changed_ls->header_.uuid);
> > +        if (!od) {
> > +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
1);
> > +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
doesn't "
> > +                         "exist in ls_datapaths: "UUID_FMT,
> > +                         UUID_ARGS(&changed_ls->header_.uuid));
> > +            goto fail;
> > +        }
> > +
> > +        /* Now only able to handle lsp changes. */
> > +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> > +            goto fail;
> > +        }
> > +
> > +        ls_change = xzalloc(sizeof *ls_change);
> > +        ls_change->od = od;
> > +        ovs_list_init(&ls_change->added_ports);
> > +        ovs_list_init(&ls_change->deleted_ports);
> > +        ovs_list_init(&ls_change->updated_ports);
> > +
> > +        struct ovn_port *op;
> > +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> > +            op->visited = false;
> > +        }
> > +
> > +        /* Compare the individual ports in the old and new Logical
Switches */
> > +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> > +            struct nbrec_logical_switch_port *new_nbsp =
changed_ls->ports[j];
> > +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> > +
> > +            if (!op) {
> > +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> > +                    goto fail;
> > +                }
> > +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> > +                                    new_nbsp->name, new_nbsp, od, NULL,
> > +                                    ni->sbrec_mirror_table,
> > +                                    ni->sbrec_chassis_table,
> > +                                    ni->sbrec_chassis_by_name,
> > +                                    ni->sbrec_chassis_by_hostname);
> > +                if (!op) {
> > +                    goto fail;
> > +                }
> > +                ovs_list_push_back(&ls_change->added_ports,
> > +                                   &op->list);
> > +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> > +                /* Existing port updated */
> > +                bool temp = false;
> > +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
> > +                    !op->lsp_can_be_inc_processed ||
> > +                    !lsp_can_be_inc_processed(new_nbsp)) {
> > +                    goto fail;
> > +                }
> > +                const struct sbrec_port_binding *sb = op->sb;
> > +                if (sset_contains(&nd->svc_monitor_lsps,
new_nbsp->name)) {
> > +                    /* This port is used for svc monitor, which may be
impacted
> > +                     * by this change. Fallback to recompute. */
> > +                    goto fail;
> > +                }
> > +                ovn_port_destroy(&nd->ls_ports, op);
> > +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> > +                                    new_nbsp->name, new_nbsp, od, sb,
> > +                                    ni->sbrec_mirror_table,
> > +                                    ni->sbrec_chassis_table,
> > +                                    ni->sbrec_chassis_by_name,
> > +                                    ni->sbrec_chassis_by_hostname);
> > +                if (!op) {
> > +                    goto fail;
> > +                }
> > +                ovs_list_push_back(&ls_change->updated_ports,
&op->list);
> > +            }
> > +            op->visited = true;
> > +        }
> > +
> > +        /* Check for deleted ports */
> > +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> > +            if (!op->visited) {
> > +                if (!op->lsp_can_be_inc_processed) {
> > +                    goto fail;
> > +                }
> > +                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> > +                    /* This port was used for svc monitor, which may be
> > +                     * impacted by this deletion. Fallback to
recompute. */
> > +                    goto fail;
> > +                }
> > +                ovs_list_push_back(&ls_change->deleted_ports,
> > +                                   &op->list);
> > +                hmap_remove(&nd->ls_ports, &op->key_node);
> > +                hmap_remove(&od->ports, &op->dp_node);
> > +                sbrec_port_binding_delete(op->sb);
> > +                delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port,
od->tunnel_key,
> > +                                 op->tunnel_key);
> > +            }
> > +        }
> > +
> > +        if (!ovs_list_is_empty(&ls_change->added_ports) ||
> > +            !ovs_list_is_empty(&ls_change->updated_ports) ||
> > +            !ovs_list_is_empty(&ls_change->deleted_ports)) {
> > +            ovs_list_push_back(&nd->tracked_ls_changes.updated,
> > +                               &ls_change->list_node);
> > +        } else {
> > +            free(ls_change);
> > +        }
> > +    }
> > +
> > +    if (!ovs_list_is_empty(&nd->tracked_ls_changes.updated)) {
> > +        nd->change_tracked = true;
> > +    }
> > +    return true;
> > +
> > +fail:
> > +    free(ls_change);
> > +    destroy_northd_data_tracked_changes(nd);
> > +    return false;
> > +}
> > +
> >  struct multicast_group {
> >      const char *name;
> >      uint16_t key;               /*
OVN_MIN_MULTICAST...OVN_MAX_MULTICAST. */
> > @@ -6072,7 +6476,7 @@ build_interconn_mcast_snoop_flows(struct
ovn_datapath *od,
> >
> >      struct ovn_port *op;
> >
> > -    LIST_FOR_EACH (op, dp_node, &od->port_list) {
> > +    HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >          if (!lsp_is_remote(op->nbsp)) {
> >              continue;
> >          }
> > @@ -12394,7 +12798,7 @@ build_mcast_lookup_flows_for_lrouter(
> >           * own mac addresses.
> >           */
> >          struct ovn_port *op;
> > -        LIST_FOR_EACH (op, dp_node, &od->port_list) {
> > +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >              ds_clear(match);
> >              ds_put_format(match, "eth.src == %s && igmp",
> >                            op->lrp_networks.ea_s);
> > @@ -16476,9 +16880,6 @@ destroy_datapaths_and_ports(struct
ovn_datapaths *ls_datapaths,
> >          }
> >      }
> >
> > -    ovn_datapaths_destroy(ls_datapaths);
> > -    ovn_datapaths_destroy(lr_datapaths);
> > -
> >      struct ovn_port *port;
> >      HMAP_FOR_EACH_SAFE (port, key_node, ls_ports) {
> >          ovn_port_destroy(ls_ports, port);
> > @@ -16489,6 +16890,9 @@ destroy_datapaths_and_ports(struct
ovn_datapaths *ls_datapaths,
> >          ovn_port_destroy(lr_ports, port);
> >      }
> >      hmap_destroy(lr_ports);
> > +
> > +    ovn_datapaths_destroy(ls_datapaths);
> > +    ovn_datapaths_destroy(lr_datapaths);
> >  }
> >
> >  void
> > @@ -16510,6 +16914,8 @@ northd_init(struct northd_data *data)
> >      };
> >      data->ovn_internal_version_changed = false;
> >      sset_init(&data->svc_monitor_lsps);
> > +    data->change_tracked = false;
> > +    ovs_list_init(&data->tracked_ls_changes.updated);
> >  }
> >
> >  void
> > @@ -16976,6 +17382,7 @@ void northd_run(struct northd_input *input_data,
> >                  struct ovsdb_idl_txn *ovnnb_txn,
> >                  struct ovsdb_idl_txn *ovnsb_txn)
> >  {
> > +    COVERAGE_INC(northd_run);
> >      stopwatch_start(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
> >      ovnnb_db_run(input_data, data, ovnnb_txn, ovnsb_txn);
> >      stopwatch_stop(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
> > diff --git a/northd/northd.h b/northd/northd.h
> > index b26828bb7111..1fd9bed477e7 100644
> > --- a/northd/northd.h
> > +++ b/northd/northd.h
> > @@ -65,6 +65,7 @@ struct northd_input {
> >      struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
> >      struct ovsdb_idl_index *sbrec_ip_mcast_by_dp;
> >      struct ovsdb_idl_index *sbrec_static_mac_binding_by_lport_ip;
> > +    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port;
> >  };
> >
> >  struct chassis_features {
> > @@ -83,6 +84,22 @@ struct ovn_datapaths {
> >      struct ovn_datapath **array;
> >  };
> >
> > +/* Track what's changed for a single LS.
> > + * Now only track port changes. */
> > +struct ls_change {
> > +    struct ovs_list list_node;
> > +    struct ovn_datapath *od;
> > +    struct ovs_list added_ports;
> > +    struct ovs_list deleted_ports;
> > +    struct ovs_list updated_ports;
> > +};
> > +
> > +/* Track what's changed for logical switches.
> > + * Now only track updated ones (added or deleted may be supported in
the
> > + * future). */
> > +struct tracked_ls_changes {
> > +    struct ovs_list updated; /* Contains struct ls_change */
> > +};
> >
> >  struct northd_data {
> >      /* Global state for 'en-northd'. */
> > @@ -98,6 +115,8 @@ struct northd_data {
> >      bool ovn_internal_version_changed;
> >      struct chassis_features features;
> >      struct sset svc_monitor_lsps;
> > +    bool change_tracked;
> > +    struct tracked_ls_changes tracked_ls_changes;
> >  };
> >
> >  struct lflow_data {
> > @@ -292,13 +311,19 @@ struct ovn_datapath {
> >      /* Port groups related to the datapath, used only when nbs is NOT
NULL. */
> >      struct hmap nb_pgs;
> >
> > -    struct ovs_list port_list;
> > +    /* Map of ovn_port objects belonging to this datapath.
> > +     * This map doesn't include derived ports. */
> > +    struct hmap ports;
> >  };
> >
> >  void northd_run(struct northd_input *input_data,
> >                  struct northd_data *data,
> >                  struct ovsdb_idl_txn *ovnnb_txn,
> >                  struct ovsdb_idl_txn *ovnsb_txn);
> > +bool northd_handle_ls_changes(struct ovsdb_idl_txn *,
> > +                              const struct northd_input *,
> > +                              struct northd_data *);
> > +void destroy_northd_data_tracked_changes(struct northd_data *);
> >  void northd_destroy(struct northd_data *data);
> >  void northd_init(struct northd_data *data);
> >  void northd_indices_create(struct northd_data *data,
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 6736429ae201..74ae84530112 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -9481,3 +9481,61 @@ AT_CHECK([ovn-sbctl lflow-list sw | grep
ls_out_pre_lb | grep priority=110 | gre
> >
> >  AT_CLEANUP
> >  ])
> > +
> > +OVN_FOR_EACH_NORTHD_NO_HV([
> > +AT_SETUP([LSP incremental processing])
> > +ovn_start
> > +
> > +net_add n1
> > +sim_add hv1
> > +as hv1
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.11
> > +ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0
external_ids:iface-id=lsp0-0
> > +ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1
external_ids:iface-id=lsp0-1
> > +ovs-vsctl add-port br-int lsp0-2 -- set interface lsp0-2
external_ids:iface-id=lsp0-2
> > +
> > +check ovn-nbctl --wait=hv ls-add ls0
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses
lsp0-0 "unknown"
> > +OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE
inc-engine/show-stats northd recompute` = 5])
> > +OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE
inc-engine/show-stats lflow recompute` = 5])
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses
lsp0-1 "aa:aa:aa:00:00:01 192.168.0.11"
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
northd recompute], [0], [3
> > +])
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
lflow recompute], [0], [6
> > +])
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-2 -- lsp-set-addresses
lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12"
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
northd recompute], [0], [3
> > +])
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
lflow recompute], [0], [6
> > +])
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=hv lsp-del lsp0-1
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
northd recompute], [0], [1
> > +])
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
lflow recompute], [0], [2
> > +])
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=hv lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:88
192.168.0.88"
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
northd recompute], [0], [1
> > +])
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
lflow recompute], [0], [2
> > +])
> > +
> > +# No change, no recompute
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl --wait=sb sync
> > +AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats
northd recompute], [0], [0
> > +])
> > +
> > +OVN_CLEANUP([hv1])
> > +AT_CLEANUP
> > +])
> > --
> > 2.30.2
> >
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
Dumitru Ceara June 27, 2023, 1:51 p.m. UTC | #4
On 6/8/23 07:55, Han Zhou wrote:
> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
>>
>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
>>>
>>> This patch introduces a change handler for NB logical_switch within the
>>> 'northd' node. It specifically handles cases where logical switch ports
>>> in the tracked logical switches are changed (added/updated/deleted).
>>> Only regular logical switch ports - which are common for VMs/Pods - are
>>> addressed. For other scenarios, it reverts to recompute.
>>>
>>> This update avoids recompute in the northd node (especially the
>>> resource-intensive build_ports()) for NB changes of VIF
>>> add/update/delete.  However, it does not eliminate the need for lflow
>>> recompute.
>>>
>>> Below are the performance test results simulating an ovn-k8s topology of
>>> 500 nodes x 50 lsp per node:
>>>
>>> Before:
>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>     ovn-northd completion:          955ms
>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>     ovn-northd completion:          919ms
>>>
>>> After:
>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>     ovn-northd completion:          776ms
>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>     ovn-northd completion:          773ms
>>>
>>> Both addition and deletion show ~20% reduction of ovn-northd completion
>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
>>> recompute. Using multithread should show a larger percentage of
>>> improvement.
>>>
>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>>> Reviewed-by: Ales Musil <amusil@redhat.com>
>>
>> Few nits below.
>>
>> Acked-by: Numan Siddique <numans@ovn.org>
>>
>> Numan
>>

Hi Han, Numan,

First of all, sorry for the late review, I had these patches on my list
for a while and only got to them after they were merged.

>>
>>> ---
>>>  lib/ovn-util.c           |  15 ++
>>>  lib/ovn-util.h           |   1 +
>>>  northd/en-northd.c       | 130 ++++++---
>>>  northd/en-northd.h       |   2 +
>>>  northd/inc-proc-northd.c |  12 +-
>>>  northd/northd.c          | 567 +++++++++++++++++++++++++++++++++------
>>>  northd/northd.h          |  27 +-
>>>  tests/ovn-northd.at      |  58 ++++
>>>  8 files changed, 685 insertions(+), 127 deletions(-)
>>>

[...]

>>> +bool
>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> +                         const struct northd_input *ni,
>>> +                         struct northd_data *nd)
>>> +{
>>> +    const struct nbrec_logical_switch *changed_ls;
>>> +    struct ls_change *ls_change = NULL;
>>> +
>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
>>> +
> ni->nbrec_logical_switch_table) {
>>> +        ls_change = NULL;
>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
>>> +            goto fail;
>>> +        }
>>> +        struct ovn_datapath *od = ovn_datapath_find(
>>> +                                    &nd->ls_datapaths.datapaths,
>>> +                                    &changed_ls->header_.uuid);
>>> +        if (!od) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 1);
>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
> doesn't "
>>> +                         "exist in ls_datapaths: "UUID_FMT,
>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
>>> +            goto fail;
>>> +        }
>>> +
>>> +        /* Now only able to handle lsp changes. */
>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
>>> +            goto fail;
>>> +        }
>>> +
>>> +        ls_change = xzalloc(sizeof *ls_change);
>>> +        ls_change->od = od;
>>> +        ovs_list_init(&ls_change->added_ports);
>>> +        ovs_list_init(&ls_change->deleted_ports);
>>> +        ovs_list_init(&ls_change->updated_ports);
>>> +
>>> +        struct ovn_port *op;
>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>> +            op->visited = false;
>>> +        }
>>> +
>>> +        /* Compare the individual ports in the old and new Logical
> Switches */
>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
>>> +            struct nbrec_logical_switch_port *new_nbsp =
> changed_ls->ports[j];
>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
>>> +
>>> +            if (!op) {
>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
>>> +                    goto fail;
>>> +                }
>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>> +                                    new_nbsp->name, new_nbsp, od, NULL,
>>> +                                    ni->sbrec_mirror_table,
>>> +                                    ni->sbrec_chassis_table,
>>> +                                    ni->sbrec_chassis_by_name,
>>> +                                    ni->sbrec_chassis_by_hostname);
>>> +                if (!op) {
>>> +                    goto fail;
>>> +                }
>>> +                ovs_list_push_back(&ls_change->added_ports,
>>> +                                   &op->list);
>>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
>>> +                /* Existing port updated */
>>> +                bool temp = false;
>>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
>>> +                    !op->lsp_can_be_inc_processed ||
>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
>>> +                    goto fail;
>>> +                }
>>> +                const struct sbrec_port_binding *sb = op->sb;
>>> +                if (sset_contains(&nd->svc_monitor_lsps,
> new_nbsp->name)) {
>>> +                    /* This port is used for svc monitor, which may be
> impacted
>>> +                     * by this change. Fallback to recompute. */
>>> +                    goto fail;
>>> +                }
>>> +                ovn_port_destroy(&nd->ls_ports, op);
>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>> +                                    new_nbsp->name, new_nbsp, od, sb,
>>> +                                    ni->sbrec_mirror_table,
>>> +                                    ni->sbrec_chassis_table,
>>> +                                    ni->sbrec_chassis_by_name,
>>> +                                    ni->sbrec_chassis_by_hostname);
>>> +                if (!op) {
>>> +                    goto fail;
>>> +                }
>>> +                ovs_list_push_back(&ls_change->updated_ports,
> &op->list);
>>> +            }
>>> +            op->visited = true;
>>> +        }
>>> +
>>> +        /* Check for deleted ports */
>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
>>> +            if (!op->visited) {
>>> +                if (!op->lsp_can_be_inc_processed) {
>>> +                    goto fail;
>>> +                }
>>> +                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
>>> +                    /* This port was used for svc monitor, which may be
>>> +                     * impacted by this deletion. Fallback to
> recompute. */
>>> +                    goto fail;
>>> +                }
>>> +                ovs_list_push_back(&ls_change->deleted_ports,
>>> +                                   &op->list);
>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
>>> +                hmap_remove(&od->ports, &op->dp_node);
>>> +                sbrec_port_binding_delete(op->sb);

Quoting from ovsdb-idl.c:

/* Deletes 'row_' from its table.  May free 'row_', so it must not be
 * accessed afterward.

So, in theory, we shouldn't call delete here because we might end up
failing for other ports/datapath and falling back to recompute.

As far as I can tell, only newly inserted rows may be deleted by the
current IDL implementation but I don't think it's safe to assume this
will never change in the future.

We should probably accumulate all records that need to be deleted and
delete them at the end of the whole I-P run.  Do we need some I-P infra
changes?  Maybe users can provide a callback that can be called whenever
an I-P run has finished?

What do you think?

Thanks,
Dumitru
Han Zhou June 27, 2023, 5:47 p.m. UTC | #5
On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 6/8/23 07:55, Han Zhou wrote:
> > On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
> >>
> >> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
> >>>
> >>> This patch introduces a change handler for NB logical_switch within
the
> >>> 'northd' node. It specifically handles cases where logical switch
ports
> >>> in the tracked logical switches are changed (added/updated/deleted).
> >>> Only regular logical switch ports - which are common for VMs/Pods -
are
> >>> addressed. For other scenarios, it reverts to recompute.
> >>>
> >>> This update avoids recompute in the northd node (especially the
> >>> resource-intensive build_ports()) for NB changes of VIF
> >>> add/update/delete.  However, it does not eliminate the need for lflow
> >>> recompute.
> >>>
> >>> Below are the performance test results simulating an ovn-k8s topology
of
> >>> 500 nodes x 50 lsp per node:
> >>>
> >>> Before:
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>     ovn-northd completion:          955ms
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> > -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> > lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>     ovn-northd completion:          919ms
> >>>
> >>> After:
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>     ovn-northd completion:          776ms
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> > -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> > lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>     ovn-northd completion:          773ms
> >>>
> >>> Both addition and deletion show ~20% reduction of ovn-northd
completion
> >>> time.  Note: the test uses only 1 thread of ovn-northd for flow
> >>> recompute. Using multithread should show a larger percentage of
> >>> improvement.
> >>>
> >>> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >>> Reviewed-by: Ales Musil <amusil@redhat.com>
> >>
> >> Few nits below.
> >>
> >> Acked-by: Numan Siddique <numans@ovn.org>
> >>
> >> Numan
> >>
>
> Hi Han, Numan,
>
> First of all, sorry for the late review, I had these patches on my list
> for a while and only got to them after they were merged.
>

Hi Dumitru, thanks for your diligent review and comments are always welcome
any time.

> >>
> >>> ---
> >>>  lib/ovn-util.c           |  15 ++
> >>>  lib/ovn-util.h           |   1 +
> >>>  northd/en-northd.c       | 130 ++++++---
> >>>  northd/en-northd.h       |   2 +
> >>>  northd/inc-proc-northd.c |  12 +-
> >>>  northd/northd.c          | 567
+++++++++++++++++++++++++++++++++------
> >>>  northd/northd.h          |  27 +-
> >>>  tests/ovn-northd.at      |  58 ++++
> >>>  8 files changed, 685 insertions(+), 127 deletions(-)
> >>>
>
> [...]
>
> >>> +bool
> >>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>> +                         const struct northd_input *ni,
> >>> +                         struct northd_data *nd)
> >>> +{
> >>> +    const struct nbrec_logical_switch *changed_ls;
> >>> +    struct ls_change *ls_change = NULL;
> >>> +
> >>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> >>> +
> > ni->nbrec_logical_switch_table) {
> >>> +        ls_change = NULL;
> >>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
> >>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
> >>> +            goto fail;
> >>> +        }
> >>> +        struct ovn_datapath *od = ovn_datapath_find(
> >>> +                                    &nd->ls_datapaths.datapaths,
> >>> +                                    &changed_ls->header_.uuid);
> >>> +        if (!od) {
> >>> +            static struct vlog_rate_limit rl =
VLOG_RATE_LIMIT_INIT(1,
> > 1);
> >>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
> > doesn't "
> >>> +                         "exist in ls_datapaths: "UUID_FMT,
> >>> +                         UUID_ARGS(&changed_ls->header_.uuid));
> >>> +            goto fail;
> >>> +        }
> >>> +
> >>> +        /* Now only able to handle lsp changes. */
> >>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> >>> +            goto fail;
> >>> +        }
> >>> +
> >>> +        ls_change = xzalloc(sizeof *ls_change);
> >>> +        ls_change->od = od;
> >>> +        ovs_list_init(&ls_change->added_ports);
> >>> +        ovs_list_init(&ls_change->deleted_ports);
> >>> +        ovs_list_init(&ls_change->updated_ports);
> >>> +
> >>> +        struct ovn_port *op;
> >>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >>> +            op->visited = false;
> >>> +        }
> >>> +
> >>> +        /* Compare the individual ports in the old and new Logical
> > Switches */
> >>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> >>> +            struct nbrec_logical_switch_port *new_nbsp =
> > changed_ls->ports[j];
> >>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> >>> +
> >>> +            if (!op) {
> >>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> >>> +                    goto fail;
> >>> +                }
> >>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>> +                                    new_nbsp->name, new_nbsp, od,
NULL,
> >>> +                                    ni->sbrec_mirror_table,
> >>> +                                    ni->sbrec_chassis_table,
> >>> +                                    ni->sbrec_chassis_by_name,
> >>> +                                    ni->sbrec_chassis_by_hostname);
> >>> +                if (!op) {
> >>> +                    goto fail;
> >>> +                }
> >>> +                ovs_list_push_back(&ls_change->added_ports,
> >>> +                                   &op->list);
> >>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> >>> +                /* Existing port updated */
> >>> +                bool temp = false;
> >>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
> >>> +                    !op->lsp_can_be_inc_processed ||
> >>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
> >>> +                    goto fail;
> >>> +                }
> >>> +                const struct sbrec_port_binding *sb = op->sb;
> >>> +                if (sset_contains(&nd->svc_monitor_lsps,
> > new_nbsp->name)) {
> >>> +                    /* This port is used for svc monitor, which may
be
> > impacted
> >>> +                     * by this change. Fallback to recompute. */
> >>> +                    goto fail;
> >>> +                }
> >>> +                ovn_port_destroy(&nd->ls_ports, op);
> >>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>> +                                    new_nbsp->name, new_nbsp, od, sb,
> >>> +                                    ni->sbrec_mirror_table,
> >>> +                                    ni->sbrec_chassis_table,
> >>> +                                    ni->sbrec_chassis_by_name,
> >>> +                                    ni->sbrec_chassis_by_hostname);
> >>> +                if (!op) {
> >>> +                    goto fail;
> >>> +                }
> >>> +                ovs_list_push_back(&ls_change->updated_ports,
> > &op->list);
> >>> +            }
> >>> +            op->visited = true;
> >>> +        }
> >>> +
> >>> +        /* Check for deleted ports */
> >>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> >>> +            if (!op->visited) {
> >>> +                if (!op->lsp_can_be_inc_processed) {
> >>> +                    goto fail;
> >>> +                }
> >>> +                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> >>> +                    /* This port was used for svc monitor, which may
be
> >>> +                     * impacted by this deletion. Fallback to
> > recompute. */
> >>> +                    goto fail;
> >>> +                }
> >>> +                ovs_list_push_back(&ls_change->deleted_ports,
> >>> +                                   &op->list);
> >>> +                hmap_remove(&nd->ls_ports, &op->key_node);
> >>> +                hmap_remove(&od->ports, &op->dp_node);
> >>> +                sbrec_port_binding_delete(op->sb);
>
> Quoting from ovsdb-idl.c:
>
> /* Deletes 'row_' from its table.  May free 'row_', so it must not be
>  * accessed afterward.
>
> So, in theory, we shouldn't call delete here because we might end up
> failing for other ports/datapath and falling back to recompute.
>
> As far as I can tell, only newly inserted rows may be deleted by the
> current IDL implementation but I don't think it's safe to assume this
> will never change in the future.

In general, when falling back to recompute in the middle of I-P, the
recompute function will do a full sync between desired states and existing
states. Now the existing states are partially synced during the I-P before
recompute, the recompute function will just see the synced part as "already
in sync" and only apply for the rest of the changes. I think this mechanism
applies for add/update/delete the same way. Here for delete, if the record
is deleted from IDL, it won't be found by the recompute function, so there
is no way to access it again. Did I miss something?

>
> We should probably accumulate all records that need to be deleted and
> delete them at the end of the whole I-P run.  Do we need some I-P infra
> changes?  Maybe users can provide a callback that can be called whenever
> an I-P run has finished?
>
I thought about this approach, but not for the reason above. I would do the
same for all DB changes to avoid directly updating DBs in the I-P engine.
However, the change is going to be quite big, considering how the current
code base depends on the IDL. Changes to IDL are all over the place. In
addition to the potential memory and CPU cost to maintain the delta, there
may be other tricky things to handle, such as:
- Sometimes it is problematic to see the old data (instead of the updated
ones) in the IDL in the same round of I-P. (this has happened in
ovn-controller due to out-of-date IDL index, which was fixed accordingly)
- Merge accumulated changes to the same DB record.

So, I think it is better to avoid such a change unless it is really
necessary.

Thanks,
Han

> What do you think?
>
> Thanks,
> Dumitru
>
Dumitru Ceara June 28, 2023, 9:14 a.m. UTC | #6
On 6/27/23 19:47, Han Zhou wrote:
> On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 6/8/23 07:55, Han Zhou wrote:
>>> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
>>>>
>>>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
>>>>>
>>>>> This patch introduces a change handler for NB logical_switch within
> the
>>>>> 'northd' node. It specifically handles cases where logical switch
> ports
>>>>> in the tracked logical switches are changed (added/updated/deleted).
>>>>> Only regular logical switch ports - which are common for VMs/Pods -
> are
>>>>> addressed. For other scenarios, it reverts to recompute.
>>>>>
>>>>> This update avoids recompute in the northd node (especially the
>>>>> resource-intensive build_ports()) for NB changes of VIF
>>>>> add/update/delete.  However, it does not eliminate the need for lflow
>>>>> recompute.
>>>>>
>>>>> Below are the performance test results simulating an ovn-k8s topology
> of
>>>>> 500 nodes x 50 lsp per node:
>>>>>
>>>>> Before:
>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>>>     ovn-northd completion:          955ms
>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>>>     ovn-northd completion:          919ms
>>>>>
>>>>> After:
>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>>>     ovn-northd completion:          776ms
>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>>>     ovn-northd completion:          773ms
>>>>>
>>>>> Both addition and deletion show ~20% reduction of ovn-northd
> completion
>>>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
>>>>> recompute. Using multithread should show a larger percentage of
>>>>> improvement.
>>>>>
>>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>>>>> Reviewed-by: Ales Musil <amusil@redhat.com>
>>>>
>>>> Few nits below.
>>>>
>>>> Acked-by: Numan Siddique <numans@ovn.org>
>>>>
>>>> Numan
>>>>
>>
>> Hi Han, Numan,
>>
>> First of all, sorry for the late review, I had these patches on my list
>> for a while and only got to them after they were merged.
>>
> 
> Hi Dumitru, thanks for your diligent review and comments are always welcome
> any time.
> 
>>>>
>>>>> ---
>>>>>  lib/ovn-util.c           |  15 ++
>>>>>  lib/ovn-util.h           |   1 +
>>>>>  northd/en-northd.c       | 130 ++++++---
>>>>>  northd/en-northd.h       |   2 +
>>>>>  northd/inc-proc-northd.c |  12 +-
>>>>>  northd/northd.c          | 567
> +++++++++++++++++++++++++++++++++------
>>>>>  northd/northd.h          |  27 +-
>>>>>  tests/ovn-northd.at      |  58 ++++
>>>>>  8 files changed, 685 insertions(+), 127 deletions(-)
>>>>>
>>
>> [...]
>>
>>>>> +bool
>>>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>>>> +                         const struct northd_input *ni,
>>>>> +                         struct northd_data *nd)
>>>>> +{
>>>>> +    const struct nbrec_logical_switch *changed_ls;
>>>>> +    struct ls_change *ls_change = NULL;
>>>>> +
>>>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
>>>>> +
>>> ni->nbrec_logical_switch_table) {
>>>>> +        ls_change = NULL;
>>>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
>>>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
>>>>> +            goto fail;
>>>>> +        }
>>>>> +        struct ovn_datapath *od = ovn_datapath_find(
>>>>> +                                    &nd->ls_datapaths.datapaths,
>>>>> +                                    &changed_ls->header_.uuid);
>>>>> +        if (!od) {
>>>>> +            static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(1,
>>> 1);
>>>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
>>> doesn't "
>>>>> +                         "exist in ls_datapaths: "UUID_FMT,
>>>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
>>>>> +            goto fail;
>>>>> +        }
>>>>> +
>>>>> +        /* Now only able to handle lsp changes. */
>>>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
>>>>> +            goto fail;
>>>>> +        }
>>>>> +
>>>>> +        ls_change = xzalloc(sizeof *ls_change);
>>>>> +        ls_change->od = od;
>>>>> +        ovs_list_init(&ls_change->added_ports);
>>>>> +        ovs_list_init(&ls_change->deleted_ports);
>>>>> +        ovs_list_init(&ls_change->updated_ports);
>>>>> +
>>>>> +        struct ovn_port *op;
>>>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>>>> +            op->visited = false;
>>>>> +        }
>>>>> +
>>>>> +        /* Compare the individual ports in the old and new Logical
>>> Switches */
>>>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
>>>>> +            struct nbrec_logical_switch_port *new_nbsp =
>>> changed_ls->ports[j];
>>>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
>>>>> +
>>>>> +            if (!op) {
>>>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>>>> +                                    new_nbsp->name, new_nbsp, od,
> NULL,
>>>>> +                                    ni->sbrec_mirror_table,
>>>>> +                                    ni->sbrec_chassis_table,
>>>>> +                                    ni->sbrec_chassis_by_name,
>>>>> +                                    ni->sbrec_chassis_by_hostname);
>>>>> +                if (!op) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                ovs_list_push_back(&ls_change->added_ports,
>>>>> +                                   &op->list);
>>>>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
>>>>> +                /* Existing port updated */
>>>>> +                bool temp = false;
>>>>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
>>>>> +                    !op->lsp_can_be_inc_processed ||
>>>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                const struct sbrec_port_binding *sb = op->sb;
>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
>>> new_nbsp->name)) {
>>>>> +                    /* This port is used for svc monitor, which may
> be
>>> impacted
>>>>> +                     * by this change. Fallback to recompute. */
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                ovn_port_destroy(&nd->ls_ports, op);
>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>>>> +                                    new_nbsp->name, new_nbsp, od, sb,
>>>>> +                                    ni->sbrec_mirror_table,
>>>>> +                                    ni->sbrec_chassis_table,
>>>>> +                                    ni->sbrec_chassis_by_name,
>>>>> +                                    ni->sbrec_chassis_by_hostname);
>>>>> +                if (!op) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                ovs_list_push_back(&ls_change->updated_ports,
>>> &op->list);
>>>>> +            }
>>>>> +            op->visited = true;
>>>>> +        }
>>>>> +
>>>>> +        /* Check for deleted ports */
>>>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
>>>>> +            if (!op->visited) {
>>>>> +                if (!op->lsp_can_be_inc_processed) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
>>>>> +                    /* This port was used for svc monitor, which may
> be
>>>>> +                     * impacted by this deletion. Fallback to
>>> recompute. */
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                ovs_list_push_back(&ls_change->deleted_ports,
>>>>> +                                   &op->list);
>>>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
>>>>> +                hmap_remove(&od->ports, &op->dp_node);
>>>>> +                sbrec_port_binding_delete(op->sb);
>>
>> Quoting from ovsdb-idl.c:
>>
>> /* Deletes 'row_' from its table.  May free 'row_', so it must not be
>>  * accessed afterward.
>>
>> So, in theory, we shouldn't call delete here because we might end up
>> failing for other ports/datapath and falling back to recompute.
>>
>> As far as I can tell, only newly inserted rows may be deleted by the
>> current IDL implementation but I don't think it's safe to assume this
>> will never change in the future.
> 
> In general, when falling back to recompute in the middle of I-P, the
> recompute function will do a full sync between desired states and existing
> states. Now the existing states are partially synced during the I-P before
> recompute, the recompute function will just see the synced part as "already
> in sync" and only apply for the rest of the changes. I think this mechanism
> applies for add/update/delete the same way. Here for delete, if the record
> is deleted from IDL, it won't be found by the recompute function, so there
> is no way to access it again. Did I miss something?
> 

You're right, the ports we delete we first remove from the
datapath->ports hmap.  On the other hand, I think I spotted another
leak.

Deleted ports get pushed back to &ls_change->deleted_ports:

        /* Check for deleted ports */
        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
            if (!op->visited) {
                if (!op->lsp_can_be_inc_processed) {
                    goto fail;
                }
                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
                    /* This port was used for svc monitor, which may be
                     * impacted by this deletion. Fallback to recompute. */
                    goto fail;
                }
                ovs_list_push_back(&ls_change->deleted_ports,
                                   &op->list);
                hmap_remove(&nd->ls_ports, &op->key_node);
                hmap_remove(&od->ports, &op->dp_node);
                sbrec_port_binding_delete(op->sb);
                delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port, od->tunnel_key,
                                 op->tunnel_key);
            }
        }

[...]

fail:
    free(ls_change);

If we deleted some ports and then got out of the loop early then here we
leak the ports stored in ls_change->deleted_ports.  Or am I missing
something?

>>
>> We should probably accumulate all records that need to be deleted and
>> delete them at the end of the whole I-P run.  Do we need some I-P infra
>> changes?  Maybe users can provide a callback that can be called whenever
>> an I-P run has finished?
>>
> I thought about this approach, but not for the reason above. I would do the
> same for all DB changes to avoid directly updating DBs in the I-P engine.
> However, the change is going to be quite big, considering how the current
> code base depends on the IDL. Changes to IDL are all over the place. In
> addition to the potential memory and CPU cost to maintain the delta, there
> may be other tricky things to handle, such as:
> - Sometimes it is problematic to see the old data (instead of the updated
> ones) in the IDL in the same round of I-P. (this has happened in
> ovn-controller due to out-of-date IDL index, which was fixed accordingly)
> - Merge accumulated changes to the same DB record.
> 

This is inline with an older proposal Mark had about abstracting out
the IDL, I think.  I agree it's a huge thing to do.  On the other hand,
the current IDL interactions that sometimes happen during I-P handlers
and sometimes during recomputes worry me a bit too.  It's hard to ensure
that we don't introduce tricky bugs.

Regards,
Dumitru

> So, I think it is better to avoid such a change unless it is really
> necessary.
> 
> Thanks,
> Han
> 
>> What do you think?
>>
>> Thanks,
>> Dumitru
>>
>
Han Zhou June 28, 2023, 5:28 p.m. UTC | #7
On Wed, Jun 28, 2023 at 2:14 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 6/27/23 19:47, Han Zhou wrote:
> > On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >>
> >> On 6/8/23 07:55, Han Zhou wrote:
> >>> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
> >>>>
> >>>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
> >>>>>
> >>>>> This patch introduces a change handler for NB logical_switch within
> > the
> >>>>> 'northd' node. It specifically handles cases where logical switch
> > ports
> >>>>> in the tracked logical switches are changed (added/updated/deleted).
> >>>>> Only regular logical switch ports - which are common for VMs/Pods -
> > are
> >>>>> addressed. For other scenarios, it reverts to recompute.
> >>>>>
> >>>>> This update avoids recompute in the northd node (especially the
> >>>>> resource-intensive build_ports()) for NB changes of VIF
> >>>>> add/update/delete.  However, it does not eliminate the need for
lflow
> >>>>> recompute.
> >>>>>
> >>>>> Below are the performance test results simulating an ovn-k8s
topology
> > of
> >>>>> 500 nodes x 50 lsp per node:
> >>>>>
> >>>>> Before:
> >>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>>>     ovn-northd completion:          955ms
> >>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
lsp_1_0001_01
> >>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> >>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>>>     ovn-northd completion:          919ms
> >>>>>
> >>>>> After:
> >>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>>>     ovn-northd completion:          776ms
> >>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
lsp_1_0001_01
> >>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> >>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>>>     ovn-northd completion:          773ms
> >>>>>
> >>>>> Both addition and deletion show ~20% reduction of ovn-northd
> > completion
> >>>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
> >>>>> recompute. Using multithread should show a larger percentage of
> >>>>> improvement.
> >>>>>
> >>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >>>>> Reviewed-by: Ales Musil <amusil@redhat.com>
> >>>>
> >>>> Few nits below.
> >>>>
> >>>> Acked-by: Numan Siddique <numans@ovn.org>
> >>>>
> >>>> Numan
> >>>>
> >>
> >> Hi Han, Numan,
> >>
> >> First of all, sorry for the late review, I had these patches on my list
> >> for a while and only got to them after they were merged.
> >>
> >
> > Hi Dumitru, thanks for your diligent review and comments are always
welcome
> > any time.
> >
> >>>>
> >>>>> ---
> >>>>>  lib/ovn-util.c           |  15 ++
> >>>>>  lib/ovn-util.h           |   1 +
> >>>>>  northd/en-northd.c       | 130 ++++++---
> >>>>>  northd/en-northd.h       |   2 +
> >>>>>  northd/inc-proc-northd.c |  12 +-
> >>>>>  northd/northd.c          | 567
> > +++++++++++++++++++++++++++++++++------
> >>>>>  northd/northd.h          |  27 +-
> >>>>>  tests/ovn-northd.at      |  58 ++++
> >>>>>  8 files changed, 685 insertions(+), 127 deletions(-)
> >>>>>
> >>
> >> [...]
> >>
> >>>>> +bool
> >>>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>>>> +                         const struct northd_input *ni,
> >>>>> +                         struct northd_data *nd)
> >>>>> +{
> >>>>> +    const struct nbrec_logical_switch *changed_ls;
> >>>>> +    struct ls_change *ls_change = NULL;
> >>>>> +
> >>>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> >>>>> +
> >>> ni->nbrec_logical_switch_table) {
> >>>>> +        ls_change = NULL;
> >>>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
> >>>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
> >>>>> +            goto fail;
> >>>>> +        }
> >>>>> +        struct ovn_datapath *od = ovn_datapath_find(
> >>>>> +                                    &nd->ls_datapaths.datapaths,
> >>>>> +                                    &changed_ls->header_.uuid);
> >>>>> +        if (!od) {
> >>>>> +            static struct vlog_rate_limit rl =
> > VLOG_RATE_LIMIT_INIT(1,
> >>> 1);
> >>>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
> >>> doesn't "
> >>>>> +                         "exist in ls_datapaths: "UUID_FMT,
> >>>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
> >>>>> +            goto fail;
> >>>>> +        }
> >>>>> +
> >>>>> +        /* Now only able to handle lsp changes. */
> >>>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> >>>>> +            goto fail;
> >>>>> +        }
> >>>>> +
> >>>>> +        ls_change = xzalloc(sizeof *ls_change);
> >>>>> +        ls_change->od = od;
> >>>>> +        ovs_list_init(&ls_change->added_ports);
> >>>>> +        ovs_list_init(&ls_change->deleted_ports);
> >>>>> +        ovs_list_init(&ls_change->updated_ports);
> >>>>> +
> >>>>> +        struct ovn_port *op;
> >>>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >>>>> +            op->visited = false;
> >>>>> +        }
> >>>>> +
> >>>>> +        /* Compare the individual ports in the old and new Logical
> >>> Switches */
> >>>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> >>>>> +            struct nbrec_logical_switch_port *new_nbsp =
> >>> changed_ls->ports[j];
> >>>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> >>>>> +
> >>>>> +            if (!op) {
> >>>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>>>> +                                    new_nbsp->name, new_nbsp, od,
> > NULL,
> >>>>> +                                    ni->sbrec_mirror_table,
> >>>>> +                                    ni->sbrec_chassis_table,
> >>>>> +                                    ni->sbrec_chassis_by_name,
> >>>>> +                                    ni->sbrec_chassis_by_hostname);
> >>>>> +                if (!op) {
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                ovs_list_push_back(&ls_change->added_ports,
> >>>>> +                                   &op->list);
> >>>>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> >>>>> +                /* Existing port updated */
> >>>>> +                bool temp = false;
> >>>>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
> >>>>> +                    !op->lsp_can_be_inc_processed ||
> >>>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                const struct sbrec_port_binding *sb = op->sb;
> >>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
> >>> new_nbsp->name)) {
> >>>>> +                    /* This port is used for svc monitor, which may
> > be
> >>> impacted
> >>>>> +                     * by this change. Fallback to recompute. */
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                ovn_port_destroy(&nd->ls_ports, op);
> >>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>>>> +                                    new_nbsp->name, new_nbsp, od,
sb,
> >>>>> +                                    ni->sbrec_mirror_table,
> >>>>> +                                    ni->sbrec_chassis_table,
> >>>>> +                                    ni->sbrec_chassis_by_name,
> >>>>> +                                    ni->sbrec_chassis_by_hostname);
> >>>>> +                if (!op) {
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                ovs_list_push_back(&ls_change->updated_ports,
> >>> &op->list);
> >>>>> +            }
> >>>>> +            op->visited = true;
> >>>>> +        }
> >>>>> +
> >>>>> +        /* Check for deleted ports */
> >>>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> >>>>> +            if (!op->visited) {
> >>>>> +                if (!op->lsp_can_be_inc_processed) {
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                if (sset_contains(&nd->svc_monitor_lsps, op->key))
{
> >>>>> +                    /* This port was used for svc monitor, which
may
> > be
> >>>>> +                     * impacted by this deletion. Fallback to
> >>> recompute. */
> >>>>> +                    goto fail;
> >>>>> +                }
> >>>>> +                ovs_list_push_back(&ls_change->deleted_ports,
> >>>>> +                                   &op->list);
> >>>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
> >>>>> +                hmap_remove(&od->ports, &op->dp_node);
> >>>>> +                sbrec_port_binding_delete(op->sb);
> >>
> >> Quoting from ovsdb-idl.c:
> >>
> >> /* Deletes 'row_' from its table.  May free 'row_', so it must not be
> >>  * accessed afterward.
> >>
> >> So, in theory, we shouldn't call delete here because we might end up
> >> failing for other ports/datapath and falling back to recompute.
> >>
> >> As far as I can tell, only newly inserted rows may be deleted by the
> >> current IDL implementation but I don't think it's safe to assume this
> >> will never change in the future.
> >
> > In general, when falling back to recompute in the middle of I-P, the
> > recompute function will do a full sync between desired states and
existing
> > states. Now the existing states are partially synced during the I-P
before
> > recompute, the recompute function will just see the synced part as
"already
> > in sync" and only apply for the rest of the changes. I think this
mechanism
> > applies for add/update/delete the same way. Here for delete, if the
record
> > is deleted from IDL, it won't be found by the recompute function, so
there
> > is no way to access it again. Did I miss something?
> >
>
> You're right, the ports we delete we first remove from the
> datapath->ports hmap.  On the other hand, I think I spotted another
> leak.
>
> Deleted ports get pushed back to &ls_change->deleted_ports:
>
>         /* Check for deleted ports */
>         HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
>             if (!op->visited) {
>                 if (!op->lsp_can_be_inc_processed) {
>                     goto fail;
>                 }
>                 if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
>                     /* This port was used for svc monitor, which may be
>                      * impacted by this deletion. Fallback to recompute.
*/
>                     goto fail;
>                 }
>                 ovs_list_push_back(&ls_change->deleted_ports,
>                                    &op->list);
>                 hmap_remove(&nd->ls_ports, &op->key_node);
>                 hmap_remove(&od->ports, &op->dp_node);
>                 sbrec_port_binding_delete(op->sb);
>                 delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port,
od->tunnel_key,
>                                  op->tunnel_key);
>             }
>         }
>
> [...]
>
> fail:
>     free(ls_change);
>
> If we deleted some ports and then got out of the loop early then here we
> leak the ports stored in ls_change->deleted_ports.  Or am I missing
> something?

The next line under fail:, after free(ls_change), is:
    destroy_northd_data_tracked_changes(nd);

The deleted ports will be destroyed here. Does this address your concern?

Thanks,
Han

>
> >>
> >> We should probably accumulate all records that need to be deleted and
> >> delete them at the end of the whole I-P run.  Do we need some I-P infra
> >> changes?  Maybe users can provide a callback that can be called
whenever
> >> an I-P run has finished?
> >>
> > I thought about this approach, but not for the reason above. I would do
the
> > same for all DB changes to avoid directly updating DBs in the I-P
engine.
> > However, the change is going to be quite big, considering how the
current
> > code base depends on the IDL. Changes to IDL are all over the place. In
> > addition to the potential memory and CPU cost to maintain the delta,
there
> > may be other tricky things to handle, such as:
> > - Sometimes it is problematic to see the old data (instead of the
updated
> > ones) in the IDL in the same round of I-P. (this has happened in
> > ovn-controller due to out-of-date IDL index, which was fixed
accordingly)
> > - Merge accumulated changes to the same DB record.
> >
>
> This is inline with an older proposal Mark had about abstracting out
> the IDL, I think.  I agree it's a huge thing to do.  On the other hand,
> the current IDL interactions that sometimes happen during I-P handlers
> and sometimes during recomputes worry me a bit too.  It's hard to ensure
> that we don't introduce tricky bugs.
>
> Regards,
> Dumitru
>
> > So, I think it is better to avoid such a change unless it is really
> > necessary.
> >
> > Thanks,
> > Han
> >
> >> What do you think?
> >>
> >> Thanks,
> >> Dumitru
> >>
> >
>
Dumitru Ceara June 28, 2023, 7:06 p.m. UTC | #8
On 6/28/23 19:28, Han Zhou wrote:
> On Wed, Jun 28, 2023 at 2:14 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 6/27/23 19:47, Han Zhou wrote:
>>> On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>
>>>> On 6/8/23 07:55, Han Zhou wrote:
>>>>> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org> wrote:
>>>>>>
>>>>>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
>>>>>>>
>>>>>>> This patch introduces a change handler for NB logical_switch within
>>> the
>>>>>>> 'northd' node. It specifically handles cases where logical switch
>>> ports
>>>>>>> in the tracked logical switches are changed (added/updated/deleted).
>>>>>>> Only regular logical switch ports - which are common for VMs/Pods -
>>> are
>>>>>>> addressed. For other scenarios, it reverts to recompute.
>>>>>>>
>>>>>>> This update avoids recompute in the northd node (especially the
>>>>>>> resource-intensive build_ports()) for NB changes of VIF
>>>>>>> add/update/delete.  However, it does not eliminate the need for
> lflow
>>>>>>> recompute.
>>>>>>>
>>>>>>> Below are the performance test results simulating an ovn-k8s
> topology
>>> of
>>>>>>> 500 nodes x 50 lsp per node:
>>>>>>>
>>>>>>> Before:
>>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>>>>>     ovn-northd completion:          955ms
>>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> lsp_1_0001_01
>>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
>>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>>>>>     ovn-northd completion:          919ms
>>>>>>>
>>>>>>> After:
>>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
>>>>>>>     ovn-northd completion:          776ms
>>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> lsp_1_0001_01
>>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
>>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
>>>>>>>     ovn-northd completion:          773ms
>>>>>>>
>>>>>>> Both addition and deletion show ~20% reduction of ovn-northd
>>> completion
>>>>>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
>>>>>>> recompute. Using multithread should show a larger percentage of
>>>>>>> improvement.
>>>>>>>
>>>>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>>>>>>> Reviewed-by: Ales Musil <amusil@redhat.com>
>>>>>>
>>>>>> Few nits below.
>>>>>>
>>>>>> Acked-by: Numan Siddique <numans@ovn.org>
>>>>>>
>>>>>> Numan
>>>>>>
>>>>
>>>> Hi Han, Numan,
>>>>
>>>> First of all, sorry for the late review, I had these patches on my list
>>>> for a while and only got to them after they were merged.
>>>>
>>>
>>> Hi Dumitru, thanks for your diligent review and comments are always
> welcome
>>> any time.
>>>
>>>>>>
>>>>>>> ---
>>>>>>>  lib/ovn-util.c           |  15 ++
>>>>>>>  lib/ovn-util.h           |   1 +
>>>>>>>  northd/en-northd.c       | 130 ++++++---
>>>>>>>  northd/en-northd.h       |   2 +
>>>>>>>  northd/inc-proc-northd.c |  12 +-
>>>>>>>  northd/northd.c          | 567
>>> +++++++++++++++++++++++++++++++++------
>>>>>>>  northd/northd.h          |  27 +-
>>>>>>>  tests/ovn-northd.at      |  58 ++++
>>>>>>>  8 files changed, 685 insertions(+), 127 deletions(-)
>>>>>>>
>>>>
>>>> [...]
>>>>
>>>>>>> +bool
>>>>>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>>>>>> +                         const struct northd_input *ni,
>>>>>>> +                         struct northd_data *nd)
>>>>>>> +{
>>>>>>> +    const struct nbrec_logical_switch *changed_ls;
>>>>>>> +    struct ls_change *ls_change = NULL;
>>>>>>> +
>>>>>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
>>>>>>> +
>>>>> ni->nbrec_logical_switch_table) {
>>>>>>> +        ls_change = NULL;
>>>>>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
>>>>>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
>>>>>>> +            goto fail;
>>>>>>> +        }
>>>>>>> +        struct ovn_datapath *od = ovn_datapath_find(
>>>>>>> +                                    &nd->ls_datapaths.datapaths,
>>>>>>> +                                    &changed_ls->header_.uuid);
>>>>>>> +        if (!od) {
>>>>>>> +            static struct vlog_rate_limit rl =
>>> VLOG_RATE_LIMIT_INIT(1,
>>>>> 1);
>>>>>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
>>>>> doesn't "
>>>>>>> +                         "exist in ls_datapaths: "UUID_FMT,
>>>>>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
>>>>>>> +            goto fail;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* Now only able to handle lsp changes. */
>>>>>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
>>>>>>> +            goto fail;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        ls_change = xzalloc(sizeof *ls_change);
>>>>>>> +        ls_change->od = od;
>>>>>>> +        ovs_list_init(&ls_change->added_ports);
>>>>>>> +        ovs_list_init(&ls_change->deleted_ports);
>>>>>>> +        ovs_list_init(&ls_change->updated_ports);
>>>>>>> +
>>>>>>> +        struct ovn_port *op;
>>>>>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
>>>>>>> +            op->visited = false;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* Compare the individual ports in the old and new Logical
>>>>> Switches */
>>>>>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
>>>>>>> +            struct nbrec_logical_switch_port *new_nbsp =
>>>>> changed_ls->ports[j];
>>>>>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
>>>>>>> +
>>>>>>> +            if (!op) {
>>>>>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>>>>>> +                                    new_nbsp->name, new_nbsp, od,
>>> NULL,
>>>>>>> +                                    ni->sbrec_mirror_table,
>>>>>>> +                                    ni->sbrec_chassis_table,
>>>>>>> +                                    ni->sbrec_chassis_by_name,
>>>>>>> +                                    ni->sbrec_chassis_by_hostname);
>>>>>>> +                if (!op) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                ovs_list_push_back(&ls_change->added_ports,
>>>>>>> +                                   &op->list);
>>>>>>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
>>>>>>> +                /* Existing port updated */
>>>>>>> +                bool temp = false;
>>>>>>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
>>>>>>> +                    !op->lsp_can_be_inc_processed ||
>>>>>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                const struct sbrec_port_binding *sb = op->sb;
>>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
>>>>> new_nbsp->name)) {
>>>>>>> +                    /* This port is used for svc monitor, which may
>>> be
>>>>> impacted
>>>>>>> +                     * by this change. Fallback to recompute. */
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                ovn_port_destroy(&nd->ls_ports, op);
>>>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
>>>>>>> +                                    new_nbsp->name, new_nbsp, od,
> sb,
>>>>>>> +                                    ni->sbrec_mirror_table,
>>>>>>> +                                    ni->sbrec_chassis_table,
>>>>>>> +                                    ni->sbrec_chassis_by_name,
>>>>>>> +                                    ni->sbrec_chassis_by_hostname);
>>>>>>> +                if (!op) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                ovs_list_push_back(&ls_change->updated_ports,
>>>>> &op->list);
>>>>>>> +            }
>>>>>>> +            op->visited = true;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        /* Check for deleted ports */
>>>>>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
>>>>>>> +            if (!op->visited) {
>>>>>>> +                if (!op->lsp_can_be_inc_processed) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps, op->key))
> {
>>>>>>> +                    /* This port was used for svc monitor, which
> may
>>> be
>>>>>>> +                     * impacted by this deletion. Fallback to
>>>>> recompute. */
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                ovs_list_push_back(&ls_change->deleted_ports,
>>>>>>> +                                   &op->list);
>>>>>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
>>>>>>> +                hmap_remove(&od->ports, &op->dp_node);
>>>>>>> +                sbrec_port_binding_delete(op->sb);
>>>>
>>>> Quoting from ovsdb-idl.c:
>>>>
>>>> /* Deletes 'row_' from its table.  May free 'row_', so it must not be
>>>>  * accessed afterward.
>>>>
>>>> So, in theory, we shouldn't call delete here because we might end up
>>>> failing for other ports/datapath and falling back to recompute.
>>>>
>>>> As far as I can tell, only newly inserted rows may be deleted by the
>>>> current IDL implementation but I don't think it's safe to assume this
>>>> will never change in the future.
>>>
>>> In general, when falling back to recompute in the middle of I-P, the
>>> recompute function will do a full sync between desired states and
> existing
>>> states. Now the existing states are partially synced during the I-P
> before
>>> recompute, the recompute function will just see the synced part as
> "already
>>> in sync" and only apply for the rest of the changes. I think this
> mechanism
>>> applies for add/update/delete the same way. Here for delete, if the
> record
>>> is deleted from IDL, it won't be found by the recompute function, so
> there
>>> is no way to access it again. Did I miss something?
>>>
>>
>> You're right, the ports we delete we first remove from the
>> datapath->ports hmap.  On the other hand, I think I spotted another
>> leak.
>>
>> Deleted ports get pushed back to &ls_change->deleted_ports:
>>
>>         /* Check for deleted ports */
>>         HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
>>             if (!op->visited) {
>>                 if (!op->lsp_can_be_inc_processed) {
>>                     goto fail;
>>                 }
>>                 if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
>>                     /* This port was used for svc monitor, which may be
>>                      * impacted by this deletion. Fallback to recompute.
> */
>>                     goto fail;
>>                 }
>>                 ovs_list_push_back(&ls_change->deleted_ports,
>>                                    &op->list);
>>                 hmap_remove(&nd->ls_ports, &op->key_node);
>>                 hmap_remove(&od->ports, &op->dp_node);
>>                 sbrec_port_binding_delete(op->sb);
>>                 delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port,
> od->tunnel_key,
>>                                  op->tunnel_key);
>>             }
>>         }
>>
>> [...]
>>
>> fail:
>>     free(ls_change);
>>
>> If we deleted some ports and then got out of the loop early then here we
>> leak the ports stored in ls_change->deleted_ports.  Or am I missing
>> something?
> 
> The next line under fail:, after free(ls_change), is:
>     destroy_northd_data_tracked_changes(nd);
> 
> The deleted ports will be destroyed here. Does this address your concern?
> 

Not really, let me highlight the code I mean:

https://github.com/ovn-org/ovn/blob/056f66e1db48c6d098ce3607ff596c1e3ce28dbb/northd/northd.c#L5132-L5171

Now, if there are a couple of ports being deleted at the same time and
the last one of them is used for svc monitor, then we remove these ports
from the od's "ports" map and from "nd->ls_ports" and we add them to the
"ls_change->deleted_ports" list (lines L5143-L5146).  These ports are
not referenced by anything else anymore except for
"ls_change->deleted_ports".  Also, they're not yet added to the northd
data "nd".  For the last one we break early because it's used for svc
monitoring.

We end up here:

fail:
    free(ls_change);
    destroy_northd_data_tracked_changes(nd);
    return false;

The ports that were in "ls_change->deleted_ports" were not completely
destroyed and are not reachable anymore so they're leaked.  Am I maybe
missing a different cleanup path?

Regards,
Dumitru

> Thanks,
> Han
> 
>>
>>>>
>>>> We should probably accumulate all records that need to be deleted and
>>>> delete them at the end of the whole I-P run.  Do we need some I-P infra
>>>> changes?  Maybe users can provide a callback that can be called
> whenever
>>>> an I-P run has finished?
>>>>
>>> I thought about this approach, but not for the reason above. I would do
> the
>>> same for all DB changes to avoid directly updating DBs in the I-P
> engine.
>>> However, the change is going to be quite big, considering how the
> current
>>> code base depends on the IDL. Changes to IDL are all over the place. In
>>> addition to the potential memory and CPU cost to maintain the delta,
> there
>>> may be other tricky things to handle, such as:
>>> - Sometimes it is problematic to see the old data (instead of the
> updated
>>> ones) in the IDL in the same round of I-P. (this has happened in
>>> ovn-controller due to out-of-date IDL index, which was fixed
> accordingly)
>>> - Merge accumulated changes to the same DB record.
>>>
>>
>> This is inline with an older proposal Mark had about abstracting out
>> the IDL, I think.  I agree it's a huge thing to do.  On the other hand,
>> the current IDL interactions that sometimes happen during I-P handlers
>> and sometimes during recomputes worry me a bit too.  It's hard to ensure
>> that we don't introduce tricky bugs.
>>
>> Regards,
>> Dumitru
>>
>>> So, I think it is better to avoid such a change unless it is really
>>> necessary.
>>>
>>> Thanks,
>>> Han
>>>
>>>> What do you think?
>>>>
>>>> Thanks,
>>>> Dumitru
>>>>
>>>
>>
>
Han Zhou June 28, 2023, 8:17 p.m. UTC | #9
On Wed, Jun 28, 2023 at 12:07 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 6/28/23 19:28, Han Zhou wrote:
> > On Wed, Jun 28, 2023 at 2:14 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >>
> >> On 6/27/23 19:47, Han Zhou wrote:
> >>> On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com>
wrote:
> >>>>
> >>>> On 6/8/23 07:55, Han Zhou wrote:
> >>>>> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org>
wrote:
> >>>>>>
> >>>>>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
> >>>>>>>
> >>>>>>> This patch introduces a change handler for NB logical_switch
within
> >>> the
> >>>>>>> 'northd' node. It specifically handles cases where logical switch
> >>> ports
> >>>>>>> in the tracked logical switches are changed
(added/updated/deleted).
> >>>>>>> Only regular logical switch ports - which are common for VMs/Pods
-
> >>> are
> >>>>>>> addressed. For other scenarios, it reverts to recompute.
> >>>>>>>
> >>>>>>> This update avoids recompute in the northd node (especially the
> >>>>>>> resource-intensive build_ports()) for NB changes of VIF
> >>>>>>> add/update/delete.  However, it does not eliminate the need for
> > lflow
> >>>>>>> recompute.
> >>>>>>>
> >>>>>>> Below are the performance test results simulating an ovn-k8s
> > topology
> >>> of
> >>>>>>> 500 nodes x 50 lsp per node:
> >>>>>>>
> >>>>>>> Before:
> >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>>>>>     ovn-northd completion:          955ms
> >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> > lsp_1_0001_01
> >>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> >>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>>>>>     ovn-northd completion:          919ms
> >>>>>>>
> >>>>>>> After:
> >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>>>>>>     ovn-northd completion:          776ms
> >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> > lsp_1_0001_01
> >>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> >>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>>>>>>     ovn-northd completion:          773ms
> >>>>>>>
> >>>>>>> Both addition and deletion show ~20% reduction of ovn-northd
> >>> completion
> >>>>>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
> >>>>>>> recompute. Using multithread should show a larger percentage of
> >>>>>>> improvement.
> >>>>>>>
> >>>>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >>>>>>> Reviewed-by: Ales Musil <amusil@redhat.com>
> >>>>>>
> >>>>>> Few nits below.
> >>>>>>
> >>>>>> Acked-by: Numan Siddique <numans@ovn.org>
> >>>>>>
> >>>>>> Numan
> >>>>>>
> >>>>
> >>>> Hi Han, Numan,
> >>>>
> >>>> First of all, sorry for the late review, I had these patches on my
list
> >>>> for a while and only got to them after they were merged.
> >>>>
> >>>
> >>> Hi Dumitru, thanks for your diligent review and comments are always
> > welcome
> >>> any time.
> >>>
> >>>>>>
> >>>>>>> ---
> >>>>>>>  lib/ovn-util.c           |  15 ++
> >>>>>>>  lib/ovn-util.h           |   1 +
> >>>>>>>  northd/en-northd.c       | 130 ++++++---
> >>>>>>>  northd/en-northd.h       |   2 +
> >>>>>>>  northd/inc-proc-northd.c |  12 +-
> >>>>>>>  northd/northd.c          | 567
> >>> +++++++++++++++++++++++++++++++++------
> >>>>>>>  northd/northd.h          |  27 +-
> >>>>>>>  tests/ovn-northd.at      |  58 ++++
> >>>>>>>  8 files changed, 685 insertions(+), 127 deletions(-)
> >>>>>>>
> >>>>
> >>>> [...]
> >>>>
> >>>>>>> +bool
> >>>>>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>>>>>> +                         const struct northd_input *ni,
> >>>>>>> +                         struct northd_data *nd)
> >>>>>>> +{
> >>>>>>> +    const struct nbrec_logical_switch *changed_ls;
> >>>>>>> +    struct ls_change *ls_change = NULL;
> >>>>>>> +
> >>>>>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> >>>>>>> +
> >>>>> ni->nbrec_logical_switch_table) {
> >>>>>>> +        ls_change = NULL;
> >>>>>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
> >>>>>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
> >>>>>>> +            goto fail;
> >>>>>>> +        }
> >>>>>>> +        struct ovn_datapath *od = ovn_datapath_find(
> >>>>>>> +                                    &nd->ls_datapaths.datapaths,
> >>>>>>> +                                    &changed_ls->header_.uuid);
> >>>>>>> +        if (!od) {
> >>>>>>> +            static struct vlog_rate_limit rl =
> >>> VLOG_RATE_LIMIT_INIT(1,
> >>>>> 1);
> >>>>>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated
LS
> >>>>> doesn't "
> >>>>>>> +                         "exist in ls_datapaths: "UUID_FMT,
> >>>>>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
> >>>>>>> +            goto fail;
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        /* Now only able to handle lsp changes. */
> >>>>>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> >>>>>>> +            goto fail;
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        ls_change = xzalloc(sizeof *ls_change);
> >>>>>>> +        ls_change->od = od;
> >>>>>>> +        ovs_list_init(&ls_change->added_ports);
> >>>>>>> +        ovs_list_init(&ls_change->deleted_ports);
> >>>>>>> +        ovs_list_init(&ls_change->updated_ports);
> >>>>>>> +
> >>>>>>> +        struct ovn_port *op;
> >>>>>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >>>>>>> +            op->visited = false;
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        /* Compare the individual ports in the old and new
Logical
> >>>>> Switches */
> >>>>>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> >>>>>>> +            struct nbrec_logical_switch_port *new_nbsp =
> >>>>> changed_ls->ports[j];
> >>>>>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> >>>>>>> +
> >>>>>>> +            if (!op) {
> >>>>>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>>>>>> +                                    new_nbsp->name, new_nbsp, od,
> >>> NULL,
> >>>>>>> +                                    ni->sbrec_mirror_table,
> >>>>>>> +                                    ni->sbrec_chassis_table,
> >>>>>>> +                                    ni->sbrec_chassis_by_name,
> >>>>>>> +
 ni->sbrec_chassis_by_hostname);
> >>>>>>> +                if (!op) {
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                ovs_list_push_back(&ls_change->added_ports,
> >>>>>>> +                                   &op->list);
> >>>>>>> +            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> >>>>>>> +                /* Existing port updated */
> >>>>>>> +                bool temp = false;
> >>>>>>> +                if (lsp_is_type_changed(op->sb, new_nbsp, &temp)
||
> >>>>>>> +                    !op->lsp_can_be_inc_processed ||
> >>>>>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                const struct sbrec_port_binding *sb = op->sb;
> >>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
> >>>>> new_nbsp->name)) {
> >>>>>>> +                    /* This port is used for svc monitor, which
may
> >>> be
> >>>>> impacted
> >>>>>>> +                     * by this change. Fallback to recompute. */
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                ovn_port_destroy(&nd->ls_ports, op);
> >>>>>>> +                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>>>>>> +                                    new_nbsp->name, new_nbsp, od,
> > sb,
> >>>>>>> +                                    ni->sbrec_mirror_table,
> >>>>>>> +                                    ni->sbrec_chassis_table,
> >>>>>>> +                                    ni->sbrec_chassis_by_name,
> >>>>>>> +
 ni->sbrec_chassis_by_hostname);
> >>>>>>> +                if (!op) {
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                ovs_list_push_back(&ls_change->updated_ports,
> >>>>> &op->list);
> >>>>>>> +            }
> >>>>>>> +            op->visited = true;
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        /* Check for deleted ports */
> >>>>>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> >>>>>>> +            if (!op->visited) {
> >>>>>>> +                if (!op->lsp_can_be_inc_processed) {
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
op->key))
> > {
> >>>>>>> +                    /* This port was used for svc monitor, which
> > may
> >>> be
> >>>>>>> +                     * impacted by this deletion. Fallback to
> >>>>> recompute. */
> >>>>>>> +                    goto fail;
> >>>>>>> +                }
> >>>>>>> +                ovs_list_push_back(&ls_change->deleted_ports,
> >>>>>>> +                                   &op->list);
> >>>>>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
> >>>>>>> +                hmap_remove(&od->ports, &op->dp_node);
> >>>>>>> +                sbrec_port_binding_delete(op->sb);
> >>>>
> >>>> Quoting from ovsdb-idl.c:
> >>>>
> >>>> /* Deletes 'row_' from its table.  May free 'row_', so it must not be
> >>>>  * accessed afterward.
> >>>>
> >>>> So, in theory, we shouldn't call delete here because we might end up
> >>>> failing for other ports/datapath and falling back to recompute.
> >>>>
> >>>> As far as I can tell, only newly inserted rows may be deleted by the
> >>>> current IDL implementation but I don't think it's safe to assume this
> >>>> will never change in the future.
> >>>
> >>> In general, when falling back to recompute in the middle of I-P, the
> >>> recompute function will do a full sync between desired states and
> > existing
> >>> states. Now the existing states are partially synced during the I-P
> > before
> >>> recompute, the recompute function will just see the synced part as
> > "already
> >>> in sync" and only apply for the rest of the changes. I think this
> > mechanism
> >>> applies for add/update/delete the same way. Here for delete, if the
> > record
> >>> is deleted from IDL, it won't be found by the recompute function, so
> > there
> >>> is no way to access it again. Did I miss something?
> >>>
> >>
> >> You're right, the ports we delete we first remove from the
> >> datapath->ports hmap.  On the other hand, I think I spotted another
> >> leak.
> >>
> >> Deleted ports get pushed back to &ls_change->deleted_ports:
> >>
> >>         /* Check for deleted ports */
> >>         HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> >>             if (!op->visited) {
> >>                 if (!op->lsp_can_be_inc_processed) {
> >>                     goto fail;
> >>                 }
> >>                 if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> >>                     /* This port was used for svc monitor, which may be
> >>                      * impacted by this deletion. Fallback to
recompute.
> > */
> >>                     goto fail;
> >>                 }
> >>                 ovs_list_push_back(&ls_change->deleted_ports,
> >>                                    &op->list);
> >>                 hmap_remove(&nd->ls_ports, &op->key_node);
> >>                 hmap_remove(&od->ports, &op->dp_node);
> >>                 sbrec_port_binding_delete(op->sb);
> >>                 delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port,
> > od->tunnel_key,
> >>                                  op->tunnel_key);
> >>             }
> >>         }
> >>
> >> [...]
> >>
> >> fail:
> >>     free(ls_change);
> >>
> >> If we deleted some ports and then got out of the loop early then here
we
> >> leak the ports stored in ls_change->deleted_ports.  Or am I missing
> >> something?
> >
> > The next line under fail:, after free(ls_change), is:
> >     destroy_northd_data_tracked_changes(nd);
> >
> > The deleted ports will be destroyed here. Does this address your
concern?
> >
>
> Not really, let me highlight the code I mean:
>
>
https://github.com/ovn-org/ovn/blob/056f66e1db48c6d098ce3607ff596c1e3ce28dbb/northd/northd.c#L5132-L5171
>
> Now, if there are a couple of ports being deleted at the same time and
> the last one of them is used for svc monitor, then we remove these ports
> from the od's "ports" map and from "nd->ls_ports" and we add them to the
> "ls_change->deleted_ports" list (lines L5143-L5146).  These ports are
> not referenced by anything else anymore except for
> "ls_change->deleted_ports".  Also, they're not yet added to the northd
> data "nd".  For the last one we break early because it's used for svc
> monitoring.
>
> We end up here:
>
> fail:
>     free(ls_change);
>     destroy_northd_data_tracked_changes(nd);
>     return false;
>
> The ports that were in "ls_change->deleted_ports" were not completely
> destroyed and are not reachable anymore so they're leaked.  Am I maybe
> missing a different cleanup path?
>
You are right!! Sorry for missing your point earlier. I will come up with a
fix and a test case to make sure this scenario is covered.
Thanks for pointing this out!
Han

> Regards,
> Dumitru
>
> > Thanks,
> > Han
> >
> >>
> >>>>
> >>>> We should probably accumulate all records that need to be deleted and
> >>>> delete them at the end of the whole I-P run.  Do we need some I-P
infra
> >>>> changes?  Maybe users can provide a callback that can be called
> > whenever
> >>>> an I-P run has finished?
> >>>>
> >>> I thought about this approach, but not for the reason above. I would
do
> > the
> >>> same for all DB changes to avoid directly updating DBs in the I-P
> > engine.
> >>> However, the change is going to be quite big, considering how the
> > current
> >>> code base depends on the IDL. Changes to IDL are all over the place.
In
> >>> addition to the potential memory and CPU cost to maintain the delta,
> > there
> >>> may be other tricky things to handle, such as:
> >>> - Sometimes it is problematic to see the old data (instead of the
> > updated
> >>> ones) in the IDL in the same round of I-P. (this has happened in
> >>> ovn-controller due to out-of-date IDL index, which was fixed
> > accordingly)
> >>> - Merge accumulated changes to the same DB record.
> >>>
> >>
> >> This is inline with an older proposal Mark had about abstracting out
> >> the IDL, I think.  I agree it's a huge thing to do.  On the other hand,
> >> the current IDL interactions that sometimes happen during I-P handlers
> >> and sometimes during recomputes worry me a bit too.  It's hard to
ensure
> >> that we don't introduce tricky bugs.
> >>
> >> Regards,
> >> Dumitru
> >>
> >>> So, I think it is better to avoid such a change unless it is really
> >>> necessary.
> >>>
> >>> Thanks,
> >>> Han
> >>>
> >>>> What do you think?
> >>>>
> >>>> Thanks,
> >>>> Dumitru
> >>>>
> >>>
> >>
> >
>
Han Zhou June 29, 2023, 2:20 a.m. UTC | #10
On Wed, Jun 28, 2023 at 1:17 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Wed, Jun 28, 2023 at 12:07 PM Dumitru Ceara <dceara@redhat.com> wrote:
> >
> > On 6/28/23 19:28, Han Zhou wrote:
> > > On Wed, Jun 28, 2023 at 2:14 AM Dumitru Ceara <dceara@redhat.com>
wrote:
> > >>
> > >> On 6/27/23 19:47, Han Zhou wrote:
> > >>> On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <dceara@redhat.com>
wrote:
> > >>>>
> > >>>> On 6/8/23 07:55, Han Zhou wrote:
> > >>>>> On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <numans@ovn.org>
wrote:
> > >>>>>>
> > >>>>>> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <hzhou@ovn.org> wrote:
> > >>>>>>>
> > >>>>>>> This patch introduces a change handler for NB logical_switch
within
> > >>> the
> > >>>>>>> 'northd' node. It specifically handles cases where logical
switch
> > >>> ports
> > >>>>>>> in the tracked logical switches are changed
(added/updated/deleted).
> > >>>>>>> Only regular logical switch ports - which are common for
VMs/Pods -
> > >>> are
> > >>>>>>> addressed. For other scenarios, it reverts to recompute.
> > >>>>>>>
> > >>>>>>> This update avoids recompute in the northd node (especially the
> > >>>>>>> resource-intensive build_ports()) for NB changes of VIF
> > >>>>>>> add/update/delete.  However, it does not eliminate the need for
> > > lflow
> > >>>>>>> recompute.
> > >>>>>>>
> > >>>>>>> Below are the performance test results simulating an ovn-k8s
> > > topology
> > >>> of
> > >>>>>>> 500 nodes x 50 lsp per node:
> > >>>>>>>
> > >>>>>>> Before:
> > >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> > >>>>>>>     ovn-northd completion:          955ms
> > >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> > > lsp_1_0001_01
> > >>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
--
> > >>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> > >>>>>>>     ovn-northd completion:          919ms
> > >>>>>>>
> > >>>>>>> After:
> > >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> > >>>>>>>     ovn-northd completion:          776ms
> > >>>>>>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001
> > > lsp_1_0001_01
> > >>>>> -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
--
> > >>>>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> > >>>>>>>     ovn-northd completion:          773ms
> > >>>>>>>
> > >>>>>>> Both addition and deletion show ~20% reduction of ovn-northd
> > >>> completion
> > >>>>>>> time.  Note: the test uses only 1 thread of ovn-northd for flow
> > >>>>>>> recompute. Using multithread should show a larger percentage of
> > >>>>>>> improvement.
> > >>>>>>>
> > >>>>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
> > >>>>>>> Reviewed-by: Ales Musil <amusil@redhat.com>
> > >>>>>>
> > >>>>>> Few nits below.
> > >>>>>>
> > >>>>>> Acked-by: Numan Siddique <numans@ovn.org>
> > >>>>>>
> > >>>>>> Numan
> > >>>>>>
> > >>>>
> > >>>> Hi Han, Numan,
> > >>>>
> > >>>> First of all, sorry for the late review, I had these patches on my
list
> > >>>> for a while and only got to them after they were merged.
> > >>>>
> > >>>
> > >>> Hi Dumitru, thanks for your diligent review and comments are always
> > > welcome
> > >>> any time.
> > >>>
> > >>>>>>
> > >>>>>>> ---
> > >>>>>>>  lib/ovn-util.c           |  15 ++
> > >>>>>>>  lib/ovn-util.h           |   1 +
> > >>>>>>>  northd/en-northd.c       | 130 ++++++---
> > >>>>>>>  northd/en-northd.h       |   2 +
> > >>>>>>>  northd/inc-proc-northd.c |  12 +-
> > >>>>>>>  northd/northd.c          | 567
> > >>> +++++++++++++++++++++++++++++++++------
> > >>>>>>>  northd/northd.h          |  27 +-
> > >>>>>>>  tests/ovn-northd.at      |  58 ++++
> > >>>>>>>  8 files changed, 685 insertions(+), 127 deletions(-)
> > >>>>>>>
> > >>>>
> > >>>> [...]
> > >>>>
> > >>>>>>> +bool
> > >>>>>>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> > >>>>>>> +                         const struct northd_input *ni,
> > >>>>>>> +                         struct northd_data *nd)
> > >>>>>>> +{
> > >>>>>>> +    const struct nbrec_logical_switch *changed_ls;
> > >>>>>>> +    struct ls_change *ls_change = NULL;
> > >>>>>>> +
> > >>>>>>> +    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> > >>>>>>> +
> > >>>>> ni->nbrec_logical_switch_table) {
> > >>>>>>> +        ls_change = NULL;
> > >>>>>>> +        if (nbrec_logical_switch_is_new(changed_ls) ||
> > >>>>>>> +            nbrec_logical_switch_is_deleted(changed_ls)) {
> > >>>>>>> +            goto fail;
> > >>>>>>> +        }
> > >>>>>>> +        struct ovn_datapath *od = ovn_datapath_find(
> > >>>>>>> +
 &nd->ls_datapaths.datapaths,
> > >>>>>>> +                                    &changed_ls->header_.uuid);
> > >>>>>>> +        if (!od) {
> > >>>>>>> +            static struct vlog_rate_limit rl =
> > >>> VLOG_RATE_LIMIT_INIT(1,
> > >>>>> 1);
> > >>>>>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked
updated LS
> > >>>>> doesn't "
> > >>>>>>> +                         "exist in ls_datapaths: "UUID_FMT,
> > >>>>>>> +                         UUID_ARGS(&changed_ls->header_.uuid));
> > >>>>>>> +            goto fail;
> > >>>>>>> +        }
> > >>>>>>> +
> > >>>>>>> +        /* Now only able to handle lsp changes. */
> > >>>>>>> +        if (check_ls_changes_other_than_lsp(changed_ls)) {
> > >>>>>>> +            goto fail;
> > >>>>>>> +        }
> > >>>>>>> +
> > >>>>>>> +        ls_change = xzalloc(sizeof *ls_change);
> > >>>>>>> +        ls_change->od = od;
> > >>>>>>> +        ovs_list_init(&ls_change->added_ports);
> > >>>>>>> +        ovs_list_init(&ls_change->deleted_ports);
> > >>>>>>> +        ovs_list_init(&ls_change->updated_ports);
> > >>>>>>> +
> > >>>>>>> +        struct ovn_port *op;
> > >>>>>>> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> > >>>>>>> +            op->visited = false;
> > >>>>>>> +        }
> > >>>>>>> +
> > >>>>>>> +        /* Compare the individual ports in the old and new
Logical
> > >>>>> Switches */
> > >>>>>>> +        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> > >>>>>>> +            struct nbrec_logical_switch_port *new_nbsp =
> > >>>>> changed_ls->ports[j];
> > >>>>>>> +            op = ovn_port_find_in_datapath(od, new_nbsp->name);
> > >>>>>>> +
> > >>>>>>> +            if (!op) {
> > >>>>>>> +                if (!lsp_can_be_inc_processed(new_nbsp)) {
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                op = ls_port_create(ovnsb_idl_txn,
&nd->ls_ports,
> > >>>>>>> +                                    new_nbsp->name, new_nbsp,
od,
> > >>> NULL,
> > >>>>>>> +                                    ni->sbrec_mirror_table,
> > >>>>>>> +                                    ni->sbrec_chassis_table,
> > >>>>>>> +                                    ni->sbrec_chassis_by_name,
> > >>>>>>> +
 ni->sbrec_chassis_by_hostname);
> > >>>>>>> +                if (!op) {
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                ovs_list_push_back(&ls_change->added_ports,
> > >>>>>>> +                                   &op->list);
> > >>>>>>> +            } else if (ls_port_has_changed(op->nbsp,
new_nbsp)) {
> > >>>>>>> +                /* Existing port updated */
> > >>>>>>> +                bool temp = false;
> > >>>>>>> +                if (lsp_is_type_changed(op->sb, new_nbsp,
&temp) ||
> > >>>>>>> +                    !op->lsp_can_be_inc_processed ||
> > >>>>>>> +                    !lsp_can_be_inc_processed(new_nbsp)) {
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                const struct sbrec_port_binding *sb = op->sb;
> > >>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
> > >>>>> new_nbsp->name)) {
> > >>>>>>> +                    /* This port is used for svc monitor,
which may
> > >>> be
> > >>>>> impacted
> > >>>>>>> +                     * by this change. Fallback to recompute.
*/
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                ovn_port_destroy(&nd->ls_ports, op);
> > >>>>>>> +                op = ls_port_create(ovnsb_idl_txn,
&nd->ls_ports,
> > >>>>>>> +                                    new_nbsp->name, new_nbsp,
od,
> > > sb,
> > >>>>>>> +                                    ni->sbrec_mirror_table,
> > >>>>>>> +                                    ni->sbrec_chassis_table,
> > >>>>>>> +                                    ni->sbrec_chassis_by_name,
> > >>>>>>> +
 ni->sbrec_chassis_by_hostname);
> > >>>>>>> +                if (!op) {
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                ovs_list_push_back(&ls_change->updated_ports,
> > >>>>> &op->list);
> > >>>>>>> +            }
> > >>>>>>> +            op->visited = true;
> > >>>>>>> +        }
> > >>>>>>> +
> > >>>>>>> +        /* Check for deleted ports */
> > >>>>>>> +        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> > >>>>>>> +            if (!op->visited) {
> > >>>>>>> +                if (!op->lsp_can_be_inc_processed) {
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                if (sset_contains(&nd->svc_monitor_lsps,
op->key))
> > > {
> > >>>>>>> +                    /* This port was used for svc monitor,
which
> > > may
> > >>> be
> > >>>>>>> +                     * impacted by this deletion. Fallback to
> > >>>>> recompute. */
> > >>>>>>> +                    goto fail;
> > >>>>>>> +                }
> > >>>>>>> +                ovs_list_push_back(&ls_change->deleted_ports,
> > >>>>>>> +                                   &op->list);
> > >>>>>>> +                hmap_remove(&nd->ls_ports, &op->key_node);
> > >>>>>>> +                hmap_remove(&od->ports, &op->dp_node);
> > >>>>>>> +                sbrec_port_binding_delete(op->sb);
> > >>>>
> > >>>> Quoting from ovsdb-idl.c:
> > >>>>
> > >>>> /* Deletes 'row_' from its table.  May free 'row_', so it must not
be
> > >>>>  * accessed afterward.
> > >>>>
> > >>>> So, in theory, we shouldn't call delete here because we might end
up
> > >>>> failing for other ports/datapath and falling back to recompute.
> > >>>>
> > >>>> As far as I can tell, only newly inserted rows may be deleted by
the
> > >>>> current IDL implementation but I don't think it's safe to assume
this
> > >>>> will never change in the future.
> > >>>
> > >>> In general, when falling back to recompute in the middle of I-P, the
> > >>> recompute function will do a full sync between desired states and
> > > existing
> > >>> states. Now the existing states are partially synced during the I-P
> > > before
> > >>> recompute, the recompute function will just see the synced part as
> > > "already
> > >>> in sync" and only apply for the rest of the changes. I think this
> > > mechanism
> > >>> applies for add/update/delete the same way. Here for delete, if the
> > > record
> > >>> is deleted from IDL, it won't be found by the recompute function, so
> > > there
> > >>> is no way to access it again. Did I miss something?
> > >>>
> > >>
> > >> You're right, the ports we delete we first remove from the
> > >> datapath->ports hmap.  On the other hand, I think I spotted another
> > >> leak.
> > >>
> > >> Deleted ports get pushed back to &ls_change->deleted_ports:
> > >>
> > >>         /* Check for deleted ports */
> > >>         HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> > >>             if (!op->visited) {
> > >>                 if (!op->lsp_can_be_inc_processed) {
> > >>                     goto fail;
> > >>                 }
> > >>                 if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> > >>                     /* This port was used for svc monitor, which may
be
> > >>                      * impacted by this deletion. Fallback to
recompute.
> > > */
> > >>                     goto fail;
> > >>                 }
> > >>                 ovs_list_push_back(&ls_change->deleted_ports,
> > >>                                    &op->list);
> > >>                 hmap_remove(&nd->ls_ports, &op->key_node);
> > >>                 hmap_remove(&od->ports, &op->dp_node);
> > >>                 sbrec_port_binding_delete(op->sb);
> > >>                 delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port,
> > > od->tunnel_key,
> > >>                                  op->tunnel_key);
> > >>             }
> > >>         }
> > >>
> > >> [...]
> > >>
> > >> fail:
> > >>     free(ls_change);
> > >>
> > >> If we deleted some ports and then got out of the loop early then
here we
> > >> leak the ports stored in ls_change->deleted_ports.  Or am I missing
> > >> something?
> > >
> > > The next line under fail:, after free(ls_change), is:
> > >     destroy_northd_data_tracked_changes(nd);
> > >
> > > The deleted ports will be destroyed here. Does this address your
concern?
> > >
> >
> > Not really, let me highlight the code I mean:
> >
> >
https://github.com/ovn-org/ovn/blob/056f66e1db48c6d098ce3607ff596c1e3ce28dbb/northd/northd.c#L5132-L5171
> >
> > Now, if there are a couple of ports being deleted at the same time and
> > the last one of them is used for svc monitor, then we remove these ports
> > from the od's "ports" map and from "nd->ls_ports" and we add them to the
> > "ls_change->deleted_ports" list (lines L5143-L5146).  These ports are
> > not referenced by anything else anymore except for
> > "ls_change->deleted_ports".  Also, they're not yet added to the northd
> > data "nd".  For the last one we break early because it's used for svc
> > monitoring.
> >
> > We end up here:
> >
> > fail:
> >     free(ls_change);
> >     destroy_northd_data_tracked_changes(nd);
> >     return false;
> >
> > The ports that were in "ls_change->deleted_ports" were not completely
> > destroyed and are not reachable anymore so they're leaked.  Am I maybe
> > missing a different cleanup path?
> >
> You are right!! Sorry for missing your point earlier. I will come up with
a fix and a test case to make sure this scenario is covered.
> Thanks for pointing this out!
> Han
>
Fix posted here:
https://patchwork.ozlabs.org/project/ovn/patch/20230629021800.88257-1-hzhou@ovn.org/
PTAL. Thanks again for catching it!

> > Regards,
> > Dumitru
> >
> > > Thanks,
> > > Han
> > >
> > >>
> > >>>>
> > >>>> We should probably accumulate all records that need to be deleted
and
> > >>>> delete them at the end of the whole I-P run.  Do we need some I-P
infra
> > >>>> changes?  Maybe users can provide a callback that can be called
> > > whenever
> > >>>> an I-P run has finished?
> > >>>>
> > >>> I thought about this approach, but not for the reason above. I
would do
> > > the
> > >>> same for all DB changes to avoid directly updating DBs in the I-P
> > > engine.
> > >>> However, the change is going to be quite big, considering how the
> > > current
> > >>> code base depends on the IDL. Changes to IDL are all over the
place. In
> > >>> addition to the potential memory and CPU cost to maintain the delta,
> > > there
> > >>> may be other tricky things to handle, such as:
> > >>> - Sometimes it is problematic to see the old data (instead of the
> > > updated
> > >>> ones) in the IDL in the same round of I-P. (this has happened in
> > >>> ovn-controller due to out-of-date IDL index, which was fixed
> > > accordingly)
> > >>> - Merge accumulated changes to the same DB record.
> > >>>
> > >>
> > >> This is inline with an older proposal Mark had about abstracting out
> > >> the IDL, I think.  I agree it's a huge thing to do.  On the other
hand,
> > >> the current IDL interactions that sometimes happen during I-P
handlers
> > >> and sometimes during recomputes worry me a bit too.  It's hard to
ensure
> > >> that we don't introduce tricky bugs.
> > >>
> > >> Regards,
> > >> Dumitru
> > >>
> > >>> So, I think it is better to avoid such a change unless it is really
> > >>> necessary.
> > >>>
> > >>> Thanks,
> > >>> Han
> > >>>
> > >>>> What do you think?
> > >>>>
> > >>>> Thanks,
> > >>>> Dumitru
> > >>>>
> > >>>
> > >>
> > >
> >
diff mbox series

Patch

diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index bffb521cfd9d..da2a9d45ac64 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -720,6 +720,21 @@  ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
     return 0;
 }
 
+bool
+ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid)
+{
+    uint32_t hash = hash_int(tnlid, 0);
+    struct tnlid_node *node;
+    HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash, tnlids) {
+        if (node->tnlid == tnlid) {
+            hmap_remove(tnlids, &node->hmap_node);
+            free(node);
+            return true;
+        }
+    }
+    return false;
+}
+
 char *
 ovn_chassis_redirect_name(const char *port_name)
 {
diff --git a/lib/ovn-util.h b/lib/ovn-util.h
index b17b0e2364c5..9681a12219a9 100644
--- a/lib/ovn-util.h
+++ b/lib/ovn-util.h
@@ -167,6 +167,7 @@  bool ovn_add_tnlid(struct hmap *set, uint32_t tnlid);
 bool ovn_tnlid_present(struct hmap *tnlids, uint32_t tnlid);
 uint32_t ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
                             uint32_t max, uint32_t *hint);
+bool ovn_free_tnlid(struct hmap *tnlids, uint32_t tnlid);
 
 static inline void
 get_unique_lport_key(uint64_t dp_tunnel_key, uint64_t lport_tunnel_key,
diff --git a/northd/en-northd.c b/northd/en-northd.c
index a3dc37e198e3..f2bf98f774b1 100644
--- a/northd/en-northd.c
+++ b/northd/en-northd.c
@@ -31,92 +31,103 @@ 
 
 VLOG_DEFINE_THIS_MODULE(en_northd);
 
-void en_northd_run(struct engine_node *node, void *data)
+static void
+northd_get_input_data(struct engine_node *node,
+                      struct northd_input *input_data)
 {
-    const struct engine_context *eng_ctx = engine_get_context();
-
-    struct northd_input input_data;
-
-    northd_destroy(data);
-    northd_init(data);
-
-    input_data.sbrec_chassis_by_name =
+    input_data->sbrec_chassis_by_name =
         engine_ovsdb_node_get_index(
             engine_get_input("SB_chassis", node),
             "sbrec_chassis_by_name");
-    input_data.sbrec_chassis_by_hostname =
+    input_data->sbrec_chassis_by_hostname =
         engine_ovsdb_node_get_index(
             engine_get_input("SB_chassis", node),
             "sbrec_chassis_by_hostname");
-    input_data.sbrec_ha_chassis_grp_by_name =
+    input_data->sbrec_ha_chassis_grp_by_name =
         engine_ovsdb_node_get_index(
             engine_get_input("SB_ha_chassis_group", node),
             "sbrec_ha_chassis_grp_by_name");
-    input_data.sbrec_ip_mcast_by_dp =
+    input_data->sbrec_ip_mcast_by_dp =
         engine_ovsdb_node_get_index(
             engine_get_input("SB_ip_multicast", node),
             "sbrec_ip_mcast_by_dp");
-    input_data.sbrec_static_mac_binding_by_lport_ip =
+    input_data->sbrec_static_mac_binding_by_lport_ip =
         engine_ovsdb_node_get_index(
             engine_get_input("SB_static_mac_binding", node),
             "sbrec_static_mac_binding_by_lport_ip");
+    input_data->sbrec_fdb_by_dp_and_port =
+        engine_ovsdb_node_get_index(
+            engine_get_input("SB_fdb", node),
+            "sbrec_fdb_by_dp_and_port");
 
-    input_data.nbrec_nb_global_table =
+    input_data->nbrec_nb_global_table =
         EN_OVSDB_GET(engine_get_input("NB_nb_global", node));
-    input_data.nbrec_logical_switch_table =
+    input_data->nbrec_logical_switch_table =
         EN_OVSDB_GET(engine_get_input("NB_logical_switch", node));
-    input_data.nbrec_logical_router_table =
+    input_data->nbrec_logical_router_table =
         EN_OVSDB_GET(engine_get_input("NB_logical_router", node));
-    input_data.nbrec_load_balancer_table =
+    input_data->nbrec_load_balancer_table =
         EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
-    input_data.nbrec_load_balancer_group_table =
+    input_data->nbrec_load_balancer_group_table =
         EN_OVSDB_GET(engine_get_input("NB_load_balancer_group", node));
-    input_data.nbrec_port_group_table =
+    input_data->nbrec_port_group_table =
         EN_OVSDB_GET(engine_get_input("NB_port_group", node));
-    input_data.nbrec_meter_table =
+    input_data->nbrec_meter_table =
         EN_OVSDB_GET(engine_get_input("NB_meter", node));
-    input_data.nbrec_acl_table =
+    input_data->nbrec_acl_table =
         EN_OVSDB_GET(engine_get_input("NB_acl", node));
-    input_data.nbrec_static_mac_binding_table =
+    input_data->nbrec_static_mac_binding_table =
         EN_OVSDB_GET(engine_get_input("NB_static_mac_binding", node));
-    input_data.nbrec_chassis_template_var_table =
+    input_data->nbrec_chassis_template_var_table =
         EN_OVSDB_GET(engine_get_input("NB_chassis_template_var", node));
-    input_data.nbrec_mirror_table =
+    input_data->nbrec_mirror_table =
         EN_OVSDB_GET(engine_get_input("NB_mirror", node));
 
-    input_data.sbrec_sb_global_table =
+    input_data->sbrec_sb_global_table =
         EN_OVSDB_GET(engine_get_input("SB_sb_global", node));
-    input_data.sbrec_datapath_binding_table =
+    input_data->sbrec_datapath_binding_table =
         EN_OVSDB_GET(engine_get_input("SB_datapath_binding", node));
-    input_data.sbrec_port_binding_table =
+    input_data->sbrec_port_binding_table =
         EN_OVSDB_GET(engine_get_input("SB_port_binding", node));
-    input_data.sbrec_mac_binding_table =
+    input_data->sbrec_mac_binding_table =
         EN_OVSDB_GET(engine_get_input("SB_mac_binding", node));
-    input_data.sbrec_ha_chassis_group_table =
+    input_data->sbrec_ha_chassis_group_table =
         EN_OVSDB_GET(engine_get_input("SB_ha_chassis_group", node));
-    input_data.sbrec_chassis_table =
+    input_data->sbrec_chassis_table =
         EN_OVSDB_GET(engine_get_input("SB_chassis", node));
-    input_data.sbrec_fdb_table =
+    input_data->sbrec_fdb_table =
         EN_OVSDB_GET(engine_get_input("SB_fdb", node));
-    input_data.sbrec_load_balancer_table =
+    input_data->sbrec_load_balancer_table =
         EN_OVSDB_GET(engine_get_input("SB_load_balancer", node));
-    input_data.sbrec_service_monitor_table =
+    input_data->sbrec_service_monitor_table =
         EN_OVSDB_GET(engine_get_input("SB_service_monitor", node));
-    input_data.sbrec_port_group_table =
+    input_data->sbrec_port_group_table =
         EN_OVSDB_GET(engine_get_input("SB_port_group", node));
-    input_data.sbrec_meter_table =
+    input_data->sbrec_meter_table =
         EN_OVSDB_GET(engine_get_input("SB_meter", node));
-    input_data.sbrec_dns_table =
+    input_data->sbrec_dns_table =
         EN_OVSDB_GET(engine_get_input("SB_dns", node));
-    input_data.sbrec_ip_multicast_table =
+    input_data->sbrec_ip_multicast_table =
         EN_OVSDB_GET(engine_get_input("SB_ip_multicast", node));
-    input_data.sbrec_static_mac_binding_table =
+    input_data->sbrec_static_mac_binding_table =
         EN_OVSDB_GET(engine_get_input("SB_static_mac_binding", node));
-    input_data.sbrec_chassis_template_var_table =
+    input_data->sbrec_chassis_template_var_table =
         EN_OVSDB_GET(engine_get_input("SB_chassis_template_var", node));
-    input_data.sbrec_mirror_table =
+    input_data->sbrec_mirror_table =
         EN_OVSDB_GET(engine_get_input("SB_mirror", node));
+}
 
+void
+en_northd_run(struct engine_node *node, void *data)
+{
+    const struct engine_context *eng_ctx = engine_get_context();
+
+    struct northd_input input_data;
+
+    northd_destroy(data);
+    northd_init(data);
+
+    northd_get_input_data(node, &input_data);
     northd_run(&input_data, data,
                eng_ctx->ovnnb_idl_txn,
                eng_ctx->ovnsb_idl_txn);
@@ -148,17 +159,48 @@  northd_nb_nb_global_handler(struct engine_node *node,
     return true;
 }
 
-void *en_northd_init(struct engine_node *node OVS_UNUSED,
-                     struct engine_arg *arg OVS_UNUSED)
+bool
+northd_nb_logical_switch_handler(struct engine_node *node,
+                                 void *data)
 {
-    struct northd_data *data = xmalloc(sizeof *data);
+    const struct engine_context *eng_ctx = engine_get_context();
+    struct northd_data *nd = data;
+
+    struct northd_input input_data;
+
+    northd_get_input_data(node, &input_data);
+
+    if (!northd_handle_ls_changes(eng_ctx->ovnsb_idl_txn, &input_data, nd)) {
+        return false;
+    }
+
+    if (nd->change_tracked) {
+        engine_set_node_state(node, EN_UPDATED);
+    }
+
+    return true;
+}
+
+void
+*en_northd_init(struct engine_node *node OVS_UNUSED,
+                struct engine_arg *arg OVS_UNUSED)
+{
+    struct northd_data *data = xzalloc(sizeof *data);
 
     northd_init(data);
 
     return data;
 }
 
-void en_northd_cleanup(void *data)
+void
+en_northd_cleanup(void *data)
 {
     northd_destroy(data);
 }
+
+void
+en_northd_clear_tracked_data(void *data_)
+{
+    struct northd_data *data = data_;
+    destroy_northd_data_tracked_changes(data);
+}
diff --git a/northd/en-northd.h b/northd/en-northd.h
index 8d8343b459a6..a53a162bda48 100644
--- a/northd/en-northd.h
+++ b/northd/en-northd.h
@@ -13,6 +13,8 @@  void en_northd_run(struct engine_node *node OVS_UNUSED, void *data OVS_UNUSED);
 void *en_northd_init(struct engine_node *node OVS_UNUSED,
                      struct engine_arg *arg);
 void en_northd_cleanup(void *data);
+void en_northd_clear_tracked_data(void *data);
 bool northd_nb_nb_global_handler(struct engine_node *, void *data OVS_UNUSED);
+bool northd_nb_logical_switch_handler(struct engine_node *, void *data);
 
 #endif /* EN_NORTHD_H */
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index 863c9323c444..f992a9ec8420 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -128,7 +128,7 @@  enum sb_engine_node {
 
 /* Define engine nodes for other nodes. They should be defined as static to
  * avoid sparse errors. */
-static ENGINE_NODE(northd, "northd");
+static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(northd, "northd");
 static ENGINE_NODE(lflow, "lflow");
 static ENGINE_NODE(mac_binding_aging, "mac_binding_aging");
 static ENGINE_NODE(mac_binding_aging_waker, "mac_binding_aging_waker");
@@ -143,7 +143,8 @@  void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
      * on the second argument */
     engine_add_input(&en_northd, &en_nb_nb_global,
                      northd_nb_nb_global_handler);
-    engine_add_input(&en_northd, &en_nb_logical_switch, NULL);
+    engine_add_input(&en_northd, &en_nb_logical_switch,
+                     northd_nb_logical_switch_handler);
     engine_add_input(&en_northd, &en_nb_port_group, NULL);
     engine_add_input(&en_northd, &en_nb_load_balancer, NULL);
     engine_add_input(&en_northd, &en_nb_load_balancer_group, NULL);
@@ -252,6 +253,13 @@  void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
                                 "sbrec_address_set_by_name",
                                 sbrec_address_set_by_name);
 
+    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port
+        = ovsdb_idl_index_create2(sb->idl, &sbrec_fdb_col_dp_key,
+                                  &sbrec_fdb_col_port_key);
+    engine_ovsdb_node_add_index(&en_sb_fdb,
+                                "sbrec_fdb_by_dp_and_port",
+                                sbrec_fdb_by_dp_and_port);
+
     struct northd_data *northd_data =
         engine_get_internal_data(&en_northd);
     unixctl_command_register("debug/chassis-features-list", "", 0, 0,
diff --git a/northd/northd.c b/northd/northd.c
index d79f075681d9..1a4e24978642 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -19,6 +19,7 @@ 
 
 #include "debug.h"
 #include "bitmap.h"
+#include "coverage.h"
 #include "dirs.h"
 #include "ipam.h"
 #include "openvswitch/dynamic-string.h"
@@ -59,6 +60,8 @@ 
 
 VLOG_DEFINE_THIS_MODULE(northd);
 
+COVERAGE_DEFINE(northd_run);
+
 static bool controller_event_en;
 static bool lflow_hash_lock_initialized = false;
 
@@ -807,13 +810,20 @@  ovn_datapath_create(struct hmap *datapaths, const struct uuid *key,
     od->port_key_hint = 0;
     hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
     od->lr_group = NULL;
-    ovs_list_init(&od->port_list);
+    hmap_init(&od->ports);
     return od;
 }
 
 static void ovn_ls_port_group_destroy(struct hmap *nb_pgs);
 static void destroy_mcast_info_for_datapath(struct ovn_datapath *od);
 
+static void
+destroy_ports_for_datapath(struct ovn_datapath *od)
+{
+    ovs_assert(hmap_is_empty(&od->ports));
+    hmap_destroy(&od->ports);
+}
+
 static void
 ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
 {
@@ -834,6 +844,7 @@  ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
         free(od->l3dgw_ports);
         ovn_ls_port_group_destroy(&od->nb_pgs);
         destroy_mcast_info_for_datapath(od);
+        destroy_ports_for_datapath(od);
 
         free(od);
     }
@@ -1480,6 +1491,9 @@  struct ovn_port {
     struct lport_addresses *ps_addrs;   /* Port security addresses. */
     unsigned int n_ps_addrs;
 
+    bool lsp_can_be_inc_processed; /* If it can be incrementally processed when
+                                      the port changes. */
+
     /* Logical router port data. */
     const struct nbrec_logical_router_port *nbrp; /* May be NULL. */
 
@@ -1521,11 +1535,16 @@  struct ovn_port {
 
     struct ovs_list list;       /* In list of similar records. */
 
-    struct ovs_list dp_node;
+    struct hmap_node dp_node;   /* Node in od->ports. */
 
     struct lport_addresses proxy_arp_addrs;
+
+    /* Temporarily used for traversing a list (or hmap) of ports. */
+    bool visited;
 };
 
+static bool lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *);
+
 static bool
 is_l3dgw_port(const struct ovn_port *op)
 {
@@ -1585,6 +1604,9 @@  ovn_port_set_nb(struct ovn_port *op,
                 const struct nbrec_logical_router_port *nbrp)
 {
     op->nbsp = nbsp;
+    if (nbsp) {
+        op->lsp_can_be_inc_processed = lsp_can_be_inc_processed(nbsp);
+    }
     op->nbrp = nbrp;
     init_mcast_port_info(&op->mcast_info, op->nbsp, op->nbrp);
 }
@@ -1610,35 +1632,46 @@  ovn_port_create(struct hmap *ports, const char *key,
 }
 
 static void
-ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
+ovn_port_destroy_orphan(struct ovn_port *port)
 {
-    if (port) {
-        /* Don't remove port->list.  It is used within build_ports() as a
-         * private list and once we've exited that function it is not safe to
-         * use it. */
-        hmap_remove(ports, &port->key_node);
+    if (port->tunnel_key) {
+        ovs_assert(port->od);
+        ovn_free_tnlid(&port->od->port_tnlids, port->tunnel_key);
+    }
+    for (int i = 0; i < port->n_lsp_addrs; i++) {
+        destroy_lport_addresses(&port->lsp_addrs[i]);
+    }
+    free(port->lsp_addrs);
 
-        if (port->peer) {
-            port->peer->peer = NULL;
-        }
+    if (port->peer) {
+        port->peer->peer = NULL;
+    }
 
-        for (int i = 0; i < port->n_lsp_addrs; i++) {
-            destroy_lport_addresses(&port->lsp_addrs[i]);
-        }
-        free(port->lsp_addrs);
+    for (int i = 0; i < port->n_ps_addrs; i++) {
+        destroy_lport_addresses(&port->ps_addrs[i]);
+    }
+    free(port->ps_addrs);
 
-        for (int i = 0; i < port->n_ps_addrs; i++) {
-            destroy_lport_addresses(&port->ps_addrs[i]);
-        }
-        free(port->ps_addrs);
+    destroy_routable_addresses(&port->routables);
 
-        destroy_routable_addresses(&port->routables);
+    destroy_lport_addresses(&port->lrp_networks);
+    destroy_lport_addresses(&port->proxy_arp_addrs);
+    free(port->json_key);
+    free(port->key);
+    free(port);
+}
 
-        destroy_lport_addresses(&port->lrp_networks);
-        destroy_lport_addresses(&port->proxy_arp_addrs);
-        free(port->json_key);
-        free(port->key);
-        free(port);
+static void
+ovn_port_destroy(struct hmap *ports, struct ovn_port *port)
+{
+    if (port) {
+        /* Don't remove port->list. The node should be removed from such lists
+         * before calling this function. */
+        hmap_remove(ports, &port->key_node);
+        if (port->od && !port->l3dgw_port) {
+            hmap_remove(&port->od->ports, &port->dp_node);
+        }
+        ovn_port_destroy_orphan(port);
     }
 }
 
@@ -2408,6 +2441,53 @@  tag_alloc_create_new_tag(struct hmap *tag_alloc_table,
 }
 
 
+static void
+parse_lsp_addrs(struct ovn_port *op)
+{
+    const struct nbrec_logical_switch_port *nbsp = op->nbsp;
+    ovs_assert(nbsp);
+    op->lsp_addrs
+        = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
+    for (size_t j = 0; j < nbsp->n_addresses; j++) {
+        if (!strcmp(nbsp->addresses[j], "unknown")) {
+            op->has_unknown = true;
+            continue;
+        }
+        if (!strcmp(nbsp->addresses[j], "router")) {
+            continue;
+        }
+        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
+            continue;
+        } else if (!extract_lsp_addresses(nbsp->addresses[j],
+                               &op->lsp_addrs[op->n_lsp_addrs])) {
+            static struct vlog_rate_limit rl
+                = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
+                              "switch port addresses. No MAC "
+                              "address found",
+                              op->nbsp->addresses[j]);
+            continue;
+        }
+        op->n_lsp_addrs++;
+    }
+    op->n_lsp_non_router_addrs = op->n_lsp_addrs;
+
+    op->ps_addrs
+        = xmalloc(sizeof *op->ps_addrs * nbsp->n_port_security);
+    for (size_t j = 0; j < nbsp->n_port_security; j++) {
+        if (!extract_lsp_addresses(nbsp->port_security[j],
+                                   &op->ps_addrs[op->n_ps_addrs])) {
+            static struct vlog_rate_limit rl
+                = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
+                              "security. No MAC address found",
+                              op->nbsp->port_security[j]);
+            continue;
+        }
+        op->n_ps_addrs++;
+    }
+}
+
 static void
 join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
                    struct hmap *ls_datapaths, struct hmap *lr_datapaths,
@@ -2504,49 +2584,11 @@  join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
                 od->has_vtep_lports = true;
             }
 
-            op->lsp_addrs
-                = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses);
-            for (size_t j = 0; j < nbsp->n_addresses; j++) {
-                if (!strcmp(nbsp->addresses[j], "unknown")) {
-                    op->has_unknown = true;
-                    continue;
-                }
-                if (!strcmp(nbsp->addresses[j], "router")) {
-                    continue;
-                }
-                if (is_dynamic_lsp_address(nbsp->addresses[j])) {
-                    continue;
-                } else if (!extract_lsp_addresses(nbsp->addresses[j],
-                                       &op->lsp_addrs[op->n_lsp_addrs])) {
-                    static struct vlog_rate_limit rl
-                        = VLOG_RATE_LIMIT_INIT(1, 1);
-                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in logical "
-                                      "switch port addresses. No MAC "
-                                      "address found",
-                                      op->nbsp->addresses[j]);
-                    continue;
-                }
-                op->n_lsp_addrs++;
-            }
-            op->n_lsp_non_router_addrs = op->n_lsp_addrs;
-
-            op->ps_addrs
-                = xmalloc(sizeof *op->ps_addrs * nbsp->n_port_security);
-            for (size_t j = 0; j < nbsp->n_port_security; j++) {
-                if (!extract_lsp_addresses(nbsp->port_security[j],
-                                           &op->ps_addrs[op->n_ps_addrs])) {
-                    static struct vlog_rate_limit rl
-                        = VLOG_RATE_LIMIT_INIT(1, 1);
-                    VLOG_INFO_RL(&rl, "invalid syntax '%s' in port "
-                                      "security. No MAC address found",
-                                      op->nbsp->port_security[j]);
-                    continue;
-                }
-                op->n_ps_addrs++;
-            }
+            parse_lsp_addrs(op);
 
             op->od = od;
-            ovs_list_push_back(&od->port_list, &op->dp_node);
+            hmap_insert(&od->ports, &op->dp_node,
+                        hmap_node_hash(&op->key_node));
             tag_alloc_add_existing_tags(tag_alloc_table, nbsp);
         }
     }
@@ -2593,7 +2635,8 @@  join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table,
 
             op->lrp_networks = lrp_networks;
             op->od = od;
-            ovs_list_push_back(&od->port_list, &op->dp_node);
+            hmap_insert(&od->ports, &op->dp_node,
+                        hmap_node_hash(&op->key_node));
 
             if (!od->redirect_bridged) {
                 const char *redirect_type =
@@ -3314,6 +3357,10 @@  ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
         struct smap new;
         smap_init(&new);
         if (is_cr_port(op)) {
+            ovs_assert(sbrec_chassis_by_name);
+            ovs_assert(sbrec_chassis_by_hostname);
+            ovs_assert(sbrec_ha_chassis_grp_by_name);
+            ovs_assert(active_ha_chassis_grps);
             const char *redirect_type = smap_get(&op->nbrp->options,
                                                  "redirect-type");
 
@@ -3423,8 +3470,10 @@  ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
             struct smap options;
 
             if (has_qos && !queue_id) {
+                ovs_assert(queue_id_bitmap);
                 queue_id = allocate_queueid(queue_id_bitmap);
             } else if (!has_qos && queue_id) {
+                ovs_assert(queue_id_bitmap);
                 bitmap_set0(queue_id_bitmap, queue_id);
                 queue_id = 0;
             }
@@ -3473,6 +3522,10 @@  ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
             sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0);
 
             if (!strcmp(op->nbsp->type, "external")) {
+                ovs_assert(sbrec_chassis_by_name);
+                ovs_assert(sbrec_chassis_by_hostname);
+                ovs_assert(sbrec_ha_chassis_grp_by_name);
+                ovs_assert(active_ha_chassis_grps);
                 if (op->nbsp->ha_chassis_group) {
                     sync_ha_chassis_group_for_sbpb(
                         ovnsb_txn, sbrec_chassis_by_name,
@@ -3729,6 +3782,24 @@  cleanup_stale_fdb_entries(const struct sbrec_fdb_table *sbrec_fdb_table,
     }
 }
 
+static void
+delete_fdb_entry(struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port,
+                 uint32_t dp_key, uint32_t port_key)
+{
+    struct sbrec_fdb *target =
+        sbrec_fdb_index_init_row(sbrec_fdb_by_dp_and_port);
+    sbrec_fdb_index_set_dp_key(target, dp_key);
+    sbrec_fdb_index_set_port_key(target, port_key);
+
+    struct sbrec_fdb *fdb_e = sbrec_fdb_index_find(sbrec_fdb_by_dp_and_port,
+                                                   target);
+    sbrec_fdb_index_destroy_row(target);
+
+    if (fdb_e) {
+        sbrec_fdb_delete(fdb_e);
+    }
+}
+
 struct service_monitor_info {
     struct hmap_node hmap_node;
     const struct sbrec_service_monitor *sbrec_mon;
@@ -3820,14 +3891,15 @@  ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn, struct ovn_northd_lb *lb,
             }
             ds_destroy(&key);
 
-            backend_nb->op = op;
-            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
-
             if (!lb_vip_nb->lb_health_check || !op || !svc_mon_src_ip ||
                 !lsp_is_enabled(op->nbsp)) {
+                free(svc_mon_src_ip);
                 continue;
             }
 
+            backend_nb->op = op;
+            backend_nb->svc_mon_src_ip = svc_mon_src_ip;
+
             const char *protocol = lb->nlb->protocol;
             if (!protocol || !protocol[0]) {
                 protocol = "tcp";
@@ -4155,7 +4227,7 @@  build_lrouter_lb_reachable_ips(struct ovn_datapath *od,
             ovs_be32 vip_ip4 = in6_addr_get_mapped_ipv4(&lb->vips[i].vip);
             struct ovn_port *op;
 
-            LIST_FOR_EACH (op, dp_node, &od->port_list) {
+            HMAP_FOR_EACH (op, dp_node, &od->ports) {
                 if (lrouter_port_ipv4_reachable(op, vip_ip4)) {
                     sset_add(&od->lb_ips->ips_v4_reachable,
                              lb->vips[i].vip_str);
@@ -4165,7 +4237,7 @@  build_lrouter_lb_reachable_ips(struct ovn_datapath *od,
         } else {
             struct ovn_port *op;
 
-            LIST_FOR_EACH (op, dp_node, &od->port_list) {
+            HMAP_FOR_EACH (op, dp_node, &od->ports) {
                 if (lrouter_port_ipv6_reachable(op, &lb->vips[i].vip)) {
                     sset_add(&od->lb_ips->ips_v6_reachable,
                              lb->vips[i].vip_str);
@@ -4538,7 +4610,10 @@  ovn_port_add_tnlid(struct ovn_port *op, uint32_t tunnel_key)
     return added;
 }
 
-static void
+/* Returns false if the requested key is confict with another allocated key, so
+ * that the I-P engine can fallback to recompute if needed; otherwise return
+ * true (even if the key is not allocated). */
+static bool
 ovn_port_assign_requested_tnl_id(
     const struct sbrec_chassis_table *sbrec_chassis_table, struct ovn_port *op)
 {
@@ -4553,7 +4628,7 @@  ovn_port_assign_requested_tnl_id(
             VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for port %s "
                          "is incompatible with VXLAN",
                          tunnel_key, op_get_name(op));
-            return;
+            return true;
         }
         if (!ovn_port_add_tnlid(op, tunnel_key)) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
@@ -4561,11 +4636,13 @@  ovn_port_assign_requested_tnl_id(
                          "%"PRIu32" as another LSP or LRP",
                          op->nbsp ? "switch" : "router",
                          op_get_name(op), tunnel_key);
+            return false;
         }
     }
+    return true;
 }
 
-static void
+static bool
 ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
                       struct hmap *ports,
                       struct ovn_port *op)
@@ -4581,8 +4658,10 @@  ovn_port_allocate_key(const struct sbrec_chassis_table *sbrec_chassis_table,
             }
             ovs_list_remove(&op->list);
             ovn_port_destroy(ports, op);
+            return false;
         }
     }
+    return true;
 }
 
 /* Updates the southbound Port_Binding table so that it contains the logical
@@ -4605,6 +4684,8 @@  build_ports(struct ovsdb_idl_txn *ovnsb_txn,
     struct hmap *ls_ports, struct hmap *lr_ports)
 {
     struct ovs_list sb_only, nb_only, both;
+    /* XXX: Add tag_alloc_table and queue_id_bitmap as part of northd_data
+     * to improve I-P. */
     struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
     unsigned long *queue_id_bitmap = bitmap_allocate(QDISC_MAX_QUEUE_ID + 1);
     bitmap_set1(queue_id_bitmap, 0);
@@ -4711,6 +4792,329 @@  build_ports(struct ovsdb_idl_txn *ovnsb_txn,
     sset_destroy(&active_ha_chassis_grps);
 }
 
+void
+destroy_northd_data_tracked_changes(struct northd_data *nd)
+{
+    struct ls_change *ls_change;
+    LIST_FOR_EACH_SAFE (ls_change, list_node,
+                        &nd->tracked_ls_changes.updated) {
+        struct ovn_port *op;
+        LIST_FOR_EACH (op, list, &ls_change->added_ports) {
+            ovs_list_remove(&op->list);
+        }
+        LIST_FOR_EACH (op, list, &ls_change->updated_ports) {
+            ovs_list_remove(&op->list);
+        }
+        LIST_FOR_EACH_SAFE (op, list, &ls_change->deleted_ports) {
+            ovs_list_remove(&op->list);
+            ovn_port_destroy_orphan(op);
+        }
+        ovs_list_remove(&ls_change->list_node);
+        free(ls_change);
+    }
+
+    nd->change_tracked = false;
+}
+
+/* Check if a changed LSP can be handled incrementally within the I-P engine
+ * node en_northd.
+ */
+static bool
+lsp_can_be_inc_processed(const struct nbrec_logical_switch_port *nbsp)
+{
+    /* Currently only support normal VIF. */
+    if (nbsp->type[0]) {
+        return false;
+    }
+
+    /* Currently not support tag allocation. */
+    if ((nbsp->parent_name && nbsp->parent_name[0]) || nbsp->tag ||
+        nbsp->tag_request) {
+        return false;
+    }
+
+    /* Currently not support port with qos settings (need special handling for
+     * qdisc_queue_id sync. */
+    if (port_has_qos_params(&nbsp->options)) {
+        return false;
+    }
+
+    for (size_t j = 0; j < nbsp->n_addresses; j++) {
+        /* Currently not support dynamic address handling. */
+        if (is_dynamic_lsp_address(nbsp->addresses[j])) {
+            return false;
+        }
+        /* Currently not support "unknown" address handling.  XXX: Need to
+         * handle od->has_unknown change and track it when the first LSP with
+         * 'unknown' is added or when the last one is removed. */
+        if (!strcmp(nbsp->addresses[j], "unknown")) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
+static bool
+ls_port_has_changed(const struct nbrec_logical_switch_port *old,
+                    const struct nbrec_logical_switch_port *new)
+{
+    if (old != new) {
+        return true;
+    }
+    /* XXX: Need a better OVSDB IDL interface for this check. */
+    return (nbrec_logical_switch_port_row_get_seqno(new,
+                                OVSDB_IDL_CHANGE_MODIFY) > 0);
+}
+
+static struct ovn_port *
+ovn_port_find_in_datapath(struct ovn_datapath *od, const char *name)
+{
+    struct ovn_port *op;
+    HMAP_FOR_EACH_WITH_HASH (op, dp_node, hash_string(name, 0), &od->ports) {
+        if (!strcmp(op->key, name)) {
+            return op;
+        }
+    }
+    return NULL;
+}
+
+static struct ovn_port *
+ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports,
+               const char *key, const struct nbrec_logical_switch_port *nbsp,
+               struct ovn_datapath *od, const struct sbrec_port_binding *sb,
+               const struct sbrec_mirror_table *sbrec_mirror_table,
+               const struct sbrec_chassis_table *sbrec_chassis_table,
+               struct ovsdb_idl_index *sbrec_chassis_by_name,
+               struct ovsdb_idl_index *sbrec_chassis_by_hostname)
+{
+    struct ovn_port *op = ovn_port_create(ls_ports, key, nbsp, NULL,
+                                          NULL);
+    parse_lsp_addrs(op);
+    op->od = od;
+    hmap_insert(&od->ports, &op->dp_node, hmap_node_hash(&op->key_node));
+
+    /* Assign explicitly requested tunnel ids first. */
+    if (!ovn_port_assign_requested_tnl_id(sbrec_chassis_table, op)) {
+        return NULL;
+    }
+    if (sb) {
+        op->sb = sb;
+        /* Keep nonconflicting tunnel IDs that are already assigned. */
+        if (!op->tunnel_key) {
+            ovn_port_add_tnlid(op, op->sb->tunnel_key);
+        }
+    } else {
+        /* XXX: the new SB port_binding will change in IDL, so need to handle
+         * SB port_binding updates incrementally to achieve end-to-end
+         * incremental processing. */
+        op->sb = sbrec_port_binding_insert(ovnsb_txn);
+        sbrec_port_binding_set_logical_port(op->sb, op->key);
+    }
+    /* Assign new tunnel ids where needed. */
+    if (!ovn_port_allocate_key(sbrec_chassis_table, ls_ports, op)) {
+        return NULL;
+    }
+    ovn_port_update_sbrec(ovnsb_txn, sbrec_chassis_by_name,
+                          sbrec_chassis_by_hostname, NULL, sbrec_mirror_table,
+                          op, NULL, NULL);
+    return op;
+}
+
+static bool
+check_ls_changes_other_than_lsp(const struct nbrec_logical_switch *ls)
+{
+    /* Check if the columns are changed in this row. */
+    enum nbrec_logical_switch_column_id col;
+    for (col = 0; col < NBREC_LOGICAL_SWITCH_N_COLUMNS; col++) {
+        if (nbrec_logical_switch_is_updated(ls, col) &&
+            col != NBREC_LOGICAL_SWITCH_COL_PORTS) {
+            return true;
+        }
+    }
+
+    /* Check if the referenced rows are changed.
+       XXX: Need a better OVSDB IDL interface for this check. */
+    for (size_t i = 0; i < ls->n_acls; i++) {
+        if (nbrec_acl_row_get_seqno(ls->acls[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    if (ls->copp && nbrec_copp_row_get_seqno(ls->copp,
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+        return true;
+    }
+    for (size_t i = 0; i < ls->n_dns_records; i++) {
+        if (nbrec_dns_row_get_seqno(ls->dns_records[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    for (size_t i = 0; i < ls->n_forwarding_groups; i++) {
+        if (nbrec_forwarding_group_row_get_seqno(ls->forwarding_groups[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    for (size_t i = 0; i < ls->n_load_balancer; i++) {
+        if (nbrec_load_balancer_row_get_seqno(ls->load_balancer[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    for (size_t i = 0; i < ls->n_load_balancer_group; i++) {
+        if (nbrec_load_balancer_group_row_get_seqno(ls->load_balancer_group[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    for (size_t i = 0; i < ls->n_qos_rules; i++) {
+        if (nbrec_qos_row_get_seqno(ls->qos_rules[i],
+                                OVSDB_IDL_CHANGE_MODIFY) > 0) {
+            return true;
+        }
+    }
+    return false;
+}
+
+/* Return true if changes are handled incrementally, false otherwise.
+ * When there are any changes, try to track what's exactly changed and set
+ * northd_data->change_tracked accordingly: change tracked - true, otherwise,
+ * false. */
+bool
+northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
+                         const struct northd_input *ni,
+                         struct northd_data *nd)
+{
+    const struct nbrec_logical_switch *changed_ls;
+    struct ls_change *ls_change = NULL;
+
+    NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
+                                             ni->nbrec_logical_switch_table) {
+        ls_change = NULL;
+        if (nbrec_logical_switch_is_new(changed_ls) ||
+            nbrec_logical_switch_is_deleted(changed_ls)) {
+            goto fail;
+        }
+        struct ovn_datapath *od = ovn_datapath_find(
+                                    &nd->ls_datapaths.datapaths,
+                                    &changed_ls->header_.uuid);
+        if (!od) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS doesn't "
+                         "exist in ls_datapaths: "UUID_FMT,
+                         UUID_ARGS(&changed_ls->header_.uuid));
+            goto fail;
+        }
+
+        /* Now only able to handle lsp changes. */
+        if (check_ls_changes_other_than_lsp(changed_ls)) {
+            goto fail;
+        }
+
+        ls_change = xzalloc(sizeof *ls_change);
+        ls_change->od = od;
+        ovs_list_init(&ls_change->added_ports);
+        ovs_list_init(&ls_change->deleted_ports);
+        ovs_list_init(&ls_change->updated_ports);
+
+        struct ovn_port *op;
+        HMAP_FOR_EACH (op, dp_node, &od->ports) {
+            op->visited = false;
+        }
+
+        /* Compare the individual ports in the old and new Logical Switches */
+        for (size_t j = 0; j < changed_ls->n_ports; ++j) {
+            struct nbrec_logical_switch_port *new_nbsp = changed_ls->ports[j];
+            op = ovn_port_find_in_datapath(od, new_nbsp->name);
+
+            if (!op) {
+                if (!lsp_can_be_inc_processed(new_nbsp)) {
+                    goto fail;
+                }
+                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
+                                    new_nbsp->name, new_nbsp, od, NULL,
+                                    ni->sbrec_mirror_table,
+                                    ni->sbrec_chassis_table,
+                                    ni->sbrec_chassis_by_name,
+                                    ni->sbrec_chassis_by_hostname);
+                if (!op) {
+                    goto fail;
+                }
+                ovs_list_push_back(&ls_change->added_ports,
+                                   &op->list);
+            } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
+                /* Existing port updated */
+                bool temp = false;
+                if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
+                    !op->lsp_can_be_inc_processed ||
+                    !lsp_can_be_inc_processed(new_nbsp)) {
+                    goto fail;
+                }
+                const struct sbrec_port_binding *sb = op->sb;
+                if (sset_contains(&nd->svc_monitor_lsps, new_nbsp->name)) {
+                    /* This port is used for svc monitor, which may be impacted
+                     * by this change. Fallback to recompute. */
+                    goto fail;
+                }
+                ovn_port_destroy(&nd->ls_ports, op);
+                op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
+                                    new_nbsp->name, new_nbsp, od, sb,
+                                    ni->sbrec_mirror_table,
+                                    ni->sbrec_chassis_table,
+                                    ni->sbrec_chassis_by_name,
+                                    ni->sbrec_chassis_by_hostname);
+                if (!op) {
+                    goto fail;
+                }
+                ovs_list_push_back(&ls_change->updated_ports, &op->list);
+            }
+            op->visited = true;
+        }
+
+        /* Check for deleted ports */
+        HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
+            if (!op->visited) {
+                if (!op->lsp_can_be_inc_processed) {
+                    goto fail;
+                }
+                if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
+                    /* This port was used for svc monitor, which may be
+                     * impacted by this deletion. Fallback to recompute. */
+                    goto fail;
+                }
+                ovs_list_push_back(&ls_change->deleted_ports,
+                                   &op->list);
+                hmap_remove(&nd->ls_ports, &op->key_node);
+                hmap_remove(&od->ports, &op->dp_node);
+                sbrec_port_binding_delete(op->sb);
+                delete_fdb_entry(ni->sbrec_fdb_by_dp_and_port, od->tunnel_key,
+                                 op->tunnel_key);
+            }
+        }
+
+        if (!ovs_list_is_empty(&ls_change->added_ports) ||
+            !ovs_list_is_empty(&ls_change->updated_ports) ||
+            !ovs_list_is_empty(&ls_change->deleted_ports)) {
+            ovs_list_push_back(&nd->tracked_ls_changes.updated,
+                               &ls_change->list_node);
+        } else {
+            free(ls_change);
+        }
+    }
+
+    if (!ovs_list_is_empty(&nd->tracked_ls_changes.updated)) {
+        nd->change_tracked = true;
+    }
+    return true;
+
+fail:
+    free(ls_change);
+    destroy_northd_data_tracked_changes(nd);
+    return false;
+}
+
 struct multicast_group {
     const char *name;
     uint16_t key;               /* OVN_MIN_MULTICAST...OVN_MAX_MULTICAST. */
@@ -6072,7 +6476,7 @@  build_interconn_mcast_snoop_flows(struct ovn_datapath *od,
 
     struct ovn_port *op;
 
-    LIST_FOR_EACH (op, dp_node, &od->port_list) {
+    HMAP_FOR_EACH (op, dp_node, &od->ports) {
         if (!lsp_is_remote(op->nbsp)) {
             continue;
         }
@@ -12394,7 +12798,7 @@  build_mcast_lookup_flows_for_lrouter(
          * own mac addresses.
          */
         struct ovn_port *op;
-        LIST_FOR_EACH (op, dp_node, &od->port_list) {
+        HMAP_FOR_EACH (op, dp_node, &od->ports) {
             ds_clear(match);
             ds_put_format(match, "eth.src == %s && igmp",
                           op->lrp_networks.ea_s);
@@ -16476,9 +16880,6 @@  destroy_datapaths_and_ports(struct ovn_datapaths *ls_datapaths,
         }
     }
 
-    ovn_datapaths_destroy(ls_datapaths);
-    ovn_datapaths_destroy(lr_datapaths);
-
     struct ovn_port *port;
     HMAP_FOR_EACH_SAFE (port, key_node, ls_ports) {
         ovn_port_destroy(ls_ports, port);
@@ -16489,6 +16890,9 @@  destroy_datapaths_and_ports(struct ovn_datapaths *ls_datapaths,
         ovn_port_destroy(lr_ports, port);
     }
     hmap_destroy(lr_ports);
+
+    ovn_datapaths_destroy(ls_datapaths);
+    ovn_datapaths_destroy(lr_datapaths);
 }
 
 void
@@ -16510,6 +16914,8 @@  northd_init(struct northd_data *data)
     };
     data->ovn_internal_version_changed = false;
     sset_init(&data->svc_monitor_lsps);
+    data->change_tracked = false;
+    ovs_list_init(&data->tracked_ls_changes.updated);
 }
 
 void
@@ -16976,6 +17382,7 @@  void northd_run(struct northd_input *input_data,
                 struct ovsdb_idl_txn *ovnnb_txn,
                 struct ovsdb_idl_txn *ovnsb_txn)
 {
+    COVERAGE_INC(northd_run);
     stopwatch_start(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
     ovnnb_db_run(input_data, data, ovnnb_txn, ovnsb_txn);
     stopwatch_stop(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
diff --git a/northd/northd.h b/northd/northd.h
index b26828bb7111..1fd9bed477e7 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -65,6 +65,7 @@  struct northd_input {
     struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
     struct ovsdb_idl_index *sbrec_ip_mcast_by_dp;
     struct ovsdb_idl_index *sbrec_static_mac_binding_by_lport_ip;
+    struct ovsdb_idl_index *sbrec_fdb_by_dp_and_port;
 };
 
 struct chassis_features {
@@ -83,6 +84,22 @@  struct ovn_datapaths {
     struct ovn_datapath **array;
 };
 
+/* Track what's changed for a single LS.
+ * Now only track port changes. */
+struct ls_change {
+    struct ovs_list list_node;
+    struct ovn_datapath *od;
+    struct ovs_list added_ports;
+    struct ovs_list deleted_ports;
+    struct ovs_list updated_ports;
+};
+
+/* Track what's changed for logical switches.
+ * Now only track updated ones (added or deleted may be supported in the
+ * future). */
+struct tracked_ls_changes {
+    struct ovs_list updated; /* Contains struct ls_change */
+};
 
 struct northd_data {
     /* Global state for 'en-northd'. */
@@ -98,6 +115,8 @@  struct northd_data {
     bool ovn_internal_version_changed;
     struct chassis_features features;
     struct sset svc_monitor_lsps;
+    bool change_tracked;
+    struct tracked_ls_changes tracked_ls_changes;
 };
 
 struct lflow_data {
@@ -292,13 +311,19 @@  struct ovn_datapath {
     /* Port groups related to the datapath, used only when nbs is NOT NULL. */
     struct hmap nb_pgs;
 
-    struct ovs_list port_list;
+    /* Map of ovn_port objects belonging to this datapath.
+     * This map doesn't include derived ports. */
+    struct hmap ports;
 };
 
 void northd_run(struct northd_input *input_data,
                 struct northd_data *data,
                 struct ovsdb_idl_txn *ovnnb_txn,
                 struct ovsdb_idl_txn *ovnsb_txn);
+bool northd_handle_ls_changes(struct ovsdb_idl_txn *,
+                              const struct northd_input *,
+                              struct northd_data *);
+void destroy_northd_data_tracked_changes(struct northd_data *);
 void northd_destroy(struct northd_data *data);
 void northd_init(struct northd_data *data);
 void northd_indices_create(struct northd_data *data,
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 6736429ae201..74ae84530112 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9481,3 +9481,61 @@  AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep priority=110 | gre
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([LSP incremental processing])
+ovn_start
+
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.11
+ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 external_ids:iface-id=lsp0-0
+ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1 external_ids:iface-id=lsp0-1
+ovs-vsctl add-port br-int lsp0-2 -- set interface lsp0-2 external_ids:iface-id=lsp0-2
+
+check ovn-nbctl --wait=hv ls-add ls0
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 "unknown"
+OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute` = 5])
+OVS_WAIT_UNTIL([test `as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute` = 5])
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1 "aa:aa:aa:00:00:01 192.168.0.11"
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [3
+])
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [6
+])
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=hv lsp-add ls0 lsp0-2 -- lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:02 192.168.0.12"
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [3
+])
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [6
+])
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=hv lsp-del lsp0-1
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [1
+])
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [2
+])
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=hv lsp-set-addresses lsp0-2 "aa:aa:aa:00:00:88 192.168.0.88"
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [1
+])
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats lflow recompute], [0], [2
+])
+
+# No change, no recompute
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl --wait=sb sync
+AT_CHECK([as northd ovn-appctl -t NORTHD_TYPE inc-engine/show-stats northd recompute], [0], [0
+])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])