diff mbox series

[ovs-dev,v2] controller: avoid recomputes triggered by SBDB Port_Binding updates.

Message ID 20220615094610.2343717-1-xsimonar@redhat.com
State Superseded
Headers show
Series [ovs-dev,v2] controller: avoid recomputes triggered by SBDB Port_Binding updates. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Xavier Simonart June 15, 2022, 9:46 a.m. UTC
When VIF ports are claimed on a chassis, SBDB Port_Binding table is updated.
If the SBDB IDL is still is read-only ("in transaction") when such a update
is required, the update is not possible and recompute is triggered through
I+P failure.

This situation can happen:
- after updating Port_Binding->chassis to SBDB for one port, in a following
  iteration, ovn-controller handles Interface:external_ids:ovn-installed
  (for the same port) while SBDB is still read-only.
- after updating Port_Binding->chassis to SBDB for one port, in a following
  iteration, ovn-controller updates Port_Binding->chassis for another port,
  while SBDB is still read-only.

This patch prevent the recompute, by having the if-status module
updating the Port_Binding chassis (if needed) when possible.
This does not delay Port_Binding chassis update compared to before this patch.
- With the patch, Port_Binding chassis will be updated as soon as SBDB is
again writable, without recompute.
- Without the patch, Port_Binding chassis was updated as soon as SBDB was
again writable, through a recompute.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
Signed-off-by: Xavier Simonart <xsimonar@redhat.com>

---
v2:  - handled Dumitru's comments.
     - handled Han's comments, mainly ensure we moved out of CLAIMED state
       only after updating pb->chassis to guarentee physical flows are installed
       when ovn-installed is updated in OVS.
     - slighly reorganize the code to isolate 'notify_up = false' cases in
       claim_port (i.e. ports such as virtual ports), in the idea of making
       future patch preventing recomputes when virtual ports are claimed.
     - updated test case to cause more race conditions.
     - rebased on origin/main
     - note that "additional chassis" as now supported by
       "Support LSP:options:requested-chassis as a list" might still cause
       recomputes.
     - fixed missing flows when Port_Binding chassis was updated by mgr_update
       w/o any lflow recalculation.
---
 controller/binding.c        | 154 ++++++++++++++++++++++---------
 controller/binding.h        |  15 ++-
 controller/if-status.c      | 179 ++++++++++++++++++++++++++++++++----
 controller/if-status.h      |  17 +++-
 controller/ovn-controller.c | 121 +++++++++++++++++++++++-
 tests/ovn.at                | 113 ++++++++++++++++++++++-
 6 files changed, 528 insertions(+), 71 deletions(-)

Comments

Dumitru Ceara June 16, 2022, 10:55 a.m. UTC | #1
On 6/15/22 11:46, Xavier Simonart wrote:
> When VIF ports are claimed on a chassis, SBDB Port_Binding table is updated.
> If the SBDB IDL is still is read-only ("in transaction") when such a update
> is required, the update is not possible and recompute is triggered through
> I+P failure.
> 
> This situation can happen:
> - after updating Port_Binding->chassis to SBDB for one port, in a following
>   iteration, ovn-controller handles Interface:external_ids:ovn-installed
>   (for the same port) while SBDB is still read-only.
> - after updating Port_Binding->chassis to SBDB for one port, in a following
>   iteration, ovn-controller updates Port_Binding->chassis for another port,
>   while SBDB is still read-only.
> 
> This patch prevent the recompute, by having the if-status module
> updating the Port_Binding chassis (if needed) when possible.
> This does not delay Port_Binding chassis update compared to before this patch.
> - With the patch, Port_Binding chassis will be updated as soon as SBDB is
> again writable, without recompute.
> - Without the patch, Port_Binding chassis was updated as soon as SBDB was
> again writable, through a recompute.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
> 
> ---

Hi Xavier,

Overall I think the approach is good.  I do have a question related to
I-P of tracked datapaths/ports below and some more rather minor comments.

> v2:  - handled Dumitru's comments.
>      - handled Han's comments, mainly ensure we moved out of CLAIMED state
>        only after updating pb->chassis to guarentee physical flows are installed
>        when ovn-installed is updated in OVS.
>      - slighly reorganize the code to isolate 'notify_up = false' cases in
>        claim_port (i.e. ports such as virtual ports), in the idea of making
>        future patch preventing recomputes when virtual ports are claimed.
>      - updated test case to cause more race conditions.
>      - rebased on origin/main
>      - note that "additional chassis" as now supported by
>        "Support LSP:options:requested-chassis as a list" might still cause
>        recomputes.
>      - fixed missing flows when Port_Binding chassis was updated by mgr_update
>        w/o any lflow recalculation.
> ---
>  controller/binding.c        | 154 ++++++++++++++++++++++---------
>  controller/binding.h        |  15 ++-
>  controller/if-status.c      | 179 ++++++++++++++++++++++++++++++++----
>  controller/if-status.h      |  17 +++-
>  controller/ovn-controller.c | 121 +++++++++++++++++++++++-
>  tests/ovn.at                | 113 ++++++++++++++++++++++-
>  6 files changed, 528 insertions(+), 71 deletions(-)
> 
> diff --git a/controller/binding.c b/controller/binding.c
> index 2279570f9..b21577f71 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -644,11 +644,17 @@ local_binding_get_lport_ofport(const struct shash *local_bindings,
>  }
>  
>  bool
> -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> +                    const struct sbrec_chassis *chassis_rec)
>  {
>      struct local_binding *lbinding =
>          local_binding_find(local_bindings, pb_name);
>      struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
> +
> +    if (b_lport && b_lport->pb->chassis != chassis_rec) {
> +        return false;
> +    }
> +
>      if (lbinding && b_lport && lbinding->iface) {
>          if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
>              return false;
> @@ -660,13 +666,23 @@ local_binding_is_up(struct shash *local_bindings, const char *pb_name)
>  }
>  
>  bool
> -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> +                      const struct sbrec_chassis *chassis_rec)
>  {
>      struct local_binding *lbinding =
>          local_binding_find(local_bindings, pb_name);
>  
>      struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
>  
> +    if (b_lport) {
> +        if (b_lport->pb->chassis == chassis_rec) {
> +            return false;
> +        } else if (b_lport->pb->chassis) {
> +            VLOG_DBG("lport %s already claimed by other chassis",
> +                     b_lport->pb->logical_port);
> +        }
> +    }
> +
>      if (!lbinding) {
>          return true;
>      }
> @@ -884,37 +900,80 @@ get_lport_type_str(enum en_lport_type lport_type)
>      OVS_NOT_REACHED();
>  }
>  
> -/* For newly claimed ports, if 'notify_up' is 'false':
> +void
> +set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
> +                        const struct sbrec_chassis *chassis_rec,
> +                        bool is_set)
> +{
> +    if (pb->chassis != chassis_rec) {
> +         if (is_set) {
> +            if (pb->chassis) {
> +                VLOG_INFO("Changing chassis for lport %s from %s to %s.",
> +                          pb->logical_port, pb->chassis->name,
> +                          chassis_rec->name);
> +            } else {
> +                VLOG_INFO("Claiming lport %s for this chassis.",
> +                          pb->logical_port);
> +            }
> +            for (int i = 0; i < pb->n_mac; i++) {
> +                VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
> +            }
> +            sbrec_port_binding_set_chassis(pb, chassis_rec);
> +        }
> +    } else if (!is_set) {
> +        sbrec_port_binding_set_chassis(pb, NULL);
> +    }
> +}
> +
> +void
> +local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
> +                     const struct sbrec_chassis *chassis_rec,
> +                     struct hmap *tracked_datapaths, bool is_set)
> +{
> +    struct local_binding *lbinding =
> +        local_binding_find(local_bindings, pb_name);
> +    struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
> +
> +    if (b_lport) {
> +        set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec, is_set);
> +        if (tracked_datapaths) {
> +            update_lport_tracking(b_lport->pb, tracked_datapaths, true);
> +        }
> +    }
> +}
> +
> +/* For newly claimed ports:
>   * - set the 'pb.up' field to true if 'pb' has no 'parent_pb'.
>   * - set the 'pb.up' field to true if 'parent_pb.up' is 'true' (e.g., for
>   *   container and virtual ports).
> - * Otherwise request a notification to be sent when the OVS flows
> - * corresponding to 'pb' have been installed.
> + *
> + * Returns false if lport is not claimed due to 'sb_readonly'.
> + * Returns true otherwise.
>   *
>   * Note:
> - *   Updates (directly or through a notification) the 'pb->up' field only if
> - *   it's explicitly set to 'false'.
> + *   Updates the 'pb->up' field only if it's explicitly set to 'false'.
>   *   This is to ensure compatibility with older versions of ovn-northd.
>   */
> -static void
> +static bool
>  claimed_lport_set_up(const struct sbrec_port_binding *pb,
>                       const struct sbrec_port_binding *parent_pb,
> -                     const struct sbrec_chassis *chassis_rec,
> -                     bool notify_up, struct if_status_mgr *if_mgr)
> +                     bool sb_readonly)
>  {
> -    if (!notify_up) {
> -        bool up = true;
> -        if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
> +    /* When notify_up is false in claim_port(), no state is created
> +     * by if_status_mgr. In such cases, return false (i.e. trigger recompute)
> +     * if we can't update sb (because it is readonly).
> +     */
> +    bool up = true;
> +    if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
> +        if (!sb_readonly) {
>              if (pb->n_up) {
>                  sbrec_port_binding_set_up(pb, &up, 1);
>              }
> +        } else if (pb->n_up && !pb->up[0]) {
> +            return false;
>          }
> -        return;
> -    }
> -
> -    if (pb->chassis != chassis_rec || (pb->n_up && !pb->up[0])) {
> -        if_status_mgr_claim_iface(if_mgr, pb->logical_port);
>      }
> +    return true;
>  }
>  
>  typedef void (*set_func)(const struct sbrec_port_binding *pb,
> @@ -1057,37 +1116,35 @@ claim_lport(const struct sbrec_port_binding *pb,
>              struct hmap *tracked_datapaths,
>              struct if_status_mgr *if_mgr)
>  {
> -    if (!sb_readonly) {
> -        claimed_lport_set_up(pb, parent_pb, chassis_rec, notify_up, if_mgr);
> -    }
> -
>      enum can_bind can_bind = lport_can_bind_on_this_chassis(chassis_rec, pb);
>      bool update_tracked = false;
>  
>      if (can_bind == CAN_BIND_AS_MAIN) {
>          if (pb->chassis != chassis_rec) {
> -            if (sb_readonly) {
> -                return false;
> -            }
> -
> -            if (pb->chassis) {
> -                VLOG_INFO("Changing chassis for lport %s from %s to %s.",
> -                        pb->logical_port, pb->chassis->name,
> -                        chassis_rec->name);
> -            } else {
> -                VLOG_INFO("Claiming lport %s for this chassis.",
> -                          pb->logical_port);
> -            }
> -            for (size_t i = 0; i < pb->n_mac; i++) {
> -                VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
> -            }
> -
> -            sbrec_port_binding_set_chassis(pb, chassis_rec);
>              if (is_additional_chassis(pb, chassis_rec)) {
> +                if (sb_readonly) {
> +                    return false;
> +                }
>                  remove_additional_chassis(pb, chassis_rec);
>              }
>              update_tracked = true;
>          }
> +        if (!notify_up) {
> +            if (!claimed_lport_set_up(pb, parent_pb, sb_readonly)) {
> +                return false;
> +            }
> +            if (pb->chassis != chassis_rec) {
> +                if (sb_readonly) {
> +                    return false;
> +                }
> +                set_pb_chassis_in_sbrec(pb, chassis_rec, true);
> +            }
> +        } else {
> +            if ((pb->chassis != chassis_rec) || (pb->n_up && !pb->up[0])) {
> +                if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
> +                                          sb_readonly);
> +            }
> +        }
>      } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
>          if (!is_additional_chassis(pb, chassis_rec)) {
>              if (sb_readonly) {
> @@ -1132,7 +1189,8 @@ claim_lport(const struct sbrec_port_binding *pb,
>   */
>  static bool
>  release_lport_main_chassis(const struct sbrec_port_binding *pb,
> -                           bool sb_readonly)
> +                           bool sb_readonly,
> +                           struct if_status_mgr *if_mgr)
>  {
>      if (pb->encap) {
>          if (sb_readonly) {
> @@ -1141,11 +1199,13 @@ release_lport_main_chassis(const struct sbrec_port_binding *pb,
>          sbrec_port_binding_set_encap(pb, NULL);
>      }
>  
> +    /* If sb readonly, pb->chassis unset through if-status if present. */
>      if (pb->chassis) {
> -        if (sb_readonly) {
> +        if (!sb_readonly) {
> +            sbrec_port_binding_set_chassis(pb, NULL);
> +        } else if (!if_status_mgr_iface_is_present(if_mgr, pb->logical_port)) {

I would really love it if we could get rid of the !notify_up special
case for ports with parents in the future.  That will also allow us to
get rid of this check too, right?  It can be a future patch though.

>              return false;
>          }
> -        sbrec_port_binding_set_chassis(pb, NULL);
>      }
>  
>      if (pb->virtual_parent) {
> @@ -1155,7 +1215,8 @@ release_lport_main_chassis(const struct sbrec_port_binding *pb,
>          sbrec_port_binding_set_virtual_parent(pb, NULL);
>      }
>  
> -    VLOG_INFO("Releasing lport %s from this chassis.", pb->logical_port);
> +    VLOG_INFO("Releasing lport %s from this chassis (sb_readonly=%d)",
> +              pb->logical_port, sb_readonly);
>      return true;
>  }
>  
> @@ -1189,7 +1250,7 @@ release_lport(const struct sbrec_port_binding *pb,
>                struct hmap *tracked_datapaths, struct if_status_mgr *if_mgr)
>  {
>      if (pb->chassis == chassis_rec) {
> -        if (!release_lport_main_chassis(pb, sb_readonly)) {
> +        if (!release_lport_main_chassis(pb, sb_readonly, if_mgr)) {
>              return false;
>          }
>      } else if (is_additional_chassis(pb, chassis_rec)) {
> @@ -1271,7 +1332,7 @@ consider_vif_lport_(const struct sbrec_port_binding *pb,
>                               b_lport->lbinding->iface,
>                               !b_ctx_in->ovnsb_idl_txn,
>                               !parent_pb, b_ctx_out->tracked_dp_bindings,
> -                             b_ctx_out->if_mgr)){
> +                             b_ctx_out->if_mgr)) {
>                  return false;
>              }
>  
> @@ -1527,7 +1588,8 @@ consider_localport(const struct sbrec_port_binding *pb,
>      enum can_bind can_bind = lport_can_bind_on_this_chassis(
>          b_ctx_in->chassis_rec, pb);
>      if (can_bind == CAN_BIND_AS_MAIN) {
> -        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn)) {
> +        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn,
> +            b_ctx_out->if_mgr)) {
>              return false;
>          }
>      } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
> diff --git a/controller/binding.h b/controller/binding.h
> index 1fed06674..d20659b0b 100644
> --- a/controller/binding.h
> +++ b/controller/binding.h
> @@ -151,8 +151,10 @@ const struct sbrec_port_binding *local_binding_get_primary_pb(
>  ofp_port_t local_binding_get_lport_ofport(const struct shash *local_bindings,
>                                            const char *pb_name);
>  
> -bool local_binding_is_up(struct shash *local_bindings, const char *pb_name);
> -bool local_binding_is_down(struct shash *local_bindings, const char *pb_name);
> +bool local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> +                         const struct sbrec_chassis *);
> +bool local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> +                           const struct sbrec_chassis *);
>  void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
>                            const struct sbrec_chassis *chassis_rec,
>                            const char *ts_now_str, bool sb_readonly,
> @@ -160,7 +162,10 @@ void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
>  void local_binding_set_down(struct shash *local_bindings, const char *pb_name,
>                              const struct sbrec_chassis *chassis_rec,
>                              bool sb_readonly, bool ovs_readonly);
> -
> +void local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
> +                          const struct sbrec_chassis *chassis_rec,
> +                          struct hmap *tracked_datapaths,
> +                          bool is_set);
>  void binding_register_ovs_idl(struct ovsdb_idl *);
>  void binding_run(struct binding_ctx_in *, struct binding_ctx_out *);
>  bool binding_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
> @@ -178,6 +183,10 @@ void binding_dump_local_bindings(struct local_binding_data *, struct ds *);
>  bool is_additional_chassis(const struct sbrec_port_binding *pb,
>                             const struct sbrec_chassis *chassis_rec);
>  
> +void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
> +                             const struct sbrec_chassis *chassis_rec,
> +                             bool is_set);
> +
>  /* Corresponds to each Port_Binding.type. */
>  enum en_lport_type {
>      LP_UNKNOWN,
> diff --git a/controller/if-status.c b/controller/if-status.c
> index ad61844d8..5b0eef859 100644
> --- a/controller/if-status.c
> +++ b/controller/if-status.c
> @@ -24,6 +24,7 @@
>  #include "lib/util.h"
>  #include "timeval.h"
>  #include "openvswitch/vlog.h"
> +#include "lib/ovn-sb-idl.h"
>  
>  VLOG_DEFINE_THIS_MODULE(if_status);
>  
> @@ -53,9 +54,11 @@ VLOG_DEFINE_THIS_MODULE(if_status);
>   */
>  
>  enum if_state {
> -    OIF_CLAIMED,       /* Newly claimed interface. */
> -    OIF_INSTALL_FLOWS, /* Already claimed interface for which flows are still
> -                        * being installed.
> +    OIF_CLAIMED,       /* Newly claimed interface. pb->chassis not yet updated.
> +                        */
> +    OIF_INSTALL_FLOWS, /* Claimed interface with pb->chassis successfully
> +                        * updated in SB and for which flows are still being
> +                        * installed.
>                          */
>      OIF_MARK_UP,       /* Interface with flows successfully installed in OVS
>                          * but not yet marked "up" in the binding module (in
> @@ -78,6 +81,55 @@ static const char *if_state_names[] = {
>      [OIF_INSTALLED]     = "INSTALLED",
>  };
>  
> +/*
> + *       +----------------------+
> + * +---> |                      |
> + * | +-> |         NULL         | <--------------------------------------+++-+
> + * | |   +----------------------+                                            |
> + * | |     ^ release_iface   | claim_iface                                   |
> + * | |     |                 V - sbrec_update_chassis(if sb is rw)           |
> + * | |   +----------------------+                                            |
> + * | |   |                      | <----------------------------------------+ |
> + * | |   |       CLAIMED        | <--------------------------------------+ | |
> + * | |   +----------------------+                                        | | |
> + * | |                  | mgr_update(when sb is rw)                      | | |
> + * | | release_iface    |  - sbrec_update_chassis                        | | |
> + * | |                  |  - request seqno                               | | |
> + * | |                  V                                                | | |
> + * | |   +----------------------+                                        | | |
> + * | +-- |                      |  mgr_run(seqno not rcvd)               | | |
> + * |     |    INSTALL_FLOWS     |   - set port down in sb                | | |
> + * |     |                      |  mgr_update()                          | | |
> + * |     +----------------------+   - sbrec_update_chassis if needed     | | |
> + * |                    |                                                | | |
> + * |                    |  mgr_run(seqno rcvd)                           | | |
> + * |                    |  - set port up in sb                           | | |
> + * | release_iface      |  - set ovn-installed in ovs                    | | |
> + * |                    V                                                | | |
> + * |   +----------------------+                                          | | |
> + * |   |                      |  mgr_run()                               | | |
> + * +-- |       MARK_UP        |  - set port up in sb                     | | |
> + *     |                      |  - set ovn-installed in ovs              | | |
> + *     |                      |  mgr_update()                            | | |
> + *     +----------------------+  - sbrec_update_chassis if needed        | | |
> + *              |                                                        | | |
> + *              | mgr_update(rcvd port up / ovn_installed & chassis set) | | |
> + *              V                                                        | | |
> + *     +----------------------+                                          | | |
> + *     |      INSTALLED       | ------------> claim_iface ---------------+ | |
> + *     +----------------------+                                            | |
> + *              |                                                          | |
> + *              | release_iface                                            | |
> + *              V                                                          | |
> + *     +----------------------+                                            | |
> + *     |                      | ------------> claim_iface -----------------+ |
> + *     |      MARK_DOWN       | ------> mgr_update(rcvd port down) ----------+
> + *     |                      | mgr_run()
> + *     |                      | - set port down in sb
> + *     |                      | mgr_update()
> + *     +----------------------+ - sbrec_update_chassis(NULL)
> + */
> +

Thanks for adding this, it's really useful!

>  struct ovs_iface {
>      char *id;               /* Extracted from OVS external_ids.iface_id. */
>      enum if_state state;    /* State of the interface in the state machine. */
> @@ -85,6 +137,7 @@ struct ovs_iface {
>                               * be fully programmed in OVS.  Only used in state
>                               * OIF_INSTALL_FLOWS.
>                               */
> +    bool chassis_update_required;  /* If true, pb->chassis must be updated. */
>  };
>  
>  static uint64_t ifaces_usage;
> @@ -158,14 +211,23 @@ if_status_mgr_destroy(struct if_status_mgr *mgr)
>  }
>  
>  void
> -if_status_mgr_claim_iface(struct if_status_mgr *mgr, const char *iface_id)
> +if_status_mgr_claim_iface(struct if_status_mgr *mgr,
> +                          const struct sbrec_port_binding *pb,
> +                          const struct sbrec_chassis *chassis_rec,
> +                          bool sb_readonly)
>  {
> +    const char *iface_id = pb->logical_port;
>      struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
>  
>      if (!iface) {
>          iface = ovs_iface_create(mgr, iface_id, OIF_CLAIMED);
>      }
> -
> +    if (!sb_readonly) {
> +        set_pb_chassis_in_sbrec(pb, chassis_rec, true);
> +        iface->chassis_update_required = false;
> +    } else {
> +        iface->chassis_update_required = true;
> +    }
>      switch (iface->state) {
>      case OIF_CLAIMED:
>      case OIF_INSTALL_FLOWS:
> @@ -182,6 +244,13 @@ if_status_mgr_claim_iface(struct if_status_mgr *mgr, const char *iface_id)
>      }
>  }
>  
> +bool
> +if_status_mgr_iface_is_present(struct if_status_mgr *mgr, const char *iface_id)
> +{
> +    struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
> +    return (!!iface);

Nit: I'd rewrite this as:

return !!shash_find_data(&mgr->ifaces, iface_id);


> +}
> +
>  void
>  if_status_mgr_release_iface(struct if_status_mgr *mgr, const char *iface_id)
>  {
> @@ -246,9 +315,43 @@ if_status_mgr_delete_iface(struct if_status_mgr *mgr, const char *iface_id)
>      }
>  }
>  
> +bool
> +if_status_handle_claims(struct if_status_mgr *mgr,
> +                        struct local_binding_data *binding_data,
> +                        const struct sbrec_chassis *chassis_rec,
> +                        struct hmap *tracked_datapath,
> +                        bool sb_readonly)
> +{
> +    if (!binding_data) {
> +        return false;
> +    }
> +
> +    struct shash *bindings = &binding_data->bindings;
> +    struct hmapx_node *node;
> +
> +    bool rc = false;
> +    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {

We don't need the _SAFE here as far as I can tell, right?

> +        struct ovs_iface *iface = node->data;
> +        if (sb_readonly) {
> +            return false;
> +        }
> +        if (iface->chassis_update_required) {
> +            VLOG_INFO("if_status_handle_claims for %s", iface->id);
> +            local_binding_set_pb(bindings, iface->id, chassis_rec,
> +                                 tracked_datapath, true);
> +            rc = true;
> +        }
> +        iface->chassis_update_required = false;
> +    }
> +    return rc;
> +}
> +
>  void
>  if_status_mgr_update(struct if_status_mgr *mgr,
> -                     struct local_binding_data *binding_data)
> +                     struct local_binding_data *binding_data,
> +                     const struct sbrec_chassis *chassis_rec,
> +                     struct hmap *tracked_datapath,
> +                     bool sb_readonly)
>  {
>      if (!binding_data) {
>          return;
> @@ -257,13 +360,25 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>      struct shash *bindings = &binding_data->bindings;
>      struct hmapx_node *node;
>  
> +    /* Interfaces in OIF_MARK_UP state have already set their pb->chassis.
> +     * However, it might have been reset by another hv.
> +     */
>      /* Move all interfaces that have been confirmed "up" by the binding module,
>       * from OIF_MARK_UP to OIF_INSTALLED.
>       */
>      HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_UP]) {
>          struct ovs_iface *iface = node->data;
>  
> -        if (local_binding_is_up(bindings, iface->id)) {
> +        if (iface->chassis_update_required) {
> +            if (!sb_readonly) {
> +                iface->chassis_update_required = false;
> +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> +                                     tracked_datapath, true);
> +            } else {
> +                continue;
> +            }
> +        }
> +        if (local_binding_is_up(bindings, iface->id, chassis_rec)) {
>              ovs_iface_set_state(mgr, iface, OIF_INSTALLED);
>          }
>      }
> @@ -274,23 +389,53 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>      HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_DOWN]) {
>          struct ovs_iface *iface = node->data;
>  
> -        if (local_binding_is_down(bindings, iface->id)) {
> +        if (!sb_readonly) {
> +            local_binding_set_pb(bindings, iface->id, chassis_rec,
> +                                 tracked_datapath, false);
> +        }
> +        if (local_binding_is_down(bindings, iface->id, chassis_rec)) {
>              ovs_iface_destroy(mgr, iface);
>          }
>      }
>  
> -    /* Register for a notification about flows being installed in OVS for all
> -     * newly claimed interfaces.
> +    if (!sb_readonly) {
> +        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
> +            struct ovs_iface *iface = node->data;
> +
> +            if (iface->chassis_update_required) {
> +                iface->chassis_update_required = false;
> +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> +                                     tracked_datapath, true);
> +            }
> +        }
> +    }
> +
> +    /* Update Port_Binding->chassis for newly claimed interfaces
> +     * Register for a notification about flows being installed in OVS for all
> +     * newly claimed interfaces for which we could update pb->chassis.
>       *
>       * Move them from OIF_CLAIMED to OIF_INSTALL_FLOWS.
>       */
> -    bool new_ifaces = false;
> -    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
> -        struct ovs_iface *iface = node->data;
>  
> -        ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> -        iface->install_seqno = mgr->iface_seqno + 1;
> -        new_ifaces = true;
> +    bool new_ifaces = false;
> +    if (!sb_readonly) {
> +        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
> +            struct ovs_iface *iface = node->data;
> +            if (iface->chassis_update_required) {
> +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> +                                     tracked_datapath, true);
> +            }
> +            ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> +            iface->install_seqno = mgr->iface_seqno + 1;
> +            iface->chassis_update_required = false;
> +            new_ifaces = true;
> +        }
> +    } else {
> +        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
> +            struct ovs_iface *iface = node->data;
> +            VLOG_INFO("Not updating pb chassis for %s now as sb is readonly",
> +                     iface->id);

Should we rate limit this?
Nit: indentation.

> +        }
>      }
>  
>      /* Request a seqno update when the flows for new interfaces have been
> @@ -403,7 +548,7 @@ if_status_mgr_update_bindings(struct if_status_mgr *mgr,
>      struct hmapx_node *node;
>  
>      /* Notify the binding module to set "down" all bindings that are still
> -     * in the process of being installed in OVS, i.e., are not yet instsalled.
> +     * in the process of being installed in OVS, i.e., are not yet installed.
>       */
>      HMAPX_FOR_EACH (node, &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
>          struct ovs_iface *iface = node->data;
> diff --git a/controller/if-status.h b/controller/if-status.h
> index bb8a3950d..9b1200ff0 100644
> --- a/controller/if-status.h
> +++ b/controller/if-status.h
> @@ -27,15 +27,28 @@ struct if_status_mgr *if_status_mgr_create(void);
>  void if_status_mgr_clear(struct if_status_mgr *);
>  void if_status_mgr_destroy(struct if_status_mgr *);
>  
> -void if_status_mgr_claim_iface(struct if_status_mgr *, const char *iface_id);
> +void if_status_mgr_claim_iface(struct if_status_mgr *,
> +                               const struct sbrec_port_binding *pb,
> +                               const struct sbrec_chassis *chassis_rec,
> +                               bool sb_readonly);
>  void if_status_mgr_release_iface(struct if_status_mgr *, const char *iface_id);
>  void if_status_mgr_delete_iface(struct if_status_mgr *, const char *iface_id);
>  
> -void if_status_mgr_update(struct if_status_mgr *, struct local_binding_data *);
> +void if_status_mgr_update(struct if_status_mgr *, struct local_binding_data *,
> +                          const struct sbrec_chassis *chassis,
> +                          struct hmap *tracked_dp_bindings,
> +                          bool sb_readonly);
>  void if_status_mgr_run(struct if_status_mgr *mgr, struct local_binding_data *,
>                         const struct sbrec_chassis *,
>                         bool sb_readonly, bool ovs_readonly);
>  void if_status_mgr_get_memory_usage(struct if_status_mgr *mgr,
>                                      struct simap *usage);
> +bool if_status_mgr_iface_is_present(struct if_status_mgr *mgr,
> +                                    const char *iface_id);
> +bool if_status_handle_claims(struct if_status_mgr *mgr,
> +                             struct local_binding_data *binding_data,
> +                             const struct sbrec_chassis *chassis_rec,
> +                             struct hmap *tracked_datapath,
> +                             bool sb_readonly);
>  
>  # endif /* controller/if-status.h */
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 2793c8687..1c3cac2c1 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1417,6 +1417,111 @@ en_runtime_data_run(struct engine_node *node, void *data)
>      engine_set_node_state(node, EN_UPDATED);
>  }
>  
> +struct ed_type_sb_ro {
> +    bool sb_readonly;
> +};
> +
> +static void *
> +en_sb_ro_init(struct engine_node *node OVS_UNUSED,
> +              struct engine_arg *arg OVS_UNUSED)
> +{
> +    struct ed_type_sb_ro *data = xzalloc(sizeof *data);
> +    return data;
> +}
> +
> +static void
> +en_sb_ro_run(struct engine_node *node, void *data)
> +{
> +    struct ed_type_sb_ro *sb_ro_data = data;
> +    bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
> +    if (sb_ro_data->sb_readonly != sb_readonly) {
> +        sb_ro_data->sb_readonly = sb_readonly;
> +        if (!sb_ro_data->sb_readonly) {
> +            engine_set_node_state(node, EN_UPDATED);
> +        }
> +    }
> +}
> +
> +static void
> +en_sb_ro_cleanup(void *data OVS_UNUSED)
> +{
> +}
> +
> +static bool
> +pb_claims_sb_ro_handler(struct engine_node *node, void *data OVS_UNUSED)
> +{
> +   engine_set_node_state(node, EN_UPDATED);
> +   return true;
> +}
> +
> +static void *
> +en_pb_claims_init(struct engine_node *node OVS_UNUSED,
> +                  struct engine_arg *arg OVS_UNUSED)
> +{
> +    return NULL;
> +}
> +
> +static void
> +en_pb_claims_run(struct engine_node *node OVS_UNUSED, void *data_ OVS_UNUSED)
> +{
> +}
> +
> +static void
> +en_pb_claims_cleanup(void *data OVS_UNUSED)
> +{
> +}

Not really a problem of this patch but maybe we can find a way to avoid
having to explcitly define cleanup/run callbacks that don't do much.
Maybe some sensible defaults for engine node callbacks would make sense.
 Han, what do you think?

> +
> +static bool
> +pb_claims_runtime_data_handler(struct engine_node *node OVS_UNUSED,
> +                               void *data OVS_UNUSED)
> +{
> +    return true;
> +}
> +
> +static bool
> +lflow_output_runtime_data_handler(struct engine_node *node,
> +                                  void *data);
> +static bool
> +lflow_output_pb_claims_handler(struct engine_node *node, void *data OVS_UNUSED)
> +{
> +    const struct sbrec_chassis *chassis = NULL;
> +
> +    struct ovsrec_open_vswitch_table *ovs_table =
> +        (struct ovsrec_open_vswitch_table *)EN_OVSDB_GET(
> +            engine_get_input("OVS_open_vswitch", node));
> +
> +    const char *chassis_id = get_ovs_chassis_id(ovs_table);
> +
> +    struct ovsdb_idl_index *sbrec_chassis_by_name =
> +        engine_ovsdb_node_get_index(
> +                engine_get_input("SB_chassis", node),
> +                "name");
> +
> +    if (chassis_id) {
> +        chassis = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
> +    }
> +    if (chassis) {
> +        struct ed_type_runtime_data *runtime_data =
> +            engine_get_input_data("runtime_data", node);
> +        bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
> +        struct controller_engine_ctx *ctrl_ctx =
> +            engine_get_context()->client_ctx;
> +
> +        if (if_status_handle_claims(ctrl_ctx->if_mgr,
> +                                    &runtime_data->lbinding_data,
> +                                    chassis,
> +                                    &runtime_data->tracked_dp_bindings,
> +                                    sb_readonly)) {
> +            struct engine_node *rt_node =
> +                engine_get_input("runtime_data", node);
> +            if (!engine_node_changed(rt_node)) {
> +                lflow_output_runtime_data_handler(node, data);

Instead of explicitly calling another input's handler here I think I
would factor out the logic of the handler into a separate helper
function which we could call from lflow_output_runtime_data_handler()
and here.  Based on its return value we could set the node state, if needed.

> +            }
> +        }
> +    }
> +    return true;
> +}
> +
>  static bool
>  runtime_data_ovs_interface_shadow_handler(struct engine_node *node, void *data)
>  {
> @@ -3438,6 +3543,8 @@ main(int argc, char *argv[])
>      stopwatch_create(VIF_PLUG_RUN_STOPWATCH_NAME, SW_MS);
>  
>      /* Define inc-proc-engine nodes. */
> +    ENGINE_NODE(sb_ro, "sb_ro");
> +    ENGINE_NODE(pb_claims, "pb_claims");
>      ENGINE_NODE_WITH_CLEAR_TRACK_DATA_IS_VALID(ct_zones, "ct_zones");
>      ENGINE_NODE_WITH_CLEAR_TRACK_DATA(ovs_interface_shadow,
>                                        "ovs_interface_shadow");
> @@ -3477,6 +3584,13 @@ main(int argc, char *argv[])
>      engine_add_input(&en_non_vif_data, &en_ovs_interface,
>                       non_vif_data_ovs_iface_handler);
>  
> +    /* This ensures we run after runtime data. */
> +    engine_add_input(&en_pb_claims, &en_runtime_data,
> +                     pb_claims_runtime_data_handler);

This is neat but I think we need to at least add a comment to
lflow_output_pb_claims_handler() to explain why it's ok to access the
runtime_data input there.

Also, we can use engine_noop_handler() instead of defining an explicit one.

> +
> +    /* This ensures we always run when sb becomes writable. */
> +    engine_add_input(&en_pb_claims, &en_sb_ro, pb_claims_sb_ro_handler);
> +
>      /* Note: The order of inputs is important, all OVS interface changes must
>       * be handled before any ct_zone changes.
>       */
> @@ -3518,6 +3632,8 @@ main(int argc, char *argv[])
>                       lflow_output_port_groups_handler);
>      engine_add_input(&en_lflow_output, &en_runtime_data,
>                       lflow_output_runtime_data_handler);
> +    engine_add_input(&en_lflow_output, &en_pb_claims,
> +                     lflow_output_pb_claims_handler);
>      engine_add_input(&en_lflow_output, &en_non_vif_data,
>                       NULL);
>  
> @@ -3999,7 +4115,10 @@ main(int argc, char *argv[])
>                          runtime_data ? &runtime_data->lbinding_data : NULL;
>                      stopwatch_start(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
>                                      time_msec());
> -                    if_status_mgr_update(if_mgr, binding_data);
> +                    if_status_mgr_update(if_mgr, binding_data, chassis,
> +                                         (runtime_data
> +                                          ? &runtime_data->tracked_dp_bindings
> +                                          : NULL), !ovnsb_idl_txn);

I'm not completely sure that this is correct.  The engine has already
run so any nodes that depend on runtime_data have already been processed.

But now we might be adding more "tracked changes" to runtime_data, e.g.:

if_status_mgr_update()
  -> local_binding_set_pb()
    -> update_lport_tracking()
        -> tracked_datapath_lport_add(pb, TRACKED_RESOURCE_NEW, ..)

After this point these tracked changes will not be processed anymore and
will be cleared before the next run starts in the next iteration in
en_runtime_data_clear_tracked_data().

The question is: are we ever adding more "tracked changes" to
runtime_data->tracked_dp_bindings here?  If the answer is "yes" then
this approach might be problematic.  If the answer is "no" then we might
as well avoid passing the runtime_data->tracked_dp_bindings all the way
down and just call local_binding_set_pb() with tracked_datapaths set to
NULL where needed.

>                      stopwatch_stop(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
>                                     time_msec());
>  
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 59d51f3e0..e9c061939 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -14990,7 +14990,10 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep actions=output:1],
>  echo "verifying that lsp0 binding moves when requested-chassis is changed"
>  
>  ovn-nbctl lsp-set-options lsp0 requested-chassis=hv2
> -OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
> +
> +# We might see multiple "Releasing lport ...", when sb is read only
> +OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
> +
>  wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
>  
>  # (6) Chassis hv2 should add flows and hv1 should not.
> @@ -15036,7 +15039,7 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=0 | grep in_port=1], [0], [ig
>  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep actions=output:1], [0], [ignore])
>  
>  check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=non-existant-chassis
> -OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
> +OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
>  check ovn-nbctl --wait=hv sync
>  wait_column '' Port_Binding chasssi logical_port=lsp0
>  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=0 | grep in_port=1], [1], [])
> @@ -32041,3 +32044,109 @@ AT_CHECK([test $(ovn-sbctl list fdb | grep -c "00:00:00:00:10:30") = 0])
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
> +
> +# RUN_OVN_NBCTL() executes list of commands built by the OVN_NBCTL() macro.
> +m4_define([OVN_NBCTL], [
> +    command="${command} -- $1"
> +])
> +
> +m4_define([RUN_OVN_NBCTL], [
> +    check ovn-nbctl ${command}
> +    unset command
> +])

Why can't we factor this out to ovn-macros.at (and remove the version
from perf-northd.at too)?

> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([recomputes])

As this is a test that is also performance related I think we could add
AT_KEYWORDS([performance]).  We could also add that to the tests in
perf-northd.at.  But we can do all this as a follow up, I guess, to
allow us to run all performance related tests separately if needed.

> +ovn_start
> +
> +n_hv=4
> +
> +# Add chassis
> +net_add n1
> +for i in $(seq 1 $n_hv); do
> +    sim_add hv$i
> +    as hv$i
> +    check ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovn_attach n1 br-phys 192.168.0.$i 24 geneve
> +done
> +
> +add_switch_ports() {
> +    start_port=$1
> +    end_port=$2
> +    nb_hv=$3
> +    bulk_size=$4
> +    for ((i=start_port; i<end_port; )) do
> +        start_bulk=$i
> +        for hv in $(seq 1 $nb_hv); do
> +            end_bulk=$((start_bulk+bulk_size-1))
> +            for port in $(seq $start_bulk $end_bulk); do
> +                logical_switch_port=lsp${port}
> +                OVN_NBCTL(lsp-add ls1 $logical_switch_port)
> +                OVN_NBCTL(lsp-set-addresses $logical_switch_port dynamic)
> +            done
> +            start_bulk=$((end_bulk+1))
> +        done
> +        RUN_OVN_NBCTL()
> +
> +        start_bulk=$i
> +        for hv in $(seq 1 $nb_hv); do
> +            end_bulk=$((start_bulk+bulk_size-1))
> +            for port in $(seq $start_bulk $end_bulk); do
> +                logical_switch_port=lsp${port}
> +                as hv$hv ovs-vsctl \
> +                    --no-wait -- add-port br-int vif${port} \
> +                    -- set Interface vif${port} external_ids:iface-id=$logical_switch_port
> +            done
> +            start_bulk=$((end_bulk+1))
> +        done
> +        i=$((end_bulk+1))
> +    done
> +
> +    # Claiming still WIP here but check for recomputes so test can fail faster
> +    AT_CHECK([test $(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) == $lflow_run])
> +}
> +check ovn-nbctl ls-add ls1
> +check ovn-nbctl set Logical_Switch ls1 other_config:subnet=10.1.0.0/16
> +check ovn-nbctl set Logical_Switch ls1 other_config:exclude_ips=10.1.255.254
> +
> +check ovn-nbctl lr-add lr1
> +check ovn-nbctl lsp-add ls1 lsp0 -- set Logical_Switch_Port lsp0 type=router options:router-port=lrp0 addresses=dynamic
> +check ovn-nbctl lrp-add lr1 lrp0 "f0:00:00:01:00:01" 10.1.255.254/16
> +check ovn-nbctl lr-nat-add lr1 snat 10.2.0.1 10.1.0.0/16
> +
> +check ovn-nbctl --wait=hv sync
> +lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
> +
> +add_switch_ports 1 1000 $n_hv 5
> +
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +for i in $(seq 1 $n_hv); do
> +    pid=$(cat hv${i}/ovn-controller.pid)
> +    u=$((u+$(cat "/proc/$pid/stat" | awk '{print $14}')))
> +    s=$((s+$(cat "/proc/$pid/stat" | awk '{print $15}')))
> +done
> +
> +n_pid=$(cat northd/ovn-northd.pid)
> +n_u=$(cat "/proc/$pid/stat" | awk '{print $14}')
> +n_s=$(cat "/proc/$pid/stat" | awk '{print $15}')
> +
> +echo "Total Northd User Time: $n_u"
> +echo "Total Northd System Time: $n_s"
> +echo "Total Controller User Time: $u"
> +echo "Total Controller System Time: $s"
> +
> +lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
> +n_recomputes=`expr $lflow_run1 - $lflow_run`
> +echo "$n_recomputes recomputes"
> +
> +AT_CHECK([test $lflow_run1 == $lflow_run])
> +
> +for i in $(seq 2 $n_hv); do
> +    OVN_CLEANUP_SBOX([hv$i])
> +done
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +])

Thanks,
Dumitru
Xavier Simonart June 22, 2022, 10:19 a.m. UTC | #2
Hi Dumitru

Thanks for the review and the comments.
I'll send an update (v3) taking into account your comments.
See comments/feedback embedded below

Thanks
Xavier

On Thu, Jun 16, 2022 at 12:55 PM Dumitru Ceara <dceara@redhat.com> wrote:

> On 6/15/22 11:46, Xavier Simonart wrote:
> > When VIF ports are claimed on a chassis, SBDB Port_Binding table is
> updated.
> > If the SBDB IDL is still is read-only ("in transaction") when such a
> update
> > is required, the update is not possible and recompute is triggered
> through
> > I+P failure.
> >
> > This situation can happen:
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller handles Interface:external_ids:ovn-installed
> >   (for the same port) while SBDB is still read-only.
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller updates Port_Binding->chassis for another
> port,
> >   while SBDB is still read-only.
> >
> > This patch prevent the recompute, by having the if-status module
> > updating the Port_Binding chassis (if needed) when possible.
> > This does not delay Port_Binding chassis update compared to before this
> patch.
> > - With the patch, Port_Binding chassis will be updated as soon as SBDB is
> > again writable, without recompute.
> > - Without the patch, Port_Binding chassis was updated as soon as SBDB was
> > again writable, through a recompute.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> > Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
> >
> > ---
>
> Hi Xavier,
>
> Overall I think the approach is good.  I do have a question related to
> I-P of tracked datapaths/ports below and some more rather minor comments.
>
> > v2:  - handled Dumitru's comments.
> >      - handled Han's comments, mainly ensure we moved out of CLAIMED
> state
> >        only after updating pb->chassis to guarentee physical flows are
> installed
> >        when ovn-installed is updated in OVS.
> >      - slighly reorganize the code to isolate 'notify_up = false' cases
> in
> >        claim_port (i.e. ports such as virtual ports), in the idea of
> making
> >        future patch preventing recomputes when virtual ports are claimed.
> >      - updated test case to cause more race conditions.
> >      - rebased on origin/main
> >      - note that "additional chassis" as now supported by
> >        "Support LSP:options:requested-chassis as a list" might still
> cause
> >        recomputes.
> >      - fixed missing flows when Port_Binding chassis was updated by
> mgr_update
> >        w/o any lflow recalculation.
> > ---
> >  controller/binding.c        | 154 ++++++++++++++++++++++---------
> >  controller/binding.h        |  15 ++-
> >  controller/if-status.c      | 179 ++++++++++++++++++++++++++++++++----
> >  controller/if-status.h      |  17 +++-
> >  controller/ovn-controller.c | 121 +++++++++++++++++++++++-
> >  tests/ovn.at                | 113 ++++++++++++++++++++++-
> >  6 files changed, 528 insertions(+), 71 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 2279570f9..b21577f71 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -644,11 +644,17 @@ local_binding_get_lport_ofport(const struct shash
> *local_bindings,
> >  }
> >
> >  bool
> > -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> > +                    const struct sbrec_chassis *chassis_rec)
> >  {
> >      struct local_binding *lbinding =
> >          local_binding_find(local_bindings, pb_name);
> >      struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> > +
> > +    if (b_lport && b_lport->pb->chassis != chassis_rec) {
> > +        return false;
> > +    }
> > +
> >      if (lbinding && b_lport && lbinding->iface) {
> >          if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
> >              return false;
> > @@ -660,13 +666,23 @@ local_binding_is_up(struct shash *local_bindings,
> const char *pb_name)
> >  }
> >
> >  bool
> > -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> > +                      const struct sbrec_chassis *chassis_rec)
> >  {
> >      struct local_binding *lbinding =
> >          local_binding_find(local_bindings, pb_name);
> >
> >      struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> >
> > +    if (b_lport) {
> > +        if (b_lport->pb->chassis == chassis_rec) {
> > +            return false;
> > +        } else if (b_lport->pb->chassis) {
> > +            VLOG_DBG("lport %s already claimed by other chassis",
> > +                     b_lport->pb->logical_port);
> > +        }
> > +    }
> > +
> >      if (!lbinding) {
> >          return true;
> >      }
> > @@ -884,37 +900,80 @@ get_lport_type_str(enum en_lport_type lport_type)
> >      OVS_NOT_REACHED();
> >  }
> >
> > -/* For newly claimed ports, if 'notify_up' is 'false':
> > +void
> > +set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
> > +                        const struct sbrec_chassis *chassis_rec,
> > +                        bool is_set)
> > +{
> > +    if (pb->chassis != chassis_rec) {
> > +         if (is_set) {
> > +            if (pb->chassis) {
> > +                VLOG_INFO("Changing chassis for lport %s from %s to
> %s.",
> > +                          pb->logical_port, pb->chassis->name,
> > +                          chassis_rec->name);
> > +            } else {
> > +                VLOG_INFO("Claiming lport %s for this chassis.",
> > +                          pb->logical_port);
> > +            }
> > +            for (int i = 0; i < pb->n_mac; i++) {
> > +                VLOG_INFO("%s: Claiming %s", pb->logical_port,
> pb->mac[i]);
> > +            }
> > +            sbrec_port_binding_set_chassis(pb, chassis_rec);
> > +        }
> > +    } else if (!is_set) {
> > +        sbrec_port_binding_set_chassis(pb, NULL);
> > +    }
> > +}
> > +
> > +void
> > +local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
> > +                     const struct sbrec_chassis *chassis_rec,
> > +                     struct hmap *tracked_datapaths, bool is_set)
> > +{
> > +    struct local_binding *lbinding =
> > +        local_binding_find(local_bindings, pb_name);
> > +    struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> > +
> > +    if (b_lport) {
> > +        set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec, is_set);
> > +        if (tracked_datapaths) {
> > +            update_lport_tracking(b_lport->pb, tracked_datapaths, true);
> > +        }
> > +    }
> > +}
> > +
> > +/* For newly claimed ports:
> >   * - set the 'pb.up' field to true if 'pb' has no 'parent_pb'.
> >   * - set the 'pb.up' field to true if 'parent_pb.up' is 'true' (e.g.,
> for
> >   *   container and virtual ports).
> > - * Otherwise request a notification to be sent when the OVS flows
> > - * corresponding to 'pb' have been installed.
> > + *
> > + * Returns false if lport is not claimed due to 'sb_readonly'.
> > + * Returns true otherwise.
> >   *
> >   * Note:
> > - *   Updates (directly or through a notification) the 'pb->up' field
> only if
> > - *   it's explicitly set to 'false'.
> > + *   Updates the 'pb->up' field only if it's explicitly set to 'false'.
> >   *   This is to ensure compatibility with older versions of ovn-northd.
> >   */
> > -static void
> > +static bool
> >  claimed_lport_set_up(const struct sbrec_port_binding *pb,
> >                       const struct sbrec_port_binding *parent_pb,
> > -                     const struct sbrec_chassis *chassis_rec,
> > -                     bool notify_up, struct if_status_mgr *if_mgr)
> > +                     bool sb_readonly)
> >  {
> > -    if (!notify_up) {
> > -        bool up = true;
> > -        if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
> > +    /* When notify_up is false in claim_port(), no state is created
> > +     * by if_status_mgr. In such cases, return false (i.e. trigger
> recompute)
> > +     * if we can't update sb (because it is readonly).
> > +     */
> > +    bool up = true;
> > +    if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
> > +        if (!sb_readonly) {
> >              if (pb->n_up) {
> >                  sbrec_port_binding_set_up(pb, &up, 1);
> >              }
> > +        } else if (pb->n_up && !pb->up[0]) {
> > +            return false;
> >          }
> > -        return;
> > -    }
> > -
> > -    if (pb->chassis != chassis_rec || (pb->n_up && !pb->up[0])) {
> > -        if_status_mgr_claim_iface(if_mgr, pb->logical_port);
> >      }
> > +    return true;
> >  }
> >
> >  typedef void (*set_func)(const struct sbrec_port_binding *pb,
> > @@ -1057,37 +1116,35 @@ claim_lport(const struct sbrec_port_binding *pb,
> >              struct hmap *tracked_datapaths,
> >              struct if_status_mgr *if_mgr)
> >  {
> > -    if (!sb_readonly) {
> > -        claimed_lport_set_up(pb, parent_pb, chassis_rec, notify_up,
> if_mgr);
> > -    }
> > -
> >      enum can_bind can_bind =
> lport_can_bind_on_this_chassis(chassis_rec, pb);
> >      bool update_tracked = false;
> >
> >      if (can_bind == CAN_BIND_AS_MAIN) {
> >          if (pb->chassis != chassis_rec) {
> > -            if (sb_readonly) {
> > -                return false;
> > -            }
> > -
> > -            if (pb->chassis) {
> > -                VLOG_INFO("Changing chassis for lport %s from %s to
> %s.",
> > -                        pb->logical_port, pb->chassis->name,
> > -                        chassis_rec->name);
> > -            } else {
> > -                VLOG_INFO("Claiming lport %s for this chassis.",
> > -                          pb->logical_port);
> > -            }
> > -            for (size_t i = 0; i < pb->n_mac; i++) {
> > -                VLOG_INFO("%s: Claiming %s", pb->logical_port,
> pb->mac[i]);
> > -            }
> > -
> > -            sbrec_port_binding_set_chassis(pb, chassis_rec);
> >              if (is_additional_chassis(pb, chassis_rec)) {
> > +                if (sb_readonly) {
> > +                    return false;
> > +                }
> >                  remove_additional_chassis(pb, chassis_rec);
> >              }
> >              update_tracked = true;
> >          }
> > +        if (!notify_up) {
> > +            if (!claimed_lport_set_up(pb, parent_pb, sb_readonly)) {
> > +                return false;
> > +            }
> > +            if (pb->chassis != chassis_rec) {
> > +                if (sb_readonly) {
> > +                    return false;
> > +                }
> > +                set_pb_chassis_in_sbrec(pb, chassis_rec, true);
> > +            }
> > +        } else {
> > +            if ((pb->chassis != chassis_rec) || (pb->n_up &&
> !pb->up[0])) {
> > +                if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
> > +                                          sb_readonly);
> > +            }
> > +        }
> >      } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
> >          if (!is_additional_chassis(pb, chassis_rec)) {
> >              if (sb_readonly) {
> > @@ -1132,7 +1189,8 @@ claim_lport(const struct sbrec_port_binding *pb,
> >   */
> >  static bool
> >  release_lport_main_chassis(const struct sbrec_port_binding *pb,
> > -                           bool sb_readonly)
> > +                           bool sb_readonly,
> > +                           struct if_status_mgr *if_mgr)
> >  {
> >      if (pb->encap) {
> >          if (sb_readonly) {
> > @@ -1141,11 +1199,13 @@ release_lport_main_chassis(const struct
> sbrec_port_binding *pb,
> >          sbrec_port_binding_set_encap(pb, NULL);
> >      }
> >
> > +    /* If sb readonly, pb->chassis unset through if-status if present.
> */
> >      if (pb->chassis) {
> > -        if (sb_readonly) {
> > +        if (!sb_readonly) {
> > +            sbrec_port_binding_set_chassis(pb, NULL);
> > +        } else if (!if_status_mgr_iface_is_present(if_mgr,
> pb->logical_port)) {
>
> I would really love it if we could get rid of the !notify_up special
> case for ports with parents in the future.  That will also allow us to
> get rid of this check too, right?  It can be a future patch though.
>
> ack. Will do in further patch.

> >              return false;
> >          }
> > -        sbrec_port_binding_set_chassis(pb, NULL);
> >      }
> >
> >      if (pb->virtual_parent) {
> > @@ -1155,7 +1215,8 @@ release_lport_main_chassis(const struct
> sbrec_port_binding *pb,
> >          sbrec_port_binding_set_virtual_parent(pb, NULL);
> >      }
> >
> > -    VLOG_INFO("Releasing lport %s from this chassis.",
> pb->logical_port);
> > +    VLOG_INFO("Releasing lport %s from this chassis (sb_readonly=%d)",
> > +              pb->logical_port, sb_readonly);
> >      return true;
> >  }
> >
> > @@ -1189,7 +1250,7 @@ release_lport(const struct sbrec_port_binding *pb,
> >                struct hmap *tracked_datapaths, struct if_status_mgr
> *if_mgr)
> >  {
> >      if (pb->chassis == chassis_rec) {
> > -        if (!release_lport_main_chassis(pb, sb_readonly)) {
> > +        if (!release_lport_main_chassis(pb, sb_readonly, if_mgr)) {
> >              return false;
> >          }
> >      } else if (is_additional_chassis(pb, chassis_rec)) {
> > @@ -1271,7 +1332,7 @@ consider_vif_lport_(const struct
> sbrec_port_binding *pb,
> >                               b_lport->lbinding->iface,
> >                               !b_ctx_in->ovnsb_idl_txn,
> >                               !parent_pb, b_ctx_out->tracked_dp_bindings,
> > -                             b_ctx_out->if_mgr)){
> > +                             b_ctx_out->if_mgr)) {
> >                  return false;
> >              }
> >
> > @@ -1527,7 +1588,8 @@ consider_localport(const struct sbrec_port_binding
> *pb,
> >      enum can_bind can_bind = lport_can_bind_on_this_chassis(
> >          b_ctx_in->chassis_rec, pb);
> >      if (can_bind == CAN_BIND_AS_MAIN) {
> > -        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn)) {
> > +        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn,
> > +            b_ctx_out->if_mgr)) {
> >              return false;
> >          }
> >      } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
> > diff --git a/controller/binding.h b/controller/binding.h
> > index 1fed06674..d20659b0b 100644
> > --- a/controller/binding.h
> > +++ b/controller/binding.h
> > @@ -151,8 +151,10 @@ const struct sbrec_port_binding
> *local_binding_get_primary_pb(
> >  ofp_port_t local_binding_get_lport_ofport(const struct shash
> *local_bindings,
> >                                            const char *pb_name);
> >
> > -bool local_binding_is_up(struct shash *local_bindings, const char
> *pb_name);
> > -bool local_binding_is_down(struct shash *local_bindings, const char
> *pb_name);
> > +bool local_binding_is_up(struct shash *local_bindings, const char
> *pb_name,
> > +                         const struct sbrec_chassis *);
> > +bool local_binding_is_down(struct shash *local_bindings, const char
> *pb_name,
> > +                           const struct sbrec_chassis *);
> >  void local_binding_set_up(struct shash *local_bindings, const char
> *pb_name,
> >                            const struct sbrec_chassis *chassis_rec,
> >                            const char *ts_now_str, bool sb_readonly,
> > @@ -160,7 +162,10 @@ void local_binding_set_up(struct shash
> *local_bindings, const char *pb_name,
> >  void local_binding_set_down(struct shash *local_bindings, const char
> *pb_name,
> >                              const struct sbrec_chassis *chassis_rec,
> >                              bool sb_readonly, bool ovs_readonly);
> > -
> > +void local_binding_set_pb(struct shash *local_bindings, const char
> *pb_name,
> > +                          const struct sbrec_chassis *chassis_rec,
> > +                          struct hmap *tracked_datapaths,
> > +                          bool is_set);
> >  void binding_register_ovs_idl(struct ovsdb_idl *);
> >  void binding_run(struct binding_ctx_in *, struct binding_ctx_out *);
> >  bool binding_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
> > @@ -178,6 +183,10 @@ void binding_dump_local_bindings(struct
> local_binding_data *, struct ds *);
> >  bool is_additional_chassis(const struct sbrec_port_binding *pb,
> >                             const struct sbrec_chassis *chassis_rec);
> >
> > +void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
> > +                             const struct sbrec_chassis *chassis_rec,
> > +                             bool is_set);
> > +
> >  /* Corresponds to each Port_Binding.type. */
> >  enum en_lport_type {
> >      LP_UNKNOWN,
> > diff --git a/controller/if-status.c b/controller/if-status.c
> > index ad61844d8..5b0eef859 100644
> > --- a/controller/if-status.c
> > +++ b/controller/if-status.c
> > @@ -24,6 +24,7 @@
> >  #include "lib/util.h"
> >  #include "timeval.h"
> >  #include "openvswitch/vlog.h"
> > +#include "lib/ovn-sb-idl.h"
> >
> >  VLOG_DEFINE_THIS_MODULE(if_status);
> >
> > @@ -53,9 +54,11 @@ VLOG_DEFINE_THIS_MODULE(if_status);
> >   */
> >
> >  enum if_state {
> > -    OIF_CLAIMED,       /* Newly claimed interface. */
> > -    OIF_INSTALL_FLOWS, /* Already claimed interface for which flows are
> still
> > -                        * being installed.
> > +    OIF_CLAIMED,       /* Newly claimed interface. pb->chassis not yet
> updated.
> > +                        */
> > +    OIF_INSTALL_FLOWS, /* Claimed interface with pb->chassis
> successfully
> > +                        * updated in SB and for which flows are still
> being
> > +                        * installed.
> >                          */
> >      OIF_MARK_UP,       /* Interface with flows successfully installed
> in OVS
> >                          * but not yet marked "up" in the binding module
> (in
> > @@ -78,6 +81,55 @@ static const char *if_state_names[] = {
> >      [OIF_INSTALLED]     = "INSTALLED",
> >  };
> >
> > +/*
> > + *       +----------------------+
> > + * +---> |                      |
> > + * | +-> |         NULL         |
> <--------------------------------------+++-+
> > + * | |   +----------------------+
>       |
> > + * | |     ^ release_iface   | claim_iface
>      |
> > + * | |     |                 V - sbrec_update_chassis(if sb is rw)
>      |
> > + * | |   +----------------------+
>       |
> > + * | |   |                      |
> <----------------------------------------+ |
> > + * | |   |       CLAIMED        |
> <--------------------------------------+ | |
> > + * | |   +----------------------+
>   | | |
> > + * | |                  | mgr_update(when sb is rw)
>   | | |
> > + * | | release_iface    |  - sbrec_update_chassis
>   | | |
> > + * | |                  |  - request seqno
>  | | |
> > + * | |                  V
>   | | |
> > + * | |   +----------------------+
>   | | |
> > + * | +-- |                      |  mgr_run(seqno not rcvd)
>  | | |
> > + * |     |    INSTALL_FLOWS     |   - set port down in sb
>   | | |
> > + * |     |                      |  mgr_update()
>   | | |
> > + * |     +----------------------+   - sbrec_update_chassis if needed
>  | | |
> > + * |                    |
>   | | |
> > + * |                    |  mgr_run(seqno rcvd)
>  | | |
> > + * |                    |  - set port up in sb
>  | | |
> > + * | release_iface      |  - set ovn-installed in ovs
>   | | |
> > + * |                    V
>   | | |
> > + * |   +----------------------+
>   | | |
> > + * |   |                      |  mgr_run()
>  | | |
> > + * +-- |       MARK_UP        |  - set port up in sb
>  | | |
> > + *     |                      |  - set ovn-installed in ovs
>   | | |
> > + *     |                      |  mgr_update()
>   | | |
> > + *     +----------------------+  - sbrec_update_chassis if needed
>   | | |
> > + *              |
>   | | |
> > + *              | mgr_update(rcvd port up / ovn_installed & chassis
> set) | | |
> > + *              V
>   | | |
> > + *     +----------------------+
>   | | |
> > + *     |      INSTALLED       | ------------> claim_iface
> ---------------+ | |
> > + *     +----------------------+
>     | |
> > + *              |
>     | |
> > + *              | release_iface
>     | |
> > + *              V
>     | |
> > + *     +----------------------+
>     | |
> > + *     |                      | ------------> claim_iface
> -----------------+ |
> > + *     |      MARK_DOWN       | ------> mgr_update(rcvd port down)
> ----------+
> > + *     |                      | mgr_run()
> > + *     |                      | - set port down in sb
> > + *     |                      | mgr_update()
> > + *     +----------------------+ - sbrec_update_chassis(NULL)
> > + */
> > +
>
> Thanks for adding this, it's really useful!
>
> >  struct ovs_iface {
> >      char *id;               /* Extracted from OVS
> external_ids.iface_id. */
> >      enum if_state state;    /* State of the interface in the state
> machine. */
> > @@ -85,6 +137,7 @@ struct ovs_iface {
> >                               * be fully programmed in OVS.  Only used
> in state
> >                               * OIF_INSTALL_FLOWS.
> >                               */
> > +    bool chassis_update_required;  /* If true, pb->chassis must be
> updated. */
> >  };
> >
> >  static uint64_t ifaces_usage;
> > @@ -158,14 +211,23 @@ if_status_mgr_destroy(struct if_status_mgr *mgr)
> >  }
> >
> >  void
> > -if_status_mgr_claim_iface(struct if_status_mgr *mgr, const char
> *iface_id)
> > +if_status_mgr_claim_iface(struct if_status_mgr *mgr,
> > +                          const struct sbrec_port_binding *pb,
> > +                          const struct sbrec_chassis *chassis_rec,
> > +                          bool sb_readonly)
> >  {
> > +    const char *iface_id = pb->logical_port;
> >      struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
> >
> >      if (!iface) {
> >          iface = ovs_iface_create(mgr, iface_id, OIF_CLAIMED);
> >      }
> > -
> > +    if (!sb_readonly) {
> > +        set_pb_chassis_in_sbrec(pb, chassis_rec, true);
> > +        iface->chassis_update_required = false;
> > +    } else {
> > +        iface->chassis_update_required = true;
> > +    }
> >      switch (iface->state) {
> >      case OIF_CLAIMED:
> >      case OIF_INSTALL_FLOWS:
> > @@ -182,6 +244,13 @@ if_status_mgr_claim_iface(struct if_status_mgr
> *mgr, const char *iface_id)
> >      }
> >  }
> >
> > +bool
> > +if_status_mgr_iface_is_present(struct if_status_mgr *mgr, const char
> *iface_id)
> > +{
> > +    struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
> > +    return (!!iface);
>
> Nit: I'd rewrite this as:
>
> return !!shash_find_data(&mgr->ifaces, iface_id);
>
>
> > +}
> > +
> >  void
> >  if_status_mgr_release_iface(struct if_status_mgr *mgr, const char
> *iface_id)
> >  {
> > @@ -246,9 +315,43 @@ if_status_mgr_delete_iface(struct if_status_mgr
> *mgr, const char *iface_id)
> >      }
> >  }
> >
> > +bool
> > +if_status_handle_claims(struct if_status_mgr *mgr,
> > +                        struct local_binding_data *binding_data,
> > +                        const struct sbrec_chassis *chassis_rec,
> > +                        struct hmap *tracked_datapath,
> > +                        bool sb_readonly)
> > +{
> > +    if (!binding_data) {
> > +        return false;
> > +    }
> > +
> > +    struct shash *bindings = &binding_data->bindings;
> > +    struct hmapx_node *node;
> > +
> > +    bool rc = false;
> > +    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
>
> We don't need the _SAFE here as far as I can tell, right?
>
Correct.

>
> > +        struct ovs_iface *iface = node->data;
> > +        if (sb_readonly) {
> > +            return false;
> > +        }
> > +        if (iface->chassis_update_required) {
> > +            VLOG_INFO("if_status_handle_claims for %s", iface->id);
> > +            local_binding_set_pb(bindings, iface->id, chassis_rec,
> > +                                 tracked_datapath, true);
> > +            rc = true;
> > +        }
> > +        iface->chassis_update_required = false;
> > +    }
> > +    return rc;
> > +}
> > +
> >  void
> >  if_status_mgr_update(struct if_status_mgr *mgr,
> > -                     struct local_binding_data *binding_data)
> > +                     struct local_binding_data *binding_data,
> > +                     const struct sbrec_chassis *chassis_rec,
> > +                     struct hmap *tracked_datapath,
> > +                     bool sb_readonly)
> >  {
> >      if (!binding_data) {
> >          return;
> > @@ -257,13 +360,25 @@ if_status_mgr_update(struct if_status_mgr *mgr,
> >      struct shash *bindings = &binding_data->bindings;
> >      struct hmapx_node *node;
> >
> > +    /* Interfaces in OIF_MARK_UP state have already set their
> pb->chassis.
> > +     * However, it might have been reset by another hv.
> > +     */
> >      /* Move all interfaces that have been confirmed "up" by the binding
> module,
> >       * from OIF_MARK_UP to OIF_INSTALLED.
> >       */
> >      HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_UP]) {
> >          struct ovs_iface *iface = node->data;
> >
> > -        if (local_binding_is_up(bindings, iface->id)) {
> > +        if (iface->chassis_update_required) {
> > +            if (!sb_readonly) {
> > +                iface->chassis_update_required = false;
> > +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> > +                                     tracked_datapath, true);
> > +            } else {
> > +                continue;
> > +            }
> > +        }
> > +        if (local_binding_is_up(bindings, iface->id, chassis_rec)) {
> >              ovs_iface_set_state(mgr, iface, OIF_INSTALLED);
> >          }
> >      }
> > @@ -274,23 +389,53 @@ if_status_mgr_update(struct if_status_mgr *mgr,
> >      HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_DOWN]) {
> >          struct ovs_iface *iface = node->data;
> >
> > -        if (local_binding_is_down(bindings, iface->id)) {
> > +        if (!sb_readonly) {
> > +            local_binding_set_pb(bindings, iface->id, chassis_rec,
> > +                                 tracked_datapath, false);
> > +        }
> > +        if (local_binding_is_down(bindings, iface->id, chassis_rec)) {
> >              ovs_iface_destroy(mgr, iface);
> >          }
> >      }
> >
> > -    /* Register for a notification about flows being installed in OVS
> for all
> > -     * newly claimed interfaces.
> > +    if (!sb_readonly) {
> > +        HMAPX_FOR_EACH_SAFE (node,
> &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
> > +            struct ovs_iface *iface = node->data;
> > +
> > +            if (iface->chassis_update_required) {
> > +                iface->chassis_update_required = false;
> > +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> > +                                     tracked_datapath, true);
> > +            }
> > +        }
> > +    }
> > +
> > +    /* Update Port_Binding->chassis for newly claimed interfaces
> > +     * Register for a notification about flows being installed in OVS
> for all
> > +     * newly claimed interfaces for which we could update pb->chassis.
> >       *
> >       * Move them from OIF_CLAIMED to OIF_INSTALL_FLOWS.
> >       */
> > -    bool new_ifaces = false;
> > -    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
> > -        struct ovs_iface *iface = node->data;
> >
> > -        ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> > -        iface->install_seqno = mgr->iface_seqno + 1;
> > -        new_ifaces = true;
> > +    bool new_ifaces = false;
> > +    if (!sb_readonly) {
> > +        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED])
> {
> > +            struct ovs_iface *iface = node->data;
> > +            if (iface->chassis_update_required) {
> > +                local_binding_set_pb(bindings, iface->id, chassis_rec,
> > +                                     tracked_datapath, true);
> > +            }
> > +            ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> > +            iface->install_seqno = mgr->iface_seqno + 1;
> > +            iface->chassis_update_required = false;
> > +            new_ifaces = true;
> > +        }
> > +    } else {
> > +        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED])
> {
> > +            struct ovs_iface *iface = node->data;
> > +            VLOG_INFO("Not updating pb chassis for %s now as sb is
> readonly",
> > +                     iface->id);
>
> Should we rate limit this?
> Nit: indentation.
>
Good idea in case sb is really slow.

>
> > +        }
> >      }
> >
> >      /* Request a seqno update when the flows for new interfaces have
> been
> > @@ -403,7 +548,7 @@ if_status_mgr_update_bindings(struct if_status_mgr
> *mgr,
> >      struct hmapx_node *node;
> >
> >      /* Notify the binding module to set "down" all bindings that are
> still
> > -     * in the process of being installed in OVS, i.e., are not yet
> instsalled.
> > +     * in the process of being installed in OVS, i.e., are not yet
> installed.
> >       */
> >      HMAPX_FOR_EACH (node, &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
> >          struct ovs_iface *iface = node->data;
> > diff --git a/controller/if-status.h b/controller/if-status.h
> > index bb8a3950d..9b1200ff0 100644
> > --- a/controller/if-status.h
> > +++ b/controller/if-status.h
> > @@ -27,15 +27,28 @@ struct if_status_mgr *if_status_mgr_create(void);
> >  void if_status_mgr_clear(struct if_status_mgr *);
> >  void if_status_mgr_destroy(struct if_status_mgr *);
> >
> > -void if_status_mgr_claim_iface(struct if_status_mgr *, const char
> *iface_id);
> > +void if_status_mgr_claim_iface(struct if_status_mgr *,
> > +                               const struct sbrec_port_binding *pb,
> > +                               const struct sbrec_chassis *chassis_rec,
> > +                               bool sb_readonly);
> >  void if_status_mgr_release_iface(struct if_status_mgr *, const char
> *iface_id);
> >  void if_status_mgr_delete_iface(struct if_status_mgr *, const char
> *iface_id);
> >
> > -void if_status_mgr_update(struct if_status_mgr *, struct
> local_binding_data *);
> > +void if_status_mgr_update(struct if_status_mgr *, struct
> local_binding_data *,
> > +                          const struct sbrec_chassis *chassis,
> > +                          struct hmap *tracked_dp_bindings,
> > +                          bool sb_readonly);
> >  void if_status_mgr_run(struct if_status_mgr *mgr, struct
> local_binding_data *,
> >                         const struct sbrec_chassis *,
> >                         bool sb_readonly, bool ovs_readonly);
> >  void if_status_mgr_get_memory_usage(struct if_status_mgr *mgr,
> >                                      struct simap *usage);
> > +bool if_status_mgr_iface_is_present(struct if_status_mgr *mgr,
> > +                                    const char *iface_id);
> > +bool if_status_handle_claims(struct if_status_mgr *mgr,
> > +                             struct local_binding_data *binding_data,
> > +                             const struct sbrec_chassis *chassis_rec,
> > +                             struct hmap *tracked_datapath,
> > +                             bool sb_readonly);
> >
> >  # endif /* controller/if-status.h */
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 2793c8687..1c3cac2c1 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -1417,6 +1417,111 @@ en_runtime_data_run(struct engine_node *node,
> void *data)
> >      engine_set_node_state(node, EN_UPDATED);
> >  }
> >
> > +struct ed_type_sb_ro {
> > +    bool sb_readonly;
> > +};
> > +
> > +static void *
> > +en_sb_ro_init(struct engine_node *node OVS_UNUSED,
> > +              struct engine_arg *arg OVS_UNUSED)
> > +{
> > +    struct ed_type_sb_ro *data = xzalloc(sizeof *data);
> > +    return data;
> > +}
> > +
> > +static void
> > +en_sb_ro_run(struct engine_node *node, void *data)
> > +{
> > +    struct ed_type_sb_ro *sb_ro_data = data;
> > +    bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
> > +    if (sb_ro_data->sb_readonly != sb_readonly) {
> > +        sb_ro_data->sb_readonly = sb_readonly;
> > +        if (!sb_ro_data->sb_readonly) {
> > +            engine_set_node_state(node, EN_UPDATED);
> > +        }
> > +    }
> > +}
> > +
> > +static void
> > +en_sb_ro_cleanup(void *data OVS_UNUSED)
> > +{
> > +}
> > +
> > +static bool
> > +pb_claims_sb_ro_handler(struct engine_node *node, void *data OVS_UNUSED)
> > +{
> > +   engine_set_node_state(node, EN_UPDATED);
> > +   return true;
> > +}
> > +
> > +static void *
> > +en_pb_claims_init(struct engine_node *node OVS_UNUSED,
> > +                  struct engine_arg *arg OVS_UNUSED)
> > +{
> > +    return NULL;
> > +}
> > +
> > +static void
> > +en_pb_claims_run(struct engine_node *node OVS_UNUSED, void *data_
> OVS_UNUSED)
> > +{
> > +}
> > +
> > +static void
> > +en_pb_claims_cleanup(void *data OVS_UNUSED)
> > +{
> > +}
>
> Not really a problem of this patch but maybe we can find a way to avoid
> having to explcitly define cleanup/run callbacks that don't do much.
> Maybe some sensible defaults for engine node callbacks would make sense.
>  Han, what do you think?
>
> > +
> > +static bool
> > +pb_claims_runtime_data_handler(struct engine_node *node OVS_UNUSED,
> > +                               void *data OVS_UNUSED)
> > +{
> > +    return true;
> > +}
> > +
> > +static bool
> > +lflow_output_runtime_data_handler(struct engine_node *node,
> > +                                  void *data);
> > +static bool
> > +lflow_output_pb_claims_handler(struct engine_node *node, void *data
> OVS_UNUSED)
> > +{
> > +    const struct sbrec_chassis *chassis = NULL;
> > +
> > +    struct ovsrec_open_vswitch_table *ovs_table =
> > +        (struct ovsrec_open_vswitch_table *)EN_OVSDB_GET(
> > +            engine_get_input("OVS_open_vswitch", node));
> > +
> > +    const char *chassis_id = get_ovs_chassis_id(ovs_table);
> > +
> > +    struct ovsdb_idl_index *sbrec_chassis_by_name =
> > +        engine_ovsdb_node_get_index(
> > +                engine_get_input("SB_chassis", node),
> > +                "name");
> > +
> > +    if (chassis_id) {
> > +        chassis = chassis_lookup_by_name(sbrec_chassis_by_name,
> chassis_id);
> > +    }
> > +    if (chassis) {
> > +        struct ed_type_runtime_data *runtime_data =
> > +            engine_get_input_data("runtime_data", node);
> > +        bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
> > +        struct controller_engine_ctx *ctrl_ctx =
> > +            engine_get_context()->client_ctx;
> > +
> > +        if (if_status_handle_claims(ctrl_ctx->if_mgr,
> > +                                    &runtime_data->lbinding_data,
> > +                                    chassis,
> > +                                    &runtime_data->tracked_dp_bindings,
> > +                                    sb_readonly)) {
> > +            struct engine_node *rt_node =
> > +                engine_get_input("runtime_data", node);
> > +            if (!engine_node_changed(rt_node)) {
> > +                lflow_output_runtime_data_handler(node, data);
>
> Instead of explicitly calling another input's handler here I think I
> would factor out the logic of the handler into a separate helper
> function which we could call from lflow_output_runtime_data_handler()
> and here.  Based on its return value we could set the node state, if
> needed.
>
> > +            }
> > +        }
> > +    }
> > +    return true;
> > +}
> > +
> >  static bool
> >  runtime_data_ovs_interface_shadow_handler(struct engine_node *node,
> void *data)
> >  {
> > @@ -3438,6 +3543,8 @@ main(int argc, char *argv[])
> >      stopwatch_create(VIF_PLUG_RUN_STOPWATCH_NAME, SW_MS);
> >
> >      /* Define inc-proc-engine nodes. */
> > +    ENGINE_NODE(sb_ro, "sb_ro");
> > +    ENGINE_NODE(pb_claims, "pb_claims");
> >      ENGINE_NODE_WITH_CLEAR_TRACK_DATA_IS_VALID(ct_zones, "ct_zones");
> >      ENGINE_NODE_WITH_CLEAR_TRACK_DATA(ovs_interface_shadow,
> >                                        "ovs_interface_shadow");
> > @@ -3477,6 +3584,13 @@ main(int argc, char *argv[])
> >      engine_add_input(&en_non_vif_data, &en_ovs_interface,
> >                       non_vif_data_ovs_iface_handler);
> >
> > +    /* This ensures we run after runtime data. */
> > +    engine_add_input(&en_pb_claims, &en_runtime_data,
> > +                     pb_claims_runtime_data_handler);
>
> This is neat but I think we need to at least add a comment to
> lflow_output_pb_claims_handler() to explain why it's ok to access the
> runtime_data input there.
>
> Also, we can use engine_noop_handler() instead of defining an explicit one.
>
> > +
> > +    /* This ensures we always run when sb becomes writable. */
> > +    engine_add_input(&en_pb_claims, &en_sb_ro, pb_claims_sb_ro_handler);
> > +
> >      /* Note: The order of inputs is important, all OVS interface
> changes must
> >       * be handled before any ct_zone changes.
> >       */
> > @@ -3518,6 +3632,8 @@ main(int argc, char *argv[])
> >                       lflow_output_port_groups_handler);
> >      engine_add_input(&en_lflow_output, &en_runtime_data,
> >                       lflow_output_runtime_data_handler);
> > +    engine_add_input(&en_lflow_output, &en_pb_claims,
> > +                     lflow_output_pb_claims_handler);
> >      engine_add_input(&en_lflow_output, &en_non_vif_data,
> >                       NULL);
> >
> > @@ -3999,7 +4115,10 @@ main(int argc, char *argv[])
> >                          runtime_data ? &runtime_data->lbinding_data :
> NULL;
> >                      stopwatch_start(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
> >                                      time_msec());
> > -                    if_status_mgr_update(if_mgr, binding_data);
> > +                    if_status_mgr_update(if_mgr, binding_data, chassis,
> > +                                         (runtime_data
> > +                                          ?
> &runtime_data->tracked_dp_bindings
> > +                                          : NULL), !ovnsb_idl_txn);
>
> I'm not completely sure that this is correct.  The engine has already
> run so any nodes that depend on runtime_data have already been processed.
>
> But now we might be adding more "tracked changes" to runtime_data, e.g.:
>
> if_status_mgr_update()
>   -> local_binding_set_pb()
>     -> update_lport_tracking()
>         -> tracked_datapath_lport_add(pb, TRACKED_RESOURCE_NEW, ..)
>
> After this point these tracked changes will not be processed anymore and
> will be cleared before the next run starts in the next iteration in
> en_runtime_data_clear_tracked_data().
>
> The question is: are we ever adding more "tracked changes" to
> runtime_data->tracked_dp_bindings here?  If the answer is "yes" then
> this approach might be problematic.  If the answer is "no" then we might
> as well avoid passing the runtime_data->tracked_dp_bindings all the way
> down and just call local_binding_set_pb() with tracked_datapaths set to
> NULL where needed.
>
> I think we were adding tracked changes, and so the approach was
problematic.
I reorg the nodes, and now have only one additional node as input of
runtime data.
Only if_status_handle_claims or if_status_mgr_claim_iface update tracked
data (
as part of I+P).

> >                      stopwatch_stop(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
> >                                     time_msec());
> >
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 59d51f3e0..e9c061939 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -14990,7 +14990,10 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int
> table=65 | grep actions=output:1],
> >  echo "verifying that lsp0 binding moves when requested-chassis is
> changed"
> >
> >  ovn-nbctl lsp-set-options lsp0 requested-chassis=hv2
> > -OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this
> chassis" hv1/ovn-controller.log)])
> > +
> > +# We might see multiple "Releasing lport ...", when sb is read only
> > +OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this
> chassis" hv1/ovn-controller.log)])
> > +
> >  wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
> >
> >  # (6) Chassis hv2 should add flows and hv1 should not.
> > @@ -15036,7 +15039,7 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int
> table=0 | grep in_port=1], [0], [ig
> >  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep
> actions=output:1], [0], [ignore])
> >
> >  check ovn-nbctl --wait=hv lsp-set-options lsp0
> requested-chassis=non-existant-chassis
> > -OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this
> chassis" hv1/ovn-controller.log)])
> > +OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this
> chassis" hv1/ovn-controller.log)])
> >  check ovn-nbctl --wait=hv sync
> >  wait_column '' Port_Binding chasssi logical_port=lsp0
> >  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=0 | grep in_port=1],
> [1], [])
> > @@ -32041,3 +32044,109 @@ AT_CHECK([test $(ovn-sbctl list fdb | grep -c
> "00:00:00:00:10:30") = 0])
> >  OVN_CLEANUP([hv1])
> >  AT_CLEANUP
> >  ])
> > +
> > +# RUN_OVN_NBCTL() executes list of commands built by the OVN_NBCTL()
> macro.
> > +m4_define([OVN_NBCTL], [
> > +    command="${command} -- $1"
> > +])
> > +
> > +m4_define([RUN_OVN_NBCTL], [
> > +    check ovn-nbctl ${command}
> > +    unset command
> > +])
>
> Why can't we factor this out to ovn-macros.at (and remove the version
> from perf-northd.at too)?
>
> Done

> > +
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([recomputes])
>
> As this is a test that is also performance related I think we could add
> AT_KEYWORDS([performance]).  We could also add that to the tests in
> perf-northd.at.  But we can do all this as a follow up, I guess, to
> allow us to run all performance related tests separately if needed.
>
> Will do in further patch.

> > +ovn_start
> > +
> > +n_hv=4
> > +
> > +# Add chassis
> > +net_add n1
> > +for i in $(seq 1 $n_hv); do
> > +    sim_add hv$i
> > +    as hv$i
> > +    check ovs-vsctl add-br br-phys
> > +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +    ovn_attach n1 br-phys 192.168.0.$i 24 geneve
> > +done
> > +
> > +add_switch_ports() {
> > +    start_port=$1
> > +    end_port=$2
> > +    nb_hv=$3
> > +    bulk_size=$4
> > +    for ((i=start_port; i<end_port; )) do
> > +        start_bulk=$i
> > +        for hv in $(seq 1 $nb_hv); do
> > +            end_bulk=$((start_bulk+bulk_size-1))
> > +            for port in $(seq $start_bulk $end_bulk); do
> > +                logical_switch_port=lsp${port}
> > +                OVN_NBCTL(lsp-add ls1 $logical_switch_port)
> > +                OVN_NBCTL(lsp-set-addresses $logical_switch_port
> dynamic)
> > +            done
> > +            start_bulk=$((end_bulk+1))
> > +        done
> > +        RUN_OVN_NBCTL()
> > +
> > +        start_bulk=$i
> > +        for hv in $(seq 1 $nb_hv); do
> > +            end_bulk=$((start_bulk+bulk_size-1))
> > +            for port in $(seq $start_bulk $end_bulk); do
> > +                logical_switch_port=lsp${port}
> > +                as hv$hv ovs-vsctl \
> > +                    --no-wait -- add-port br-int vif${port} \
> > +                    -- set Interface vif${port}
> external_ids:iface-id=$logical_switch_port
> > +            done
> > +            start_bulk=$((end_bulk+1))
> > +        done
> > +        i=$((end_bulk+1))
> > +    done
> > +
> > +    # Claiming still WIP here but check for recomputes so test can fail
> faster
> > +    AT_CHECK([test $(ovn-appctl -t ovn-controller coverage/read-counter
> lflow_run) == $lflow_run])
> > +}
> > +check ovn-nbctl ls-add ls1
> > +check ovn-nbctl set Logical_Switch ls1 other_config:subnet=10.1.0.0/16
> > +check ovn-nbctl set Logical_Switch ls1
> other_config:exclude_ips=10.1.255.254
> > +
> > +check ovn-nbctl lr-add lr1
> > +check ovn-nbctl lsp-add ls1 lsp0 -- set Logical_Switch_Port lsp0
> type=router options:router-port=lrp0 addresses=dynamic
> > +check ovn-nbctl lrp-add lr1 lrp0 "f0:00:00:01:00:01" 10.1.255.254/16
> > +check ovn-nbctl lr-nat-add lr1 snat 10.2.0.1 10.1.0.0/16
> > +
> > +check ovn-nbctl --wait=hv sync
> > +lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter
> lflow_run)
> > +
> > +add_switch_ports 1 1000 $n_hv 5
> > +
> > +wait_for_ports_up
> > +check ovn-nbctl --wait=hv sync
> > +
> > +for i in $(seq 1 $n_hv); do
> > +    pid=$(cat hv${i}/ovn-controller.pid)
> > +    u=$((u+$(cat "/proc/$pid/stat" | awk '{print $14}')))
> > +    s=$((s+$(cat "/proc/$pid/stat" | awk '{print $15}')))
> > +done
> > +
> > +n_pid=$(cat northd/ovn-northd.pid)
> > +n_u=$(cat "/proc/$pid/stat" | awk '{print $14}')
> > +n_s=$(cat "/proc/$pid/stat" | awk '{print $15}')
> > +
> > +echo "Total Northd User Time: $n_u"
> > +echo "Total Northd System Time: $n_s"
> > +echo "Total Controller User Time: $u"
> > +echo "Total Controller System Time: $s"
> > +
> > +lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter
> lflow_run)
> > +n_recomputes=`expr $lflow_run1 - $lflow_run`
> > +echo "$n_recomputes recomputes"
> > +
> > +AT_CHECK([test $lflow_run1 == $lflow_run])
> > +
> > +for i in $(seq 2 $n_hv); do
> > +    OVN_CLEANUP_SBOX([hv$i])
> > +done
> > +OVN_CLEANUP([hv1])
> > +AT_CLEANUP
> > +])
>
> Thanks,
> Dumitru
>
> Thanks
Xavier
diff mbox series

Patch

diff --git a/controller/binding.c b/controller/binding.c
index 2279570f9..b21577f71 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -644,11 +644,17 @@  local_binding_get_lport_ofport(const struct shash *local_bindings,
 }
 
 bool
-local_binding_is_up(struct shash *local_bindings, const char *pb_name)
+local_binding_is_up(struct shash *local_bindings, const char *pb_name,
+                    const struct sbrec_chassis *chassis_rec)
 {
     struct local_binding *lbinding =
         local_binding_find(local_bindings, pb_name);
     struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
+
+    if (b_lport && b_lport->pb->chassis != chassis_rec) {
+        return false;
+    }
+
     if (lbinding && b_lport && lbinding->iface) {
         if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
             return false;
@@ -660,13 +666,23 @@  local_binding_is_up(struct shash *local_bindings, const char *pb_name)
 }
 
 bool
-local_binding_is_down(struct shash *local_bindings, const char *pb_name)
+local_binding_is_down(struct shash *local_bindings, const char *pb_name,
+                      const struct sbrec_chassis *chassis_rec)
 {
     struct local_binding *lbinding =
         local_binding_find(local_bindings, pb_name);
 
     struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
 
+    if (b_lport) {
+        if (b_lport->pb->chassis == chassis_rec) {
+            return false;
+        } else if (b_lport->pb->chassis) {
+            VLOG_DBG("lport %s already claimed by other chassis",
+                     b_lport->pb->logical_port);
+        }
+    }
+
     if (!lbinding) {
         return true;
     }
@@ -884,37 +900,80 @@  get_lport_type_str(enum en_lport_type lport_type)
     OVS_NOT_REACHED();
 }
 
-/* For newly claimed ports, if 'notify_up' is 'false':
+void
+set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
+                        const struct sbrec_chassis *chassis_rec,
+                        bool is_set)
+{
+    if (pb->chassis != chassis_rec) {
+         if (is_set) {
+            if (pb->chassis) {
+                VLOG_INFO("Changing chassis for lport %s from %s to %s.",
+                          pb->logical_port, pb->chassis->name,
+                          chassis_rec->name);
+            } else {
+                VLOG_INFO("Claiming lport %s for this chassis.",
+                          pb->logical_port);
+            }
+            for (int i = 0; i < pb->n_mac; i++) {
+                VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
+            }
+            sbrec_port_binding_set_chassis(pb, chassis_rec);
+        }
+    } else if (!is_set) {
+        sbrec_port_binding_set_chassis(pb, NULL);
+    }
+}
+
+void
+local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
+                     const struct sbrec_chassis *chassis_rec,
+                     struct hmap *tracked_datapaths, bool is_set)
+{
+    struct local_binding *lbinding =
+        local_binding_find(local_bindings, pb_name);
+    struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
+
+    if (b_lport) {
+        set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec, is_set);
+        if (tracked_datapaths) {
+            update_lport_tracking(b_lport->pb, tracked_datapaths, true);
+        }
+    }
+}
+
+/* For newly claimed ports:
  * - set the 'pb.up' field to true if 'pb' has no 'parent_pb'.
  * - set the 'pb.up' field to true if 'parent_pb.up' is 'true' (e.g., for
  *   container and virtual ports).
- * Otherwise request a notification to be sent when the OVS flows
- * corresponding to 'pb' have been installed.
+ *
+ * Returns false if lport is not claimed due to 'sb_readonly'.
+ * Returns true otherwise.
  *
  * Note:
- *   Updates (directly or through a notification) the 'pb->up' field only if
- *   it's explicitly set to 'false'.
+ *   Updates the 'pb->up' field only if it's explicitly set to 'false'.
  *   This is to ensure compatibility with older versions of ovn-northd.
  */
-static void
+static bool
 claimed_lport_set_up(const struct sbrec_port_binding *pb,
                      const struct sbrec_port_binding *parent_pb,
-                     const struct sbrec_chassis *chassis_rec,
-                     bool notify_up, struct if_status_mgr *if_mgr)
+                     bool sb_readonly)
 {
-    if (!notify_up) {
-        bool up = true;
-        if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
+    /* When notify_up is false in claim_port(), no state is created
+     * by if_status_mgr. In such cases, return false (i.e. trigger recompute)
+     * if we can't update sb (because it is readonly).
+     */
+    bool up = true;
+    if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
+        if (!sb_readonly) {
             if (pb->n_up) {
                 sbrec_port_binding_set_up(pb, &up, 1);
             }
+        } else if (pb->n_up && !pb->up[0]) {
+            return false;
         }
-        return;
-    }
-
-    if (pb->chassis != chassis_rec || (pb->n_up && !pb->up[0])) {
-        if_status_mgr_claim_iface(if_mgr, pb->logical_port);
     }
+    return true;
 }
 
 typedef void (*set_func)(const struct sbrec_port_binding *pb,
@@ -1057,37 +1116,35 @@  claim_lport(const struct sbrec_port_binding *pb,
             struct hmap *tracked_datapaths,
             struct if_status_mgr *if_mgr)
 {
-    if (!sb_readonly) {
-        claimed_lport_set_up(pb, parent_pb, chassis_rec, notify_up, if_mgr);
-    }
-
     enum can_bind can_bind = lport_can_bind_on_this_chassis(chassis_rec, pb);
     bool update_tracked = false;
 
     if (can_bind == CAN_BIND_AS_MAIN) {
         if (pb->chassis != chassis_rec) {
-            if (sb_readonly) {
-                return false;
-            }
-
-            if (pb->chassis) {
-                VLOG_INFO("Changing chassis for lport %s from %s to %s.",
-                        pb->logical_port, pb->chassis->name,
-                        chassis_rec->name);
-            } else {
-                VLOG_INFO("Claiming lport %s for this chassis.",
-                          pb->logical_port);
-            }
-            for (size_t i = 0; i < pb->n_mac; i++) {
-                VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
-            }
-
-            sbrec_port_binding_set_chassis(pb, chassis_rec);
             if (is_additional_chassis(pb, chassis_rec)) {
+                if (sb_readonly) {
+                    return false;
+                }
                 remove_additional_chassis(pb, chassis_rec);
             }
             update_tracked = true;
         }
+        if (!notify_up) {
+            if (!claimed_lport_set_up(pb, parent_pb, sb_readonly)) {
+                return false;
+            }
+            if (pb->chassis != chassis_rec) {
+                if (sb_readonly) {
+                    return false;
+                }
+                set_pb_chassis_in_sbrec(pb, chassis_rec, true);
+            }
+        } else {
+            if ((pb->chassis != chassis_rec) || (pb->n_up && !pb->up[0])) {
+                if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
+                                          sb_readonly);
+            }
+        }
     } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
         if (!is_additional_chassis(pb, chassis_rec)) {
             if (sb_readonly) {
@@ -1132,7 +1189,8 @@  claim_lport(const struct sbrec_port_binding *pb,
  */
 static bool
 release_lport_main_chassis(const struct sbrec_port_binding *pb,
-                           bool sb_readonly)
+                           bool sb_readonly,
+                           struct if_status_mgr *if_mgr)
 {
     if (pb->encap) {
         if (sb_readonly) {
@@ -1141,11 +1199,13 @@  release_lport_main_chassis(const struct sbrec_port_binding *pb,
         sbrec_port_binding_set_encap(pb, NULL);
     }
 
+    /* If sb readonly, pb->chassis unset through if-status if present. */
     if (pb->chassis) {
-        if (sb_readonly) {
+        if (!sb_readonly) {
+            sbrec_port_binding_set_chassis(pb, NULL);
+        } else if (!if_status_mgr_iface_is_present(if_mgr, pb->logical_port)) {
             return false;
         }
-        sbrec_port_binding_set_chassis(pb, NULL);
     }
 
     if (pb->virtual_parent) {
@@ -1155,7 +1215,8 @@  release_lport_main_chassis(const struct sbrec_port_binding *pb,
         sbrec_port_binding_set_virtual_parent(pb, NULL);
     }
 
-    VLOG_INFO("Releasing lport %s from this chassis.", pb->logical_port);
+    VLOG_INFO("Releasing lport %s from this chassis (sb_readonly=%d)",
+              pb->logical_port, sb_readonly);
     return true;
 }
 
@@ -1189,7 +1250,7 @@  release_lport(const struct sbrec_port_binding *pb,
               struct hmap *tracked_datapaths, struct if_status_mgr *if_mgr)
 {
     if (pb->chassis == chassis_rec) {
-        if (!release_lport_main_chassis(pb, sb_readonly)) {
+        if (!release_lport_main_chassis(pb, sb_readonly, if_mgr)) {
             return false;
         }
     } else if (is_additional_chassis(pb, chassis_rec)) {
@@ -1271,7 +1332,7 @@  consider_vif_lport_(const struct sbrec_port_binding *pb,
                              b_lport->lbinding->iface,
                              !b_ctx_in->ovnsb_idl_txn,
                              !parent_pb, b_ctx_out->tracked_dp_bindings,
-                             b_ctx_out->if_mgr)){
+                             b_ctx_out->if_mgr)) {
                 return false;
             }
 
@@ -1527,7 +1588,8 @@  consider_localport(const struct sbrec_port_binding *pb,
     enum can_bind can_bind = lport_can_bind_on_this_chassis(
         b_ctx_in->chassis_rec, pb);
     if (can_bind == CAN_BIND_AS_MAIN) {
-        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn)) {
+        if (!release_lport_main_chassis(pb, !b_ctx_in->ovnsb_idl_txn,
+            b_ctx_out->if_mgr)) {
             return false;
         }
     } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
diff --git a/controller/binding.h b/controller/binding.h
index 1fed06674..d20659b0b 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -151,8 +151,10 @@  const struct sbrec_port_binding *local_binding_get_primary_pb(
 ofp_port_t local_binding_get_lport_ofport(const struct shash *local_bindings,
                                           const char *pb_name);
 
-bool local_binding_is_up(struct shash *local_bindings, const char *pb_name);
-bool local_binding_is_down(struct shash *local_bindings, const char *pb_name);
+bool local_binding_is_up(struct shash *local_bindings, const char *pb_name,
+                         const struct sbrec_chassis *);
+bool local_binding_is_down(struct shash *local_bindings, const char *pb_name,
+                           const struct sbrec_chassis *);
 void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
                           const struct sbrec_chassis *chassis_rec,
                           const char *ts_now_str, bool sb_readonly,
@@ -160,7 +162,10 @@  void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
 void local_binding_set_down(struct shash *local_bindings, const char *pb_name,
                             const struct sbrec_chassis *chassis_rec,
                             bool sb_readonly, bool ovs_readonly);
-
+void local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
+                          const struct sbrec_chassis *chassis_rec,
+                          struct hmap *tracked_datapaths,
+                          bool is_set);
 void binding_register_ovs_idl(struct ovsdb_idl *);
 void binding_run(struct binding_ctx_in *, struct binding_ctx_out *);
 bool binding_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
@@ -178,6 +183,10 @@  void binding_dump_local_bindings(struct local_binding_data *, struct ds *);
 bool is_additional_chassis(const struct sbrec_port_binding *pb,
                            const struct sbrec_chassis *chassis_rec);
 
+void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
+                             const struct sbrec_chassis *chassis_rec,
+                             bool is_set);
+
 /* Corresponds to each Port_Binding.type. */
 enum en_lport_type {
     LP_UNKNOWN,
diff --git a/controller/if-status.c b/controller/if-status.c
index ad61844d8..5b0eef859 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -24,6 +24,7 @@ 
 #include "lib/util.h"
 #include "timeval.h"
 #include "openvswitch/vlog.h"
+#include "lib/ovn-sb-idl.h"
 
 VLOG_DEFINE_THIS_MODULE(if_status);
 
@@ -53,9 +54,11 @@  VLOG_DEFINE_THIS_MODULE(if_status);
  */
 
 enum if_state {
-    OIF_CLAIMED,       /* Newly claimed interface. */
-    OIF_INSTALL_FLOWS, /* Already claimed interface for which flows are still
-                        * being installed.
+    OIF_CLAIMED,       /* Newly claimed interface. pb->chassis not yet updated.
+                        */
+    OIF_INSTALL_FLOWS, /* Claimed interface with pb->chassis successfully
+                        * updated in SB and for which flows are still being
+                        * installed.
                         */
     OIF_MARK_UP,       /* Interface with flows successfully installed in OVS
                         * but not yet marked "up" in the binding module (in
@@ -78,6 +81,55 @@  static const char *if_state_names[] = {
     [OIF_INSTALLED]     = "INSTALLED",
 };
 
+/*
+ *       +----------------------+
+ * +---> |                      |
+ * | +-> |         NULL         | <--------------------------------------+++-+
+ * | |   +----------------------+                                            |
+ * | |     ^ release_iface   | claim_iface                                   |
+ * | |     |                 V - sbrec_update_chassis(if sb is rw)           |
+ * | |   +----------------------+                                            |
+ * | |   |                      | <----------------------------------------+ |
+ * | |   |       CLAIMED        | <--------------------------------------+ | |
+ * | |   +----------------------+                                        | | |
+ * | |                  | mgr_update(when sb is rw)                      | | |
+ * | | release_iface    |  - sbrec_update_chassis                        | | |
+ * | |                  |  - request seqno                               | | |
+ * | |                  V                                                | | |
+ * | |   +----------------------+                                        | | |
+ * | +-- |                      |  mgr_run(seqno not rcvd)               | | |
+ * |     |    INSTALL_FLOWS     |   - set port down in sb                | | |
+ * |     |                      |  mgr_update()                          | | |
+ * |     +----------------------+   - sbrec_update_chassis if needed     | | |
+ * |                    |                                                | | |
+ * |                    |  mgr_run(seqno rcvd)                           | | |
+ * |                    |  - set port up in sb                           | | |
+ * | release_iface      |  - set ovn-installed in ovs                    | | |
+ * |                    V                                                | | |
+ * |   +----------------------+                                          | | |
+ * |   |                      |  mgr_run()                               | | |
+ * +-- |       MARK_UP        |  - set port up in sb                     | | |
+ *     |                      |  - set ovn-installed in ovs              | | |
+ *     |                      |  mgr_update()                            | | |
+ *     +----------------------+  - sbrec_update_chassis if needed        | | |
+ *              |                                                        | | |
+ *              | mgr_update(rcvd port up / ovn_installed & chassis set) | | |
+ *              V                                                        | | |
+ *     +----------------------+                                          | | |
+ *     |      INSTALLED       | ------------> claim_iface ---------------+ | |
+ *     +----------------------+                                            | |
+ *              |                                                          | |
+ *              | release_iface                                            | |
+ *              V                                                          | |
+ *     +----------------------+                                            | |
+ *     |                      | ------------> claim_iface -----------------+ |
+ *     |      MARK_DOWN       | ------> mgr_update(rcvd port down) ----------+
+ *     |                      | mgr_run()
+ *     |                      | - set port down in sb
+ *     |                      | mgr_update()
+ *     +----------------------+ - sbrec_update_chassis(NULL)
+ */
+
 struct ovs_iface {
     char *id;               /* Extracted from OVS external_ids.iface_id. */
     enum if_state state;    /* State of the interface in the state machine. */
@@ -85,6 +137,7 @@  struct ovs_iface {
                              * be fully programmed in OVS.  Only used in state
                              * OIF_INSTALL_FLOWS.
                              */
+    bool chassis_update_required;  /* If true, pb->chassis must be updated. */
 };
 
 static uint64_t ifaces_usage;
@@ -158,14 +211,23 @@  if_status_mgr_destroy(struct if_status_mgr *mgr)
 }
 
 void
-if_status_mgr_claim_iface(struct if_status_mgr *mgr, const char *iface_id)
+if_status_mgr_claim_iface(struct if_status_mgr *mgr,
+                          const struct sbrec_port_binding *pb,
+                          const struct sbrec_chassis *chassis_rec,
+                          bool sb_readonly)
 {
+    const char *iface_id = pb->logical_port;
     struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
 
     if (!iface) {
         iface = ovs_iface_create(mgr, iface_id, OIF_CLAIMED);
     }
-
+    if (!sb_readonly) {
+        set_pb_chassis_in_sbrec(pb, chassis_rec, true);
+        iface->chassis_update_required = false;
+    } else {
+        iface->chassis_update_required = true;
+    }
     switch (iface->state) {
     case OIF_CLAIMED:
     case OIF_INSTALL_FLOWS:
@@ -182,6 +244,13 @@  if_status_mgr_claim_iface(struct if_status_mgr *mgr, const char *iface_id)
     }
 }
 
+bool
+if_status_mgr_iface_is_present(struct if_status_mgr *mgr, const char *iface_id)
+{
+    struct ovs_iface *iface = shash_find_data(&mgr->ifaces, iface_id);
+    return (!!iface);
+}
+
 void
 if_status_mgr_release_iface(struct if_status_mgr *mgr, const char *iface_id)
 {
@@ -246,9 +315,43 @@  if_status_mgr_delete_iface(struct if_status_mgr *mgr, const char *iface_id)
     }
 }
 
+bool
+if_status_handle_claims(struct if_status_mgr *mgr,
+                        struct local_binding_data *binding_data,
+                        const struct sbrec_chassis *chassis_rec,
+                        struct hmap *tracked_datapath,
+                        bool sb_readonly)
+{
+    if (!binding_data) {
+        return false;
+    }
+
+    struct shash *bindings = &binding_data->bindings;
+    struct hmapx_node *node;
+
+    bool rc = false;
+    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
+        struct ovs_iface *iface = node->data;
+        if (sb_readonly) {
+            return false;
+        }
+        if (iface->chassis_update_required) {
+            VLOG_INFO("if_status_handle_claims for %s", iface->id);
+            local_binding_set_pb(bindings, iface->id, chassis_rec,
+                                 tracked_datapath, true);
+            rc = true;
+        }
+        iface->chassis_update_required = false;
+    }
+    return rc;
+}
+
 void
 if_status_mgr_update(struct if_status_mgr *mgr,
-                     struct local_binding_data *binding_data)
+                     struct local_binding_data *binding_data,
+                     const struct sbrec_chassis *chassis_rec,
+                     struct hmap *tracked_datapath,
+                     bool sb_readonly)
 {
     if (!binding_data) {
         return;
@@ -257,13 +360,25 @@  if_status_mgr_update(struct if_status_mgr *mgr,
     struct shash *bindings = &binding_data->bindings;
     struct hmapx_node *node;
 
+    /* Interfaces in OIF_MARK_UP state have already set their pb->chassis.
+     * However, it might have been reset by another hv.
+     */
     /* Move all interfaces that have been confirmed "up" by the binding module,
      * from OIF_MARK_UP to OIF_INSTALLED.
      */
     HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_UP]) {
         struct ovs_iface *iface = node->data;
 
-        if (local_binding_is_up(bindings, iface->id)) {
+        if (iface->chassis_update_required) {
+            if (!sb_readonly) {
+                iface->chassis_update_required = false;
+                local_binding_set_pb(bindings, iface->id, chassis_rec,
+                                     tracked_datapath, true);
+            } else {
+                continue;
+            }
+        }
+        if (local_binding_is_up(bindings, iface->id, chassis_rec)) {
             ovs_iface_set_state(mgr, iface, OIF_INSTALLED);
         }
     }
@@ -274,23 +389,53 @@  if_status_mgr_update(struct if_status_mgr *mgr,
     HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_MARK_DOWN]) {
         struct ovs_iface *iface = node->data;
 
-        if (local_binding_is_down(bindings, iface->id)) {
+        if (!sb_readonly) {
+            local_binding_set_pb(bindings, iface->id, chassis_rec,
+                                 tracked_datapath, false);
+        }
+        if (local_binding_is_down(bindings, iface->id, chassis_rec)) {
             ovs_iface_destroy(mgr, iface);
         }
     }
 
-    /* Register for a notification about flows being installed in OVS for all
-     * newly claimed interfaces.
+    if (!sb_readonly) {
+        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
+            struct ovs_iface *iface = node->data;
+
+            if (iface->chassis_update_required) {
+                iface->chassis_update_required = false;
+                local_binding_set_pb(bindings, iface->id, chassis_rec,
+                                     tracked_datapath, true);
+            }
+        }
+    }
+
+    /* Update Port_Binding->chassis for newly claimed interfaces
+     * Register for a notification about flows being installed in OVS for all
+     * newly claimed interfaces for which we could update pb->chassis.
      *
      * Move them from OIF_CLAIMED to OIF_INSTALL_FLOWS.
      */
-    bool new_ifaces = false;
-    HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
-        struct ovs_iface *iface = node->data;
 
-        ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
-        iface->install_seqno = mgr->iface_seqno + 1;
-        new_ifaces = true;
+    bool new_ifaces = false;
+    if (!sb_readonly) {
+        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
+            struct ovs_iface *iface = node->data;
+            if (iface->chassis_update_required) {
+                local_binding_set_pb(bindings, iface->id, chassis_rec,
+                                     tracked_datapath, true);
+            }
+            ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
+            iface->install_seqno = mgr->iface_seqno + 1;
+            iface->chassis_update_required = false;
+            new_ifaces = true;
+        }
+    } else {
+        HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
+            struct ovs_iface *iface = node->data;
+            VLOG_INFO("Not updating pb chassis for %s now as sb is readonly",
+                     iface->id);
+        }
     }
 
     /* Request a seqno update when the flows for new interfaces have been
@@ -403,7 +548,7 @@  if_status_mgr_update_bindings(struct if_status_mgr *mgr,
     struct hmapx_node *node;
 
     /* Notify the binding module to set "down" all bindings that are still
-     * in the process of being installed in OVS, i.e., are not yet instsalled.
+     * in the process of being installed in OVS, i.e., are not yet installed.
      */
     HMAPX_FOR_EACH (node, &mgr->ifaces_per_state[OIF_INSTALL_FLOWS]) {
         struct ovs_iface *iface = node->data;
diff --git a/controller/if-status.h b/controller/if-status.h
index bb8a3950d..9b1200ff0 100644
--- a/controller/if-status.h
+++ b/controller/if-status.h
@@ -27,15 +27,28 @@  struct if_status_mgr *if_status_mgr_create(void);
 void if_status_mgr_clear(struct if_status_mgr *);
 void if_status_mgr_destroy(struct if_status_mgr *);
 
-void if_status_mgr_claim_iface(struct if_status_mgr *, const char *iface_id);
+void if_status_mgr_claim_iface(struct if_status_mgr *,
+                               const struct sbrec_port_binding *pb,
+                               const struct sbrec_chassis *chassis_rec,
+                               bool sb_readonly);
 void if_status_mgr_release_iface(struct if_status_mgr *, const char *iface_id);
 void if_status_mgr_delete_iface(struct if_status_mgr *, const char *iface_id);
 
-void if_status_mgr_update(struct if_status_mgr *, struct local_binding_data *);
+void if_status_mgr_update(struct if_status_mgr *, struct local_binding_data *,
+                          const struct sbrec_chassis *chassis,
+                          struct hmap *tracked_dp_bindings,
+                          bool sb_readonly);
 void if_status_mgr_run(struct if_status_mgr *mgr, struct local_binding_data *,
                        const struct sbrec_chassis *,
                        bool sb_readonly, bool ovs_readonly);
 void if_status_mgr_get_memory_usage(struct if_status_mgr *mgr,
                                     struct simap *usage);
+bool if_status_mgr_iface_is_present(struct if_status_mgr *mgr,
+                                    const char *iface_id);
+bool if_status_handle_claims(struct if_status_mgr *mgr,
+                             struct local_binding_data *binding_data,
+                             const struct sbrec_chassis *chassis_rec,
+                             struct hmap *tracked_datapath,
+                             bool sb_readonly);
 
 # endif /* controller/if-status.h */
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 2793c8687..1c3cac2c1 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1417,6 +1417,111 @@  en_runtime_data_run(struct engine_node *node, void *data)
     engine_set_node_state(node, EN_UPDATED);
 }
 
+struct ed_type_sb_ro {
+    bool sb_readonly;
+};
+
+static void *
+en_sb_ro_init(struct engine_node *node OVS_UNUSED,
+              struct engine_arg *arg OVS_UNUSED)
+{
+    struct ed_type_sb_ro *data = xzalloc(sizeof *data);
+    return data;
+}
+
+static void
+en_sb_ro_run(struct engine_node *node, void *data)
+{
+    struct ed_type_sb_ro *sb_ro_data = data;
+    bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
+    if (sb_ro_data->sb_readonly != sb_readonly) {
+        sb_ro_data->sb_readonly = sb_readonly;
+        if (!sb_ro_data->sb_readonly) {
+            engine_set_node_state(node, EN_UPDATED);
+        }
+    }
+}
+
+static void
+en_sb_ro_cleanup(void *data OVS_UNUSED)
+{
+}
+
+static bool
+pb_claims_sb_ro_handler(struct engine_node *node, void *data OVS_UNUSED)
+{
+   engine_set_node_state(node, EN_UPDATED);
+   return true;
+}
+
+static void *
+en_pb_claims_init(struct engine_node *node OVS_UNUSED,
+                  struct engine_arg *arg OVS_UNUSED)
+{
+    return NULL;
+}
+
+static void
+en_pb_claims_run(struct engine_node *node OVS_UNUSED, void *data_ OVS_UNUSED)
+{
+}
+
+static void
+en_pb_claims_cleanup(void *data OVS_UNUSED)
+{
+}
+
+static bool
+pb_claims_runtime_data_handler(struct engine_node *node OVS_UNUSED,
+                               void *data OVS_UNUSED)
+{
+    return true;
+}
+
+static bool
+lflow_output_runtime_data_handler(struct engine_node *node,
+                                  void *data);
+static bool
+lflow_output_pb_claims_handler(struct engine_node *node, void *data OVS_UNUSED)
+{
+    const struct sbrec_chassis *chassis = NULL;
+
+    struct ovsrec_open_vswitch_table *ovs_table =
+        (struct ovsrec_open_vswitch_table *)EN_OVSDB_GET(
+            engine_get_input("OVS_open_vswitch", node));
+
+    const char *chassis_id = get_ovs_chassis_id(ovs_table);
+
+    struct ovsdb_idl_index *sbrec_chassis_by_name =
+        engine_ovsdb_node_get_index(
+                engine_get_input("SB_chassis", node),
+                "name");
+
+    if (chassis_id) {
+        chassis = chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id);
+    }
+    if (chassis) {
+        struct ed_type_runtime_data *runtime_data =
+            engine_get_input_data("runtime_data", node);
+        bool sb_readonly = !engine_get_context()->ovnsb_idl_txn;
+        struct controller_engine_ctx *ctrl_ctx =
+            engine_get_context()->client_ctx;
+
+        if (if_status_handle_claims(ctrl_ctx->if_mgr,
+                                    &runtime_data->lbinding_data,
+                                    chassis,
+                                    &runtime_data->tracked_dp_bindings,
+                                    sb_readonly)) {
+            struct engine_node *rt_node =
+                engine_get_input("runtime_data", node);
+            if (!engine_node_changed(rt_node)) {
+                lflow_output_runtime_data_handler(node, data);
+            }
+        }
+    }
+    return true;
+}
+
 static bool
 runtime_data_ovs_interface_shadow_handler(struct engine_node *node, void *data)
 {
@@ -3438,6 +3543,8 @@  main(int argc, char *argv[])
     stopwatch_create(VIF_PLUG_RUN_STOPWATCH_NAME, SW_MS);
 
     /* Define inc-proc-engine nodes. */
+    ENGINE_NODE(sb_ro, "sb_ro");
+    ENGINE_NODE(pb_claims, "pb_claims");
     ENGINE_NODE_WITH_CLEAR_TRACK_DATA_IS_VALID(ct_zones, "ct_zones");
     ENGINE_NODE_WITH_CLEAR_TRACK_DATA(ovs_interface_shadow,
                                       "ovs_interface_shadow");
@@ -3477,6 +3584,13 @@  main(int argc, char *argv[])
     engine_add_input(&en_non_vif_data, &en_ovs_interface,
                      non_vif_data_ovs_iface_handler);
 
+    /* This ensures we run after runtime data. */
+    engine_add_input(&en_pb_claims, &en_runtime_data,
+                     pb_claims_runtime_data_handler);
+
+    /* This ensures we always run when sb becomes writable. */
+    engine_add_input(&en_pb_claims, &en_sb_ro, pb_claims_sb_ro_handler);
+
     /* Note: The order of inputs is important, all OVS interface changes must
      * be handled before any ct_zone changes.
      */
@@ -3518,6 +3632,8 @@  main(int argc, char *argv[])
                      lflow_output_port_groups_handler);
     engine_add_input(&en_lflow_output, &en_runtime_data,
                      lflow_output_runtime_data_handler);
+    engine_add_input(&en_lflow_output, &en_pb_claims,
+                     lflow_output_pb_claims_handler);
     engine_add_input(&en_lflow_output, &en_non_vif_data,
                      NULL);
 
@@ -3999,7 +4115,10 @@  main(int argc, char *argv[])
                         runtime_data ? &runtime_data->lbinding_data : NULL;
                     stopwatch_start(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
                                     time_msec());
-                    if_status_mgr_update(if_mgr, binding_data);
+                    if_status_mgr_update(if_mgr, binding_data, chassis,
+                                         (runtime_data
+                                          ? &runtime_data->tracked_dp_bindings
+                                          : NULL), !ovnsb_idl_txn);
                     stopwatch_stop(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
                                    time_msec());
 
diff --git a/tests/ovn.at b/tests/ovn.at
index 59d51f3e0..e9c061939 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -14990,7 +14990,10 @@  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep actions=output:1],
 echo "verifying that lsp0 binding moves when requested-chassis is changed"
 
 ovn-nbctl lsp-set-options lsp0 requested-chassis=hv2
-OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
+
+# We might see multiple "Releasing lport ...", when sb is read only
+OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
+
 wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
 
 # (6) Chassis hv2 should add flows and hv1 should not.
@@ -15036,7 +15039,7 @@  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=0 | grep in_port=1], [0], [ig
 AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep actions=output:1], [0], [ignore])
 
 check ovn-nbctl --wait=hv lsp-set-options lsp0 requested-chassis=non-existant-chassis
-OVS_WAIT_UNTIL([test 1 = $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
+OVS_WAIT_UNTIL([test 1 -le $(grep -c "Releasing lport lsp0 from this chassis" hv1/ovn-controller.log)])
 check ovn-nbctl --wait=hv sync
 wait_column '' Port_Binding chasssi logical_port=lsp0
 AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=0 | grep in_port=1], [1], [])
@@ -32041,3 +32044,109 @@  AT_CHECK([test $(ovn-sbctl list fdb | grep -c "00:00:00:00:10:30") = 0])
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
+
+# RUN_OVN_NBCTL() executes list of commands built by the OVN_NBCTL() macro.
+m4_define([OVN_NBCTL], [
+    command="${command} -- $1"
+])
+
+m4_define([RUN_OVN_NBCTL], [
+    check ovn-nbctl ${command}
+    unset command
+])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([recomputes])
+ovn_start
+
+n_hv=4
+
+# Add chassis
+net_add n1
+for i in $(seq 1 $n_hv); do
+    sim_add hv$i
+    as hv$i
+    check ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovn_attach n1 br-phys 192.168.0.$i 24 geneve
+done
+
+add_switch_ports() {
+    start_port=$1
+    end_port=$2
+    nb_hv=$3
+    bulk_size=$4
+    for ((i=start_port; i<end_port; )) do
+        start_bulk=$i
+        for hv in $(seq 1 $nb_hv); do
+            end_bulk=$((start_bulk+bulk_size-1))
+            for port in $(seq $start_bulk $end_bulk); do
+                logical_switch_port=lsp${port}
+                OVN_NBCTL(lsp-add ls1 $logical_switch_port)
+                OVN_NBCTL(lsp-set-addresses $logical_switch_port dynamic)
+            done
+            start_bulk=$((end_bulk+1))
+        done
+        RUN_OVN_NBCTL()
+
+        start_bulk=$i
+        for hv in $(seq 1 $nb_hv); do
+            end_bulk=$((start_bulk+bulk_size-1))
+            for port in $(seq $start_bulk $end_bulk); do
+                logical_switch_port=lsp${port}
+                as hv$hv ovs-vsctl \
+                    --no-wait -- add-port br-int vif${port} \
+                    -- set Interface vif${port} external_ids:iface-id=$logical_switch_port
+            done
+            start_bulk=$((end_bulk+1))
+        done
+        i=$((end_bulk+1))
+    done
+
+    # Claiming still WIP here but check for recomputes so test can fail faster
+    AT_CHECK([test $(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) == $lflow_run])
+}
+check ovn-nbctl ls-add ls1
+check ovn-nbctl set Logical_Switch ls1 other_config:subnet=10.1.0.0/16
+check ovn-nbctl set Logical_Switch ls1 other_config:exclude_ips=10.1.255.254
+
+check ovn-nbctl lr-add lr1
+check ovn-nbctl lsp-add ls1 lsp0 -- set Logical_Switch_Port lsp0 type=router options:router-port=lrp0 addresses=dynamic
+check ovn-nbctl lrp-add lr1 lrp0 "f0:00:00:01:00:01" 10.1.255.254/16
+check ovn-nbctl lr-nat-add lr1 snat 10.2.0.1 10.1.0.0/16
+
+check ovn-nbctl --wait=hv sync
+lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
+
+add_switch_ports 1 1000 $n_hv 5
+
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+for i in $(seq 1 $n_hv); do
+    pid=$(cat hv${i}/ovn-controller.pid)
+    u=$((u+$(cat "/proc/$pid/stat" | awk '{print $14}')))
+    s=$((s+$(cat "/proc/$pid/stat" | awk '{print $15}')))
+done
+
+n_pid=$(cat northd/ovn-northd.pid)
+n_u=$(cat "/proc/$pid/stat" | awk '{print $14}')
+n_s=$(cat "/proc/$pid/stat" | awk '{print $15}')
+
+echo "Total Northd User Time: $n_u"
+echo "Total Northd System Time: $n_s"
+echo "Total Controller User Time: $u"
+echo "Total Controller System Time: $s"
+
+lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
+n_recomputes=`expr $lflow_run1 - $lflow_run`
+echo "$n_recomputes recomputes"
+
+AT_CHECK([test $lflow_run1 == $lflow_run])
+
+for i in $(seq 2 $n_hv); do
+    OVN_CLEANUP_SBOX([hv$i])
+done
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])