[ovs-dev,v3,2/5] ovn: Add generic HA chassis group

Message ID 20190305155312.12721-1-nusiddiq@redhat.com
State Superseded
Headers show
Series
  • ovn: Add HA chassis group and 'external' port support
Related show

Commit Message

Numan Siddique March 5, 2019, 3:53 p.m.
From: Numan Siddique <nusiddiq@redhat.com>

This patch adds the tables - 'HA_Chassis_Group' and 'HA_Chassis' in
both OVN Northbound and Southbound DBs to support generic HA Chassis
groups in OVN. CMS can create a group of HA chassis with the priorities
assigned to each chassis in the group. An HA chassis group can be associated to
a distributed logical router port. An upcoming patch will make
use of it while supporting  'external'* logical ports.

HA chassis group is similar to the existing gateway chassis support in
OVN which is used by the distributed gateway router ports.
This patch tries to abstract this so that, the HA chassis support
can be leveraged by not just distributed gateway router ports.

If a logical router port has a set of gateway chassis associated to
it, ovn-northd will create HA chassis group in Southbound
DB and add these gateway chassis to this group. ovn-northd would still create
gateway chassis in Southbound DB as ovn-controller still doesn't support
using the HA chassis group.

Next patch in the series will add the support in ovn-controller to
make use of HA chassis group instead of gateway chassis. The patch following
that will delete creation of gateway chassis in Southbound DB.

HA_Chasss_Group table in Southbound DB has a column - 'ref_chassis'.
This column is used to store the list of chassis which references the
HA chassis group. This information will be used by ovn-controller in an
upcoming patch to establish BFD sessions with the required chassis.

Suppose if there is an HA chassis group - 'hagrp1' in the Southbound
DB and it has HA chasiss list - ha1, ha2 and ha3 and this HA chassis
group is used by a distributed logical router port, then ovn-northd
will update the 'ref_chassis' with the list of chassis which has claimed
all the logical switch ports which are connected to the logical router
which has this distributed logical router port.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---
 ovn/lib/chassis-index.c       |  26 +++
 ovn/lib/chassis-index.h       |   4 +
 ovn/northd/ovn-northd.c       | 374 ++++++++++++++++++++++++++++++++--
 ovn/ovn-nb.ovsschema          |  36 +++-
 ovn/ovn-nb.xml                |  75 +++++++
 ovn/ovn-sb.ovsschema          |  43 +++-
 ovn/ovn-sb.xml                |  63 ++++++
 ovn/utilities/ovn-nbctl.8.xml |  41 ++++
 ovn/utilities/ovn-nbctl.c     | 221 ++++++++++++++++++++
 ovn/utilities/ovn-sbctl.c     |   6 +
 tests/ovn-northd.at           | 262 ++++++++++++++++++++++++
 tests/ovn.at                  | 150 +++++++++++---
 12 files changed, 1247 insertions(+), 54 deletions(-)

Comments

0-day Robot March 5, 2019, 5:03 p.m. | #1
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
#1022 FILE: ovn/utilities/ovn-nbctl.c:698:
  ha-chassis-group-add GRP  Create an HA chassis group GRP\n\

WARNING: Line lacks whitespace around operator
#1023 FILE: ovn/utilities/ovn-nbctl.c:699:
  ha-chassis-group-del GRP  Delete the HA chassis group GRP\n\

WARNING: Line lacks whitespace around operator
#1024 FILE: ovn/utilities/ovn-nbctl.c:700:
  ha-chassis-group-list     List the HA chassis groups\n\

WARNING: Line lacks whitespace around operator
#1025 FILE: ovn/utilities/ovn-nbctl.c:701:
  ha-chassis-group-add-chassis GRP CHASSIS [PRIORITY] Adds an HA\

WARNING: Line lacks whitespace around operator
#1027 FILE: ovn/utilities/ovn-nbctl.c:703:
  ha-chassis-group-del-chassis GRP CHASSIS Deletes the HA chassis\

Lines checked: 1816, Warnings: 5, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
Han Zhou March 5, 2019, 5:58 p.m. | #2
On Tue, Mar 5, 2019 at 8:06 AM <nusiddiq@redhat.com> wrote:
>
> From: Numan Siddique <nusiddiq@redhat.com>
>
> This patch adds the tables - 'HA_Chassis_Group' and 'HA_Chassis' in
> both OVN Northbound and Southbound DBs to support generic HA Chassis
> groups in OVN. CMS can create a group of HA chassis with the priorities
> assigned to each chassis in the group. An HA chassis group can be associated to
> a distributed logical router port. An upcoming patch will make
> use of it while supporting  'external'* logical ports.
>
> HA chassis group is similar to the existing gateway chassis support in
> OVN which is used by the distributed gateway router ports.
> This patch tries to abstract this so that, the HA chassis support
> can be leveraged by not just distributed gateway router ports.
>
> If a logical router port has a set of gateway chassis associated to
> it, ovn-northd will create HA chassis group in Southbound
> DB and add these gateway chassis to this group. ovn-northd would still create
> gateway chassis in Southbound DB as ovn-controller still doesn't support
> using the HA chassis group.
>
> Next patch in the series will add the support in ovn-controller to
> make use of HA chassis group instead of gateway chassis. The patch following
> that will delete creation of gateway chassis in Southbound DB.
>
> HA_Chasss_Group table in Southbound DB has a column - 'ref_chassis'.
> This column is used to store the list of chassis which references the
> HA chassis group. This information will be used by ovn-controller in an
> upcoming patch to establish BFD sessions with the required chassis.
>
> Suppose if there is an HA chassis group - 'hagrp1' in the Southbound
> DB and it has HA chasiss list - ha1, ha2 and ha3 and this HA chassis
> group is used by a distributed logical router port, then ovn-northd
> will update the 'ref_chassis' with the list of chassis which has claimed
> all the logical switch ports which are connected to the logical router
> which has this distributed logical router port.
>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> ---
>  ovn/lib/chassis-index.c       |  26 +++
>  ovn/lib/chassis-index.h       |   4 +
>  ovn/northd/ovn-northd.c       | 374 ++++++++++++++++++++++++++++++++--
>  ovn/ovn-nb.ovsschema          |  36 +++-
>  ovn/ovn-nb.xml                |  75 +++++++
>  ovn/ovn-sb.ovsschema          |  43 +++-
>  ovn/ovn-sb.xml                |  63 ++++++
>  ovn/utilities/ovn-nbctl.8.xml |  41 ++++
>  ovn/utilities/ovn-nbctl.c     | 221 ++++++++++++++++++++
>  ovn/utilities/ovn-sbctl.c     |   6 +
>  tests/ovn-northd.at           | 262 ++++++++++++++++++++++++
>  tests/ovn.at                  | 150 +++++++++++---
>  12 files changed, 1247 insertions(+), 54 deletions(-)
>
> diff --git a/ovn/lib/chassis-index.c b/ovn/lib/chassis-index.c
> index a5dbf4ace..34d4a31eb 100644
> --- a/ovn/lib/chassis-index.c
> +++ b/ovn/lib/chassis-index.c
> @@ -39,3 +39,29 @@ chassis_lookup_by_name(struct ovsdb_idl_index *sbrec_chassis_by_name,
>
>      return retval;
>  }
> +
> +struct ovsdb_idl_index *
> +ha_chassis_group_index_create(struct ovsdb_idl *idl)
> +{
> +    return ovsdb_idl_index_create1(idl, &sbrec_ha_chassis_group_col_name);
> +}
> +
> +/* Finds and returns the HA chassis group with the given 'name', or NULL
> + * if no such HA chassis group exists. */
> +const struct sbrec_ha_chassis_group *
> +ha_chassis_group_lookup_by_name(
> +    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name,
> +    const char *name)
> +{
> +    struct sbrec_ha_chassis_group *target =
> +        sbrec_ha_chassis_group_index_init_row(sbrec_ha_chassis_grp_by_name);
> +    sbrec_ha_chassis_group_set_name(target, name);
> +
> +    struct sbrec_ha_chassis_group *retval =
> +        sbrec_ha_chassis_group_index_find(sbrec_ha_chassis_grp_by_name,
> +                                          target);
> +
> +    sbrec_ha_chassis_group_index_destroy_row(target);
> +
> +    return retval;
> +}
> diff --git a/ovn/lib/chassis-index.h b/ovn/lib/chassis-index.h
> index d5e5df926..9bc610ad2 100644
> --- a/ovn/lib/chassis-index.h
> +++ b/ovn/lib/chassis-index.h
> @@ -23,4 +23,8 @@ struct ovsdb_idl_index *chassis_index_create(struct ovsdb_idl *);
>  const struct sbrec_chassis *chassis_lookup_by_name(
>      struct ovsdb_idl_index *sbrec_chassis_by_name, const char *name);
>
> +struct ovsdb_idl_index *ha_chassis_group_index_create(struct ovsdb_idl *idl);
> +const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
> +    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const char *name);
> +
>  #endif /* ovn/lib/chassis-index.h */
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 838251dd1..f4fc5be2f 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -56,6 +56,7 @@ struct northd_context {
>      struct ovsdb_idl *ovnsb_idl;
>      struct ovsdb_idl_txn *ovnnb_txn;
>      struct ovsdb_idl_txn *ovnsb_txn;
> +    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
>  };
>
>  static const char *ovnnb_db;
> @@ -1991,6 +1992,13 @@ sbpb_gw_chassis_needs_update(
>          return false;
>      }
>
> +    if (lrp->n_gateway_chassis && !port_binding->ha_chassis_group) {
> +        /* If there are gateway chassis in the NB DB, but there is
> +         * no corresponding HA chassis group in SB DB we need to
> +         * create the HA chassis group in SB DB for this lrp. */
> +        return true;
> +    }
> +
>      /* These arrays are used to collect valid Gateway_Chassis and valid
>       * Chassis records from the Logical_Router_Port Gateway_Chassis list,
>       * we ignore the ones we can't match on the SBDB */
> @@ -2060,31 +2068,63 @@ sbpb_gw_chassis_needs_update(
>      return false;
>  }
>
> +static struct sbrec_ha_chassis *
> +create_sb_ha_chassis(struct northd_context *ctx,
> +                     const struct sbrec_chassis *chassis, int priority)
> +{
> +    struct sbrec_ha_chassis *sb_ha_chassis =
> +        sbrec_ha_chassis_insert(ctx->ovnsb_txn);
> +    sbrec_ha_chassis_set_chassis(sb_ha_chassis, chassis);
> +    sbrec_ha_chassis_set_priority(sb_ha_chassis, priority);
> +    return sb_ha_chassis;
> +}
> +
>  /* This functions translates the gw chassis on the nb database
>   * to sb database entries, the only difference is that SB database
>   * Gateway_Chassis table references the chassis directly instead
> - * of using the name */
> + * of using the name.
> + *
> + * This function also creates a HA Chassis group in SB DB for
> + * the gateway chassis associated to a distributed gateway
> + * router port in the NB DB.
> + *
> + * An upcoming patch will delete the code to create the Gateway chassis
> + * in SB DB.*/
>  static void
>  copy_gw_chassis_from_nbrp_to_sbpb(
>          struct northd_context *ctx,
>          struct ovsdb_idl_index *sbrec_chassis_by_name,
>          const struct nbrec_logical_router_port *lrp,
> -        const struct sbrec_port_binding *port_binding) {
> -
> -    if (!lrp || !port_binding || !lrp->n_gateway_chassis) {
> -        return;
> -    }
> -
> +        const struct sbrec_port_binding *port_binding)
> +{
>      struct sbrec_gateway_chassis **gw_chassis = NULL;
>      int n_gwc = 0;
>      int n;
>
> +    /* Make use of the new HA chassis group table to support HA
> +     * for the distributed gateway router port. We can delete
> +     * the old gateway_chassis code once ovn-controller supports
> +     * HA chassis group. */
> +    const struct sbrec_ha_chassis_group *sb_ha_chassis_group =
> +        ha_chassis_group_lookup_by_name(
> +            ctx->sbrec_ha_chassis_grp_by_name, lrp->name);
> +    if (!sb_ha_chassis_group) {
> +        sb_ha_chassis_group = sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
> +        sbrec_ha_chassis_group_set_name(sb_ha_chassis_group, lrp->name);
> +    }
> +
> +    struct sbrec_ha_chassis **sb_ha_chassis = xcalloc(lrp->n_gateway_chassis,
> +                                                      sizeof *sb_ha_chassis);
> +    size_t n_sb_ha_ch = 0;
>      /* XXX: This can be improved. This code will generate a set of new
>       * Gateway_Chassis and push them all in a single transaction, instead
>       * this would be more optimal if we just add/update/remove the rows in
>       * the southbound db that need to change. We don't expect lots of
>       * changes to the Gateway_Chassis table, but if that proves to be wrong
> -     * we should optimize this. */
> +     * we should optimize this.
> +     *
> +     * Note: Remove the below code to add gateway_chassis row in OVN
> +     * Southbound db once ovn-controller supports HA chassis group. */
>      for (n = 0; n < lrp->n_gateway_chassis; n++) {
>          struct nbrec_gateway_chassis *lrp_gwc = lrp->gateway_chassis[n];
>          if (!lrp_gwc->chassis_name) {
> @@ -2097,6 +2137,8 @@ copy_gw_chassis_from_nbrp_to_sbpb(
>
>          gw_chassis = xrealloc(gw_chassis, (n_gwc + 1) * sizeof *gw_chassis);
>
> +        /* This code to create gateway_chassis in SB DB needs to be deleted
> +         * once ovn-controller supports making use of HA chassis groups. */
>          struct sbrec_gateway_chassis *pb_gwc =
>              sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
>
> @@ -2107,16 +2149,26 @@ copy_gw_chassis_from_nbrp_to_sbpb(
>          sbrec_gateway_chassis_set_external_ids(pb_gwc, &lrp_gwc->external_ids);
>
>          gw_chassis[n_gwc++] = pb_gwc;
> +
> +        sb_ha_chassis[n_sb_ha_ch] =
> +            create_sb_ha_chassis(ctx, chassis, lrp_gwc->priority);
> +        n_sb_ha_ch++;
>      }
>      sbrec_port_binding_set_gateway_chassis(port_binding, gw_chassis, n_gwc);
>      free(gw_chassis);
> +
> +    sbrec_ha_chassis_group_set_ha_chassis(sb_ha_chassis_group,
> +                                          sb_ha_chassis, n_sb_ha_ch);
> +    sbrec_port_binding_set_ha_chassis_group(port_binding, sb_ha_chassis_group);
> +    free(sb_ha_chassis);
>  }
>
>  static void
>  ovn_port_update_sbrec(struct northd_context *ctx,
>                        struct ovsdb_idl_index *sbrec_chassis_by_name,
>                        const struct ovn_port *op,
> -                      struct hmap *chassis_qdisc_queues)
> +                      struct hmap *chassis_qdisc_queues,
> +                      struct sset *active_ha_chassis_grps)
>  {
>      sbrec_port_binding_set_datapath(op->sb, op->od->sb);
>      if (op->nbrp) {
> @@ -2153,6 +2205,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
>                                                        op->nbrp, op->sb);
>                  }
>
> +                sset_add(active_ha_chassis_grps, op->nbrp->name);
>              } else if (redirect_chassis) {
>                  /* Handle ports that had redirect-chassis option attached
>                   * to them, and for backwards compatibility convert them
> @@ -2163,20 +2216,23 @@ ovn_port_update_sbrec(struct northd_context *ctx,
>                  if (chassis) {
>                      /* If we found the chassis, and the gw chassis on record
>                       * differs from what we expect go ahead and update */
> +                    char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
> +                                               chassis->name);
>                      if (op->sb->n_gateway_chassis != 1
>                          || !op->sb->gateway_chassis[0]->chassis
>                          || strcmp(op->sb->gateway_chassis[0]->chassis->name,
>                                    chassis->name)
>                          || op->sb->gateway_chassis[0]->priority != 0) {
> +                        /* This code to create gateway_chassis in SB DB needs
> +                         * to be deleted once ovn-controller supports making
> +                         * use of HA chassis groups. */
> +
>                          /* Construct a single Gateway_Chassis entry on the
>                           * Port_Binding attached to the redirect_chassis
>                           * name */
>                          struct sbrec_gateway_chassis *gw_chassis =
>                              sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
>
> -                        char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
> -                                chassis->name);
> -
>                          /* XXX: Again, here, we could just update an existing
>                           * Gateway_Chassis, instead of creating a new one
>                           * and replacing it */
> @@ -2187,8 +2243,31 @@ ovn_port_update_sbrec(struct northd_context *ctx,
>                                  &op->nbrp->external_ids);
>                          sbrec_port_binding_set_gateway_chassis(op->sb,
>                                                                 &gw_chassis, 1);
> -                        free(gwc_name);
>                      }
> +
> +                    /* Create HA chassis group in SB DB for the
> +                     * redirect-chassis option. */
> +                    const struct sbrec_ha_chassis_group *sb_ha_ch_grp;
> +                    sb_ha_ch_grp = ha_chassis_group_lookup_by_name(
> +                        ctx->sbrec_ha_chassis_grp_by_name, gwc_name);
> +                    if (!sb_ha_ch_grp) {
> +                        sb_ha_ch_grp =
> +                            sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
> +                        sbrec_ha_chassis_group_set_name(sb_ha_ch_grp,
> +                                                        gwc_name);
> +                    }
> +
> +                    if (sb_ha_ch_grp->n_ha_chassis != 1) {
> +                        struct sbrec_ha_chassis **sb_ha_ch =
> +                            xcalloc(1, sizeof *sb_ha_ch);
> +                        sb_ha_ch[0] = create_sb_ha_chassis(ctx, chassis, 0);
> +                        sbrec_ha_chassis_group_set_ha_chassis(sb_ha_ch_grp,
> +                                                              sb_ha_ch, 1);
> +                    }
> +                    sbrec_port_binding_set_ha_chassis_group(op->sb,
> +                                                            sb_ha_ch_grp);
> +                    sset_add(active_ha_chassis_grps, gwc_name);
> +                    free(gwc_name);
>                  } else {
>                      VLOG_WARN("chassis name '%s' from redirect from logical "
>                                " router port '%s' redirect-chassis not found",
> @@ -2359,6 +2438,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
>      }
>  }
>
> +static void
> +cleanup_sb_ha_chassis_groups(struct northd_context *ctx,
> +                             struct sset *active_ha_chassis_groups)
> +{
> +    const struct sbrec_ha_chassis_group *b, *n;
> +    SBREC_HA_CHASSIS_GROUP_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> +        if (!sset_contains(active_ha_chassis_groups, b->name)) {
> +            sbrec_ha_chassis_group_delete(b);
> +        }
> +    }
> +}
> +
>  /* Updates the southbound Port_Binding table so that it contains the logical
>   * switch ports specified by the northbound database.
>   *
> @@ -2374,6 +2465,10 @@ build_ports(struct northd_context *ctx,
>      struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
>      struct hmap chassis_qdisc_queues = HMAP_INITIALIZER(&chassis_qdisc_queues);
>
> +    /* sset which stores the set of ha chassis group names used. */
> +    struct sset active_ha_chassis_grps =
> +        SSET_INITIALIZER(&active_ha_chassis_grps);
> +
>      join_logical_ports(ctx, datapaths, ports, &chassis_qdisc_queues,
>                         &tag_alloc_table, &sb_only, &nb_only, &both);
>
> @@ -2387,8 +2482,8 @@ build_ports(struct northd_context *ctx,
>              tag_alloc_create_new_tag(&tag_alloc_table, op->nbsp);
>          }
>          ovn_port_update_sbrec(ctx, sbrec_chassis_by_name,
> -                              op, &chassis_qdisc_queues);
> -
> +                              op, &chassis_qdisc_queues,
> +                              &active_ha_chassis_grps);
>          add_tnlid(&op->od->port_tnlids, op->sb->tunnel_key);
>          if (op->sb->tunnel_key > op->od->port_key_hint) {
>              op->od->port_key_hint = op->sb->tunnel_key;
> @@ -2404,8 +2499,8 @@ build_ports(struct northd_context *ctx,
>
>          op->sb = sbrec_port_binding_insert(ctx->ovnsb_txn);
>          ovn_port_update_sbrec(ctx, sbrec_chassis_by_name, op,
> -                              &chassis_qdisc_queues);
> -
> +                              &chassis_qdisc_queues,
> +                              &active_ha_chassis_grps);
>          sbrec_port_binding_set_logical_port(op->sb, op->key);
>          sbrec_port_binding_set_tunnel_key(op->sb, tunnel_key);
>      }
> @@ -2427,6 +2522,8 @@ build_ports(struct northd_context *ctx,
>
>      tag_alloc_destroy(&tag_alloc_table);
>      destroy_chassis_queues(&chassis_qdisc_queues);
> +    cleanup_sb_ha_chassis_groups(ctx, &active_ha_chassis_grps);
> +    sset_destroy(&active_ha_chassis_grps);
>  }
>
>  #define OVN_MIN_MULTICAST 32768
> @@ -7306,13 +7403,216 @@ ovnnb_db_run(struct northd_context *ctx,
>      cleanup_macam(&macam);
>  }
>
> +/* Stores the list of chassis which references an ha_chassis_group.
> + */
> +struct ha_ref_chassis_info {
> +    const struct sbrec_ha_chassis_group *ha_chassis_group;
> +    struct sbrec_chassis **ref_chassis;
> +    size_t n_ref_chassis;
> +};
> +
> +static void
> +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> +                           const struct sbrec_chassis *chassis)
> +{
> +    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> +        if (ref_ch_info->ref_chassis[j] == chassis) {
> +           return;
> +        }
> +    }
> +
> +    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> +                                        sizeof *ref_ch_info->ref_chassis *
> +                                        (++ref_ch_info->n_ref_chassis));

This may be inefficient, considering the amount of ref chassises to be
added for each HA group. It is better to xrealloc for original_size *
2 every time and expand only when more space is needed.

> +    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> +        CONST_CAST(struct sbrec_chassis *, chassis);
> +}
> +
> +static void
> +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> +{
> +    struct shash_node *node, *next;
> +    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> +        struct ha_ref_chassis_info *ha_ref_info = node->data;
> +        sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> +                                               ha_ref_info->ref_chassis,
> +                                               ha_ref_info->n_ref_chassis);
> +        free(ha_ref_info->ref_chassis);
> +        free(ha_ref_info);
> +        shash_delete(ha_ref_chassis_map, node);
> +    }
> +}
> +
> +/* This function returns logical router datapath (with a distributed
> + * gateway port) to which 'od' is connected to - either directly
> + * or indirectly (via transit logical switches).
> + * Returns NULL if no logical router with gw port found.
> + */
> +static struct ovn_datapath *
> +get_router_dp_with_gw_port(struct hmap *ports,
> +                           struct ovn_datapath *od,
> +                           struct ovn_datapath *peer_od)
> +{
> +    if (!od) {
> +        return NULL;
> +    }
> +
> +    if (od->nbs) {
> +        /* It's a logical switch datapath. */
> +        if (peer_od) {
> +            /* If peer datapath is not logical router, then
> +             * something is wrong. */
> +            ovs_assert(peer_od->nbr);
> +        }
> +
> +        for (size_t i = 0; i < od->n_router_ports; i++) {
> +            if (!od->router_ports[i]->peer) {
> +                /* If there is no peer port connecting to the
> +                 * router datapath, ignore it. */
> +                continue;
> +            }
> +
> +            struct ovn_datapath *router_dp = od->router_ports[i]->peer->od;
> +            if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp) {
> +                /* Router datapath has a distributed gateway router port. */
> +                return router_dp;

I think we can't return when just one router_dp is found. There can be
more than one connected router that has gateway router ports. So the
return value of this function should be a set.

> +            }
> +        }
> +
> +        /* The logical switch datapath is not connected to any
> +         * logical router with a distributed gateway port. Check if it
> +         * is indirectly connected to a logical router with a gw port. */
> +        for (size_t i = 0; i < od->n_router_ports; i++) {
> +            if (!od->router_ports[i]->peer) {
> +                continue;
> +            }
> +
> +            struct ovn_datapath *router_dp =
> +                od->router_ports[i]->peer->od;
> +
> +            /* If we don't check this, we will be in an infinite loop. */
> +            if (router_dp != peer_od) {

peer_od should also be a set, it should skip checking any datapath
that has already been checked. Consider the case LR1 is connected with
LS1, LS2 and LS3.

> +                router_dp = get_router_dp_with_gw_port(ports, router_dp,
> +                                                       od);
> +                if (router_dp) {
> +                    /* Found a logical router with gw port indirectly connected
> +                     * to 'od'. */
> +                    return router_dp;
> +                }
> +            }
> +        }
> +    } else if (od->nbr) {
> +        /* It's a logical router datapath. */
> +        if (peer_od) {
> +            /* If peer datapath is not logical switch, then
> +             * something is wrong. */
> +            ovs_assert(peer_od->nbs);

A router port can be peered with another router port directly, so this
assert is not true.

> +        }
> +
> +        /* Check if this logical router datapath is indirectly connected
> +         * to another logical router via a transit logical switch(es). */
> +        for (size_t i = 0; i < od->nbr->n_ports; i++) {
> +            struct ovn_port *router_port =
> +                ovn_port_find(ports, od->nbr->ports[i]->name);
> +
> +            if (!router_port || !router_port->peer) {
> +                continue;
> +            }
> +            /* If we don't check this, we will be in an infinite loop. */
> +            if (router_port->peer->od != peer_od) {
> +                struct ovn_datapath *router_dp;
> +                /* router_port->peer->od points a logical switch datapath. */
> +                router_dp = get_router_dp_with_gw_port(ports,
> +                                                       router_port->peer->od,
> +                                                       od);
> +                if (router_dp) {
> +                    /* Found a logical router with gw port indirectly connected
> +                    * to 'od'. */
> +                    return router_dp;
> +                }
> +            }
> +        }
> +    }
> +
> +    return NULL;
> +}

In general, it may refer to the flood-fill approach implemented in
ovn-controller for populating local datapaths in binding.c, which is
similar to the purpose here. However, we should consider an
optimization here since the flood-fill cost would apply for every
port-binding. It can be optimized with a cache, which maps between
each datapath and its related router_dps with gw ports, to avoid
repeated traversing. (In ovn-controller it can be optimized, too, but
the gain is not as big as here since only local port-bindings are
checked in ovn-controller.)

> +
> +/* This function checks if the port binding 'sb' references
> + * a HA chassis group.
> + * Eg. Suppose a distributed logical router port - lr0-public
> + * uses an HA chassis group - hagrp1 and if hagrp1 has 3 ha
> + * chassis - gw1, gw2 and gw3.
> + * Or
> + * If the distributed logical router port - lr0-public has
> + * 3 gateway chassis - gw1, gw2 and gw3.
> + * ovn-northd creates ha chassis group - hagrp1 in SB DB
> + * and adds gw1, gw2 and gw3 to its ha_chassis list.
> + *
> + * If port binding 'sb' represents a logical switch port 'p1'
> + * and its logical switch is connected to the logical router
> + * 'lr0' directly or indirectly (i.e p1's logical switch is
> + *  connected to a router 'lr1' and 'lr1' has a path to lr0 via
> + *  transit logical switches) and 'sb' is claimed by chassis - 'c1' then
> + * this function adds c1 to the list of the reference chassis
> + *  - 'ref_chassis' of hagrp1.
> + */
> +static void
> +build_ha_chassis_group_ref_chassis(struct northd_context *ctx,
> +                                   const struct sbrec_port_binding *sb,
> +                                   struct ovn_port *op,
> +                                   struct hmap *ports,
> +                                   struct shash *ha_ref_chassis_map)
> +{
> +    struct ovn_datapath *router_dp =
> +        get_router_dp_with_gw_port(ports, op->od, NULL);
> +    if (!router_dp) {
> +        return;
> +    }
> +
> +    const struct sbrec_ha_chassis_group *sb_ha_chassis_grp = NULL;
> +    /* Get the HA Chassis group associated with the lrp. */
> +    struct nbrec_ha_chassis_group *nb_chassis_grp =
> +        router_dp->l3dgw_port->nbrp->ha_chassis_group;
> +
> +    if (!nb_chassis_grp) {
> +        /* Check if the legacy gateway chassis is used. */
> +        if (router_dp->l3redirect_port && router_dp->l3redirect_port->sb) {
> +            sb_ha_chassis_grp =
> +                router_dp->l3redirect_port->sb->ha_chassis_group;
> +        }
> +    } else {
> +        /* Get the HA chassis group row in the SB DB. */
> +        sb_ha_chassis_grp = ha_chassis_group_lookup_by_name(
> +            ctx->sbrec_ha_chassis_grp_by_name, nb_chassis_grp->name);
> +    }
> +
> +    if (sb_ha_chassis_grp) {
> +        struct ha_ref_chassis_info *ref_ch_info =
> +            shash_find_data(ha_ref_chassis_map, sb_ha_chassis_grp->name);
> +        ovs_assert(ref_ch_info);
> +        add_to_ha_ref_chassis_info(ref_ch_info, sb->chassis);
> +    }
> +}
> +
>  /* Handle changes to the 'chassis' column of the 'Port_Binding' table.  When
>   * this column is not empty, it means we need to set the corresponding logical
>   * port as 'up' in the northbound DB. */
>  static void
> -update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
> +handle_port_binding_changes(struct northd_context *ctx, struct hmap *ports,
> +                            struct shash *ha_ref_chassis_map)
>  {
>      const struct sbrec_port_binding *sb;
> +    bool build_ha_chassis_ref = false;
> +    if (ctx->ovnsb_txn) {
> +        const struct sbrec_ha_chassis_group *ha_ch_grp;
> +        SBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->ovnsb_idl) {
> +            struct ha_ref_chassis_info *ref_ch_info =
> +                xzalloc(sizeof *ref_ch_info);
> +            ref_ch_info->ha_chassis_group = ha_ch_grp;
> +            build_ha_chassis_ref = true;
> +            shash_add(ha_ref_chassis_map, ha_ch_grp->name, ref_ch_info);
> +        }
> +    }
>
>      SBREC_PORT_BINDING_FOR_EACH(sb, ctx->ovnsb_idl) {
>          struct ovn_port *op = ovn_port_find(ports, sb->logical_port);
> @@ -7328,6 +7628,13 @@ update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
>          if (!op->nbsp->up || *op->nbsp->up != up) {
>              nbrec_logical_switch_port_set_up(op->nbsp, &up, 1);
>          }
> +
> +        if (build_ha_chassis_ref && ctx->ovnsb_txn && sb->chassis) {
> +            /* Check and add the chassis which has claimed this 'sb'
> +             * to the ha chassis group's ref_chassis if required. */
> +            build_ha_chassis_group_ref_chassis(ctx, sb, op, ports,
> +                                               ha_ref_chassis_map);
> +        }
>      }
>  }
>
> @@ -7655,8 +7962,13 @@ ovnsb_db_run(struct northd_context *ctx, struct ovsdb_idl_loop *sb_loop,
>          return;
>      }
>
> -    update_logical_port_status(ctx, ports);
> +    struct shash ha_ref_chassis_map = SHASH_INITIALIZER(&ha_ref_chassis_map);
> +    handle_port_binding_changes(ctx, ports, &ha_ref_chassis_map);
>      update_northbound_cfg(ctx, sb_loop);
> +    if (ctx->ovnsb_txn) {
> +        update_sb_ha_group_ref_chassis(&ha_ref_chassis_map);
> +    }
> +    shash_destroy(&ha_ref_chassis_map);
>  }
>
>  static void
> @@ -7835,6 +8147,8 @@ main(int argc, char *argv[])
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_port_binding_col_chassis);
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl,
>                           &sbrec_port_binding_col_gateway_chassis);
> +    ovsdb_idl_add_column(ovnsb_idl_loop.idl,
> +                         &sbrec_port_binding_col_ha_chassis_group);
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl,
>                           &sbrec_gateway_chassis_col_chassis);
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_gateway_chassis_col_name);
> @@ -7899,9 +8213,30 @@ main(int argc, char *argv[])
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
>      ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_name);
>
> +    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_col_chassis);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_col_priority);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_col_external_ids);
> +
> +    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis_group);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_group_col_name);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_group_col_ha_chassis);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_group_col_external_ids);
> +    add_column_noalert(ovnsb_idl_loop.idl,
> +                       &sbrec_ha_chassis_group_col_ref_chassis);
> +
>      struct ovsdb_idl_index *sbrec_chassis_by_name
>          = chassis_index_create(ovnsb_idl_loop.idl);
>
> +    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name
> +        = ha_chassis_group_index_create(ovnsb_idl_loop.idl);
> +
>      /* Ensure that only a single ovn-northd is active in the deployment by
>       * acquiring a lock called "ovn_northd" on the southbound database
>       * and then only performing DB transactions if the lock is held. */
> @@ -7916,6 +8251,7 @@ main(int argc, char *argv[])
>              .ovnnb_txn = ovsdb_idl_loop_run(&ovnnb_idl_loop),
>              .ovnsb_idl = ovnsb_idl_loop.idl,
>              .ovnsb_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop),
> +            .sbrec_ha_chassis_grp_by_name = sbrec_ha_chassis_grp_by_name,
>          };
>
>          if (!had_lock && ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> index 10a59649a..48d27b960 100644
> --- a/ovn/ovn-nb.ovsschema
> +++ b/ovn/ovn-nb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>      "name": "OVN_Northbound",
> -    "version": "5.14.1",
> -    "cksum": "3758097843 20509",
> +    "version": "5.15.0",
> +    "cksum": "1078795414 21917",
>      "tables": {
>          "NB_Global": {
>              "columns": {
> @@ -271,6 +271,12 @@
>                                       "refType": "strong"},
>                               "min": 0,
>                               "max": "unlimited"}},
> +                "ha_chassis_group": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis_Group",
> +                                     "refType": "weak"},
> +                             "min": 0,
> +                             "max": 1}},
>                  "options": {
>                      "type": {"key": "string",
>                               "value": "string",
> @@ -392,5 +398,29 @@
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "indexes": [["name"]],
> -            "isRoot": false}}
> +            "isRoot": false},
> +        "HA_Chassis": {
> +            "columns": {
> +                "chassis_name": {"type": "string"},
> +                "priority": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 32767}}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "isRoot": false},
> +        "HA_Chassis_Group": {
> +            "columns": {
> +                "name": {"type": "string"},
> +                "ha_chassis": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis",
> +                                     "refType": "strong"},
> +                             "min": 0,
> +                             "max": "unlimited"}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "indexes": [["name"]],
> +            "isRoot": true}}
>      }
> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> index 18396507d..70d5a4689 100644
> --- a/ovn/ovn-nb.xml
> +++ b/ovn/ovn-nb.xml
> @@ -1522,6 +1522,12 @@
>      </column>
>
>      <column name="gateway_chassis">
> +      <p>
> +        This column is ignored if the column
> +        <ref column="ha_chassis_group" table="Logical_Router_Port"/>.
> +        is set.
> +      </p>
> +
>        <p>
>          If set, this indicates that this logical router port represents
>          a distributed gateway port that connects this router to a logical
> @@ -1549,6 +1555,24 @@
>        </p>
>      </column>
>
> +    <column name="ha_chassis_group">
> +      <p>
> +        If set, this indicates that this logical router port represents
> +        a distributed gateway port that connects this router to a logical
> +        switch with a localnet port.  There may be at most one such
> +        logical router port on each logical router. The HA chassis which
> +        are part of the HA chassis group will provide the gateway high
> +        availability. Please see the <ref table="HA_Chassis_Group"/> for
> +        more details.
> +      </p>
> +
> +      <p>
> +        When this column is set, the column
> +        <ref column="gateway_chassis" table="Logical_Router_Port"/> will
> +        be ignored.
> +      </p>
> +    </column>
> +
>      <column name="networks">
>        <p>
>          The IP addresses and netmasks of the router.  For example,
> @@ -2605,4 +2629,55 @@
>      </group>
>    </table>
>
> +  <table name="HA_Chassis_Group">
> +    <p>
> +      Table representing a group of chassis which can provide High availability
> +      services. Each chassis in the group is represented by the table
> +      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
> +      be the master of this group. If the master chassis failover is detected,
> +      the HA chassis with the next higher priority takes over the
> +      responsibility of providing the HA. If a distributed gateway router port
> +      references a row in this table, then the master HA chassis in this group
> +      provides the gateway functionality.
> +    </p>
> +
> +    <column name="name">
> +      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
> +    </column>
> +
> +    <column name="ha_chassis">
> +      A list of HA chassis which belongs to this group.
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="HA_Chassis">
> +    <column name="chassis_name">
> +      <p>
> +        Name of the chassis which is part of the HA chassis group.
> +        The value must match the
> +        <ref db="OVN_Southbound" table="Chassis" column="name"/> column
> +        of the <ref db="OVN_Southbound" table="Chassis"/> table in the
> +        <ref db="OVN_Southbound"/> database.
> +      </p>
> +    </column>
> +
> +    <column name="priority">
> +      <p>
> +        Priority of the chassis. Chassis with highest priority will be
> +        the master.
> +      </p>
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
>  </database>
> diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
> index cc8c771a7..e05f964b0 100644
> --- a/ovn/ovn-sb.ovsschema
> +++ b/ovn/ovn-sb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>      "name": "OVN_Southbound",
> -    "version": "2.1.0",
> -    "cksum": "3806083220 15332",
> +    "version": "2.2.0",
> +    "cksum": "2001312516 17220",
>      "tables": {
>          "SB_Global": {
>              "columns": {
> @@ -147,6 +147,12 @@
>                                       "refType": "strong"},
>                               "min": 0,
>                               "max": "unlimited"}},
> +                "ha_chassis_group": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis_Group",
> +                                     "refType": "weak"},
> +                             "min": 0,
> +                             "max": 1}},
>                  "options": {
>                       "type": {"key": "string",
>                                "value": "string",
> @@ -309,4 +315,35 @@
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "indexes": [["name"]],
> -            "isRoot": false}}}
> +            "isRoot": false},
> +        "HA_Chassis": {
> +            "columns": {
> +                "chassis": {"type": {"key": {"type": "uuid",
> +                                             "refTable": "Chassis",
> +                                             "refType": "weak"},
> +                                     "min": 0, "max": 1}},
> +                "priority": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 32767}}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "isRoot": false},
> +        "HA_Chassis_Group": {
> +            "columns": {
> +                "name": {"type": "string"},
> +                "ha_chassis": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis",
> +                                     "refType": "strong"},
> +                             "min": 0,
> +                             "max": "unlimited"}},
> +                "ref_chassis": {"type": {"key": {"type": "uuid",
> +                                                 "refTable": "Chassis",
> +                                                 "refType": "weak"},
> +                                         "min": 0, "max": "unlimited"}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "indexes": [["name"]],
> +            "isRoot": true}}}
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index 4e080abff..5c4a852a5 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -2246,6 +2246,16 @@ tcp.flags = RST;
>          </p>
>        </column>
>
> +      <column name="ha_chassis_group">
> +        <p>
> +          This should only be populated for ports with
> +          <ref column="type"/> set to <code>chassisredirect</code>.
> +          This column defines the HA chassis group with a list of
> +          HA chassis used as gateways where traffic will be redirected
> +          through.
> +        </p>
> +      </column>
> +
>        <column name="tunnel_key">
>          <p>
>            A number that represents the logical port in the key (e.g. STT key or
> @@ -3339,4 +3349,57 @@ tcp.flags = RST;
>        <column name="external_ids"/>
>      </group>
>    </table>
> +
> +  <table name="HA_Chassis">
> +    <column name="chassis">
> +      <p>
> +        The <ref table="Chassis"/> which provides the HA functionality.
> +        </p>
> +    </column>
> +
> +    <column name="priority">
> +      <p>
> +        Priority of the HA chassis. Chassis with highest priority will be
> +        the master in the HA chassis group.
> +      </p>
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
> +  <table name="HA_Chassis_Group">
> +    <p>
> +      Table representing a group of chassis which can provide High availability
> +      services. Each chassis in the group is represented by the table
> +      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
> +      be the master of this group. If the master chassis failover is detected,
> +      the HA chassis with the next higher priority takes over the
> +      responsibility of providing the HA. If <ref db="OVN_Southbound"
> +      table="Port_Binding" column="ha_chassis_group"/> column of the table
> +      <ref db="OVN_Southbound" table="Port_Binding"/> references this table,
> +      then this HA chassis group provides the gateway functionality and
> +      redirects the gateway traffic to the master of this group.
> +    </p>
> +    <column name="name">
> +      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
> +    </column>
> +
> +    <column name="ha_chassis">
> +      A list of <ref table="HA_Chassis"/> which belongs to this group.
> +    </column>
> +
> +    <column name="ref_chassis">
> +      A list of <ref table="chassis"/> which references this HA chassis group.
> +    </column>
> +
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
>  </database>
> diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
> index a7a9c2701..483602970 100644
> --- a/ovn/utilities/ovn-nbctl.8.xml
> +++ b/ovn/utilities/ovn-nbctl.8.xml
> @@ -908,6 +908,47 @@
>        </dd>
>      </dl>
>
> +    <h1> HA Chassis Group commands</h1>
> +
> +    <dl>
> +      <dt><code>ha-chassis-group-add</code> <var>group</var></dt>
> +      <dd>
> +        Creates a new HA chassis group in the <code>HA_Chassis_Group</code>
> +        table named <code>group</code>.
> +      </dd>
> +
> +      <dt><code>ha-chassis-group-del</code> <var>group</var></dt>
> +      <dd>
> +        Deletes the HA chassis group <code>group</code>. It is an error if
> +        <code>group</code> does not exist.
> +      </dd>
> +
> +      <dt><code>ha-chassis-group-list</code></dt>
> +      <dd>
> +        Lists the HA chassis group <code>group</code> along with the
> +        <code>HA chassis</code> if any associated with it.
> +      </dd>
> +
> +      <dt><code>ha-chassis-group-add-chassis</code> <var>group</var>
> +      <var>chassis</var> <var>priority</var></dt>
> +      <dd>
> +        Adds a new HA chassis <code>chassis</code> to the
> +        HA Chassis group <code>group</code> with the specified priority.
> +        If the <code>chassis</code> already exists, then the
> +        <code>priority</code> is updated.
> +        The <code>chassis</code> should be the name of the chassis in the
> +        <code>OVN_Southbound</code>.
> +      </dd>
> +
> +      <dt><code>ha-chassis-group-remove-chassis</code> <var>group</var>
> +      <var>chassis</var></dt>
> +      <dd>
> +        Removes the HA chassis <code>chassis</code> from the HA chassis
> +        group <code>group</code>. It is an error if <code>chassis</code> does
> +        not exist.
> +      </dd>
> +    </dl>
> +
>      <h1>Database Commands</h1>
>      <p>These commands query and modify the contents of <code>ovsdb</code> tables.
>      They are a slight abstraction of the <code>ovsdb</code> interface and
> diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
> index 8a72e9574..8f4e96d6e 100644
> --- a/ovn/utilities/ovn-nbctl.c
> +++ b/ovn/utilities/ovn-nbctl.c
> @@ -694,6 +694,15 @@ Port group commands:\n\
>    pg-set-ports PG PORTS       Set PORTS on port group PG\n\
>    pg-del PG                   Delete port group PG\n\
>  \n\
> +HA chassis group commands:\n\
> +  ha-chassis-group-add GRP  Create an HA chassis group GRP\n\
> +  ha-chassis-group-del GRP  Delete the HA chassis group GRP\n\
> +  ha-chassis-group-list     List the HA chassis groups\n\
> +  ha-chassis-group-add-chassis GRP CHASSIS [PRIORITY] Adds an HA\
> +chassis with optional PRIORITY to the HA chassis group GRP\n\
> +  ha-chassis-group-del-chassis GRP CHASSIS Deletes the HA chassis\
> +CHASSIS from the HA chassis group GRP\n\
> +\n\
>  %s\
>  %s\
>  \n\
> @@ -4756,6 +4765,201 @@ cmd_pg_del(struct ctl_context *ctx)
>      nbrec_port_group_delete(pg);
>  }
>
> +static const struct nbrec_ha_chassis_group*
> +ha_chassis_group_by_name_or_uuid(struct ctl_context *ctx, const char *id,
> +                                 bool must_exist)
> +{
> +    struct uuid ch_grp_uuid;
> +    const struct nbrec_ha_chassis_group *ha_ch_grp = NULL;
> +    bool is_uuid = uuid_from_string(&ch_grp_uuid, id);
> +    if (is_uuid) {
> +        ha_ch_grp = nbrec_ha_chassis_group_get_for_uuid(ctx->idl,
> +                                                        &ch_grp_uuid);
> +    }
> +
> +    if (!ha_ch_grp) {
> +        const struct nbrec_ha_chassis_group *iter;
> +        NBREC_HA_CHASSIS_GROUP_FOR_EACH (iter, ctx->idl) {
> +            if (!strcmp(iter->name, id)) {
> +                ha_ch_grp = iter;
> +                break;
> +            }
> +        }
> +    }
> +
> +    if (!ha_ch_grp && must_exist) {
> +        ctx->error = xasprintf("%s: ha_chassi_group %s not found",
> +                               id, is_uuid ? "UUID" : "name");
> +    }
> +
> +    return ha_ch_grp;
> +}
> +
> +static void
> +cmd_ha_ch_grp_add(struct ctl_context *ctx)
> +{
> +    const char *name = ctx->argv[1];
> +    struct nbrec_ha_chassis_group *ha_ch_grp =
> +        nbrec_ha_chassis_group_insert(ctx->txn);
> +    nbrec_ha_chassis_group_set_name(ha_ch_grp, name);
> +}
> +
> +static void
> +cmd_ha_ch_grp_del(struct ctl_context *ctx)
> +{
> +    const char *name_or_id = ctx->argv[1];
> +
> +    const struct nbrec_ha_chassis_group *ha_ch_grp =
> +        ha_chassis_group_by_name_or_uuid(ctx, name_or_id, true);
> +
> +    if (ha_ch_grp) {
> +        nbrec_ha_chassis_group_delete(ha_ch_grp);
> +    }
> +}
> +
> +static void
> +cmd_ha_ch_grp_list(struct ctl_context *ctx)
> +{
> +    const struct nbrec_ha_chassis_group *ha_ch_grp;
> +
> +    NBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->idl) {
> +        ds_put_format(&ctx->output, UUID_FMT " (%s)\n",
> +                      UUID_ARGS(&ha_ch_grp->header_.uuid), ha_ch_grp->name);
> +        const struct nbrec_ha_chassis *ha_ch;
> +        for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
> +            ha_ch = ha_ch_grp->ha_chassis[i];
> +            ds_put_format(&ctx->output,
> +                          "    "UUID_FMT " (%s)\n"
> +                          "    priority %lu\n\n",
> +                          UUID_ARGS(&ha_ch->header_.uuid), ha_ch->chassis_name,
> +                          ha_ch->priority);
> +        }
> +        ds_put_cstr(&ctx->output, "\n");
> +    }
> +}
> +
> +static void
> +cmd_ha_ch_grp_add_chassis(struct ctl_context *ctx)
> +{
> +    const struct nbrec_ha_chassis_group *ha_ch_grp =
> +        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
> +
> +    if (!ha_ch_grp) {
> +        return;
> +    }
> +
> +    const char *chassis_name = ctx->argv[2];
> +    int64_t priority;
> +    char *error = parse_priority(ctx->argv[3], &priority);
> +    if (error) {
> +        ctx->error = error;
> +        return;
> +    }
> +
> +    struct nbrec_ha_chassis *ha_chassis = NULL;
> +    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
> +        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
> +            ha_chassis = ha_ch_grp->ha_chassis[i];
> +            break;
> +        }
> +    }
> +
> +    if (ha_chassis) {
> +        nbrec_ha_chassis_set_priority(ha_chassis, priority);
> +        return;
> +    }
> +
> +    ha_chassis = nbrec_ha_chassis_insert(ctx->txn);
> +    nbrec_ha_chassis_set_chassis_name(ha_chassis, chassis_name);
> +    nbrec_ha_chassis_set_priority(ha_chassis, priority);
> +
> +    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
> +
> +    struct nbrec_ha_chassis **new_ha_chs =
> +        xmalloc(sizeof *new_ha_chs * (ha_ch_grp->n_ha_chassis + 1));
> +    nullable_memcpy(new_ha_chs, ha_ch_grp->ha_chassis,
> +                    sizeof *new_ha_chs * ha_ch_grp->n_ha_chassis);
> +    new_ha_chs[ha_ch_grp->n_ha_chassis] =
> +        CONST_CAST(struct nbrec_ha_chassis *, ha_chassis);
> +    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_chs,
> +                                          ha_ch_grp->n_ha_chassis + 1);
> +    free(new_ha_chs);
> +}
> +
> +static void
> +cmd_ha_ch_grp_remove_chassis(struct ctl_context *ctx)
> +{
> +    const struct nbrec_ha_chassis_group *ha_ch_grp =
> +        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
> +
> +    if (!ha_ch_grp) {
> +        return;
> +    }
> +
> +    const char *chassis_name = ctx->argv[2];
> +    struct nbrec_ha_chassis *ha_chassis = NULL;
> +    size_t idx = 0;
> +    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
> +        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
> +            ha_chassis = ha_ch_grp->ha_chassis[i];
> +            idx = i;
> +            break;
> +        }
> +    }
> +
> +    if (!ha_chassis) {
> +        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
> +                               "chassis group", chassis_name, ctx->argv[1]);
> +        return;
> +    }
> +
> +    struct nbrec_ha_chassis **new_ha_ch
> +        = xmemdup(ha_ch_grp->ha_chassis,
> +                  sizeof *new_ha_ch * ha_ch_grp->n_ha_chassis);
> +    new_ha_ch[idx] = new_ha_ch[ha_ch_grp->n_ha_chassis - 1];
> +    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
> +    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_ch,
> +                                          ha_ch_grp->n_ha_chassis - 1);
> +    free(new_ha_ch);
> +    nbrec_ha_chassis_delete(ha_chassis);
> +}
> +
> +static void
> +cmd_ha_ch_grp_set_chassis_prio(struct ctl_context *ctx)
> +{
> +    const struct nbrec_ha_chassis_group *ha_ch_grp =
> +        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
> +
> +    if (!ha_ch_grp) {
> +        return;
> +    }
> +
> +    int64_t priority;
> +    char *error = parse_priority(ctx->argv[3], &priority);
> +    if (error) {
> +        ctx->error = error;
> +        return;
> +    }
> +
> +    const char *chassis_name = ctx->argv[2];
> +    struct nbrec_ha_chassis *ha_chassis = NULL;
> +
> +    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
> +        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
> +            ha_chassis = ha_ch_grp->ha_chassis[i];
> +            break;
> +        }
> +    }
> +
> +    if (!ha_chassis) {
> +        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
> +                               "chassis group", chassis_name, ctx->argv[1]);
> +        return;
> +    }
> +
> +    nbrec_ha_chassis_set_priority(ha_chassis, priority);
> +}
> +
>  static const struct ctl_table_class tables[NBREC_N_TABLES] = {
>      [NBREC_TABLE_DHCP_OPTIONS].row_ids
>      = {{&nbrec_logical_switch_port_col_name, NULL,
> @@ -4790,6 +4994,9 @@ static const struct ctl_table_class tables[NBREC_N_TABLES] = {
>      = {&nbrec_port_group_col_name, NULL, NULL},
>
>      [NBREC_TABLE_ACL].row_ids[0] = {&nbrec_acl_col_name, NULL, NULL},
> +
> +    [NBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
> +    = {&nbrec_ha_chassis_group_col_name, NULL, NULL},
>  };
>
>  static char *
> @@ -5230,6 +5437,20 @@ static const struct ctl_command_syntax nbctl_commands[] = {
>      {"pg-set-ports", 2, INT_MAX, "", NULL, cmd_pg_set_ports, NULL, "", RW },
>      {"pg-del", 1, 1, "", NULL, cmd_pg_del, NULL, "", RW },
>
> +    /* HA chassis group commands. */
> +    {"ha-chassis-group-add", 1, 1, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_add, NULL, "", RW },
> +    {"ha-chassis-group-del", 1, 1, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_del, NULL, "", RW },
> +    {"ha-chassis-group-list", 0, 0, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_list, NULL, "", RO },
> +    {"ha-chassis-group-add-chassis", 3, 3, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_add_chassis, NULL, "", RW },
> +    {"ha-chassis-group-remove-chassis", 2, 2, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_remove_chassis, NULL, "", RW },
> +    {"ha-chassis-group-set-chassis-prio", 3, 3, "[CHASSIS GROUP]", NULL,
> +      cmd_ha_ch_grp_set_chassis_prio, NULL, "", RW },
> +
>      {NULL, 0, 0, NULL, NULL, NULL, NULL, "", RO},
>  };
>
> diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> index ee97a4710..c5ff9318f 100644
> --- a/ovn/utilities/ovn-sbctl.c
> +++ b/ovn/utilities/ovn-sbctl.c
> @@ -1189,6 +1189,12 @@ static const struct ctl_table_class tables[SBREC_N_TABLES] = {
>
>      [SBREC_TABLE_ADDRESS_SET].row_ids[0]
>      = {&sbrec_address_set_col_name, NULL, NULL},
> +
> +    [SBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
> +    = {&sbrec_ha_chassis_group_col_name, NULL, NULL},
> +
> +    [SBREC_TABLE_HA_CHASSIS].row_ids[0]
> +    = {&sbrec_ha_chassis_col_chassis, NULL, NULL},
>  };
>
>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 1878eb2df..b4e8995be 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -34,6 +34,33 @@ AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
>  AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_port="cr-alice" | grep $gwc2_uuid | wc -l], [0], [1
>  ])
>
> +# There should be one ha_chassis_group with the name "alice"
> +ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
> +ha_chassis_group name="alice"`
> +
> +AT_CHECK([test $ha_chassi_grp_name = alice])
> +
> +ha_chgrp_uuid=`ovn-sbctl --bare --columns _uuid find ha_chassis_group name=alice`
> +
> +AT_CHECK([ovn-sbctl --bare --columns ha_chassis_group find port_binding \
> +logical_port="cr-alice" | grep $ha_chgrp_uuid | wc -l], [0], [1
> +])
> +
> +ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
> +# Trim the spaces.
> +ha_ch=`echo $ha_ch | sed 's/ //g'`
> +
> +ha_ch_list=''
> +for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
> +do
> +    ha_ch_list="$ha_ch_list $i"
> +done
> +
> +# Trim the spaces.
> +ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
> +
> +AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
> +
>  # delete the 2nd Gateway_Chassis on NBDB for alice port
>
>  ovn-nbctl --wait=sb set Logical_Router_Port alice gateway_chassis=${nb_gwc1_uuid}
> @@ -45,6 +72,10 @@ AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
>
>  AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
>
> +# There should be only 1 row in ha_chassis SB DB table.
> +AT_CHECK([ovn-sbctl --bare --columns _uuid list ha_chassis | wc -l], [0], [1
> +])
> +
>  # delete all the gateway_chassis on NBDB for alice port
>
>  ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
> @@ -54,6 +85,12 @@ ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
>  AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw1], [0], [])
>  AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
>
> +# expect that the ha_chassis doesn't exist anymore
> +AT_CHECK([ovn-sbctl list ha_chassis | wc -l], [0], [0
> +])
> +AT_CHECK([ovn-sbctl list ha_chassis_group | wc -l], [0], [0
> +])
> +
>  AT_CLEANUP
>
>  AT_SETUP([ovn -- check Gateway_Chassis propagation from NBDB to SBDB backwards compatibility])
> @@ -301,3 +338,228 @@ as northd
>  OVS_APP_EXIT_AND_WAIT([ovn-northd])
>
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- check HA_Chassis_Group propagation from NBDB to SBDB])
> +AT_SKIP_IF([test $HAVE_PYTHON = no])
> +ovn_start
> +
> +ovn-nbctl --wait=sb ha-chassis-group-add hagrp1
> +
> +# ovn-northd should not create HA chassis group and HA chassis rows
> +# unless the HA chassis group in OVN NB DB is associated to
> +# a logical router port. ovn-northd still doesn't support
> +# associating a HA chassis group to a logical router port.
> +AT_CHECK([ovn-sbctl --bare --columns name find ha_chassis_group name="hagrp1" \
> +| wc -l], [0], [0
> +])
> +
> +ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch1 30
> +ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch2 20
> +ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch3 10
> +
> +# There should be no HA_Chassis rows in SB DB.
> +AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
> +| grep -v '-' | wc -l ], [0], [0
> +])
> +
> +# Add chassis ch1.
> +ovn-sbctl chassis-add ch1 geneve 127.0.0.2
> +
> +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl list chassis | grep ch1 | wc -l`])
> +
> +# There should be no HA_Chassis rows
> +AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
> +| grep -v '-' | wc -l ], [0], [0
> +])
> +
> +ovn-nbctl ha-chassis-group-del hagrp1
> +OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
> +OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
> +
> +# Create a logical router port and attach Gateway chassis.
> +# ovn-northd should create HA chassis group rows in SB DB.
> +ovn-nbctl lr-add lr0
> +
> +ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
> +ovn-nbctl lrp-set-gateway-chassis lr0-public ch1 20
> +
> +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns name find ha_chassis_group \
> +name="lr0-public" | wc -l`])
> +
> +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns _uuid \
> +find ha_chassis | wc -l`])
> +
> +ovn-nbctl lrp-set-gateway-chassis lr0-public ch2 10
> +
> +ovn-sbctl --bare --columns _uuid find ha_chassis
> +OVS_WAIT_UNTIL([test 2 = `ovn-sbctl list ha_chassis | grep chassis | wc -l`])
> +
> +# Test if 'ref_chassis' column is properly set or not in
> +# SB DB ha_chassis_group.
> +ovn-nbctl ls-add sw0
> +ovn-nbctl lsp-add sw0 sw0-p1
> +
> +ovn-sbctl chassis-add ch2 geneve 127.0.0.3
> +ovn-sbctl chassis-add ch3 geneve 127.0.0.4
> +ovn-sbctl chassis-add comp1 geneve 127.0.0.5
> +ovn-sbctl chassis-add comp2 geneve 127.0.0.6
> +
> +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:20:20:12:14 10.0.0.1/24
> +ovn-nbctl lsp-add sw0 sw0-lr0
> +ovn-nbctl lsp-set-type sw0-lr0 router
> +ovn-nbctl lsp-set-addresses sw0-lr0 router
> +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
> +
> +ovn-sbctl lsp-bind sw0-p1 comp1
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xup])
> +
> +comp1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
> +comp2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp2"`
> +ch2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
> +
> +echo "comp1_ch_uuid = $comp1_ch_uuid"
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp1_ch_uuid" = "$ref_ch_list"])
> +
> +# unbind sw0-p1
> +ovn-sbctl lsp-unbind sw0-p1
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xdown])
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "" = "$ref_ch_list"])
> +
> +# Bind sw0-p1 in comp2
> +ovn-sbctl lsp-bind sw0-p1 comp2
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp2_ch_uuid" = "$ref_ch_list"])
> +
> +ovn-nbctl ls-add sw1
> +ovn-nbctl lsp-add sw1 sw1-p1
> +ovn-nbctl lr-add lr1
> +ovn-nbctl lrp-add lr1 lr1-sw1 00:00:20:20:12:15 20.0.0.1/24
> +ovn-nbctl lsp-add sw1 sw1-lr1
> +ovn-nbctl lsp-set-type sw1-lr1 router
> +ovn-nbctl lsp-set-addresses sw1-lr1 router
> +ovn-nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1
> +
> +# Bind sw1-p1 in comp1.
> +ovn-sbctl lsp-bind sw1-p1 comp1
> +# Wait until sw1-p1 is up
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xup])
> +
> +# sw1-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp2_ch_uuid" = "$ref_ch_list"])
> +
> +# Now attach sw0 to lr1
> +ovn-nbctl lrp-add lr1 lr1-sw0 00:00:20:20:12:16 10.0.0.10/24
> +ovn-nbctl lsp-add sw0 sw0-lr1
> +ovn-nbctl lsp-set-type sw0-lr1 router
> +ovn-nbctl lsp-set-addresses sw0-lr1 router
> +ovn-nbctl lsp-set-options sw0-lr1 router-port=lr1-sw0
> +
> +# Both comp1 and comp2 should be in 'ref_chassis' as sw1 is indirectly
> +# connected to lr0
> +exp_ref_ch_list=''
> +for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
> +do
> +    if test $i = $comp1_ch_uuid; then
> +        exp_ref_ch_list="${exp_ref_ch_list}$i"
> +    elif test $i = $comp2_ch_uuid; then
> +        exp_ref_ch_list="${exp_ref_ch_list}$i"
> +    fi
> +done
> +
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
> +# Unind sw1-p1. comp2 should not be in the ref_chassis.
> +ovn-sbctl lsp-unbind sw1-p1
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xdown])
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp2_ch_uuid" = "$ref_ch_list"])
> +
> +# Create sw2 and attach it to lr2
> +ovn-nbctl ls-add sw2
> +ovn-nbctl lsp-add sw2 sw2-p1
> +ovn-nbctl lr-add lr2
> +ovn-nbctl lrp-add lr2 lr2-sw2 00:00:20:20:12:17 30.0.0.1/24
> +ovn-nbctl lsp-add sw2 sw2-lr2
> +ovn-nbctl lsp-set-type sw2-lr2 router
> +ovn-nbctl lsp-set-addresses sw2-lr2 router
> +ovn-nbctl lsp-set-options sw2-lr2 router-port=lr2-sw2
> +
> +# Bind sw2-p1 to comp1
> +ovn-sbctl lsp-bind sw2-p1 comp1
> +# Wait until sw2-p1 is up
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw2-p1` = xup])
> +
> +# sw2-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp2_ch_uuid" = "$ref_ch_list"])
> +
> +# Now attach sw1 to lr2. With this sw2-p1 is indirectly connected to lr0.
> +ovn-nbctl lrp-add lr2 lr2-sw1 00:00:20:20:12:18 20.0.0.10/24
> +ovn-nbctl lsp-add sw1 sw1-lr2
> +ovn-nbctl lsp-set-type sw1-lr2 router
> +ovn-nbctl lsp-set-addresses sw1-lr2 router
> +ovn-nbctl lsp-set-options sw1-lr2 router-port=lr2-sw1
> +
> +# sw2-p1 is indirectly connected to lr0. So comp1 (and comp2) should be in
> +# 'ref_chassis'
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
> +# Create sw0-p2 and bind it to comp1
> +ovn-nbctl lsp-add sw0 sw0-p2
> +ovn-sbctl lsp-bind sw0-p2 comp1
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xup])
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
> +# unbind sw0-p2
> +ovn-sbctl lsp-unbind sw0-p2
> +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xdown])
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
> +# Delete lr1-sw0. comp1 should be deleted from ref_chassis as there is no link
> +# from sw1 and sw2 to lr0.
> +ovn-nbctl lrp-del lr1-sw0
> +
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$comp2_ch_uuid" = "$ref_ch_list"])
> +
> +AT_CLEANUP
> diff --git a/tests/ovn.at b/tests/ovn.at
> index ec79651bd..dedbadd1b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8386,6 +8386,16 @@ as ext1 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
>
>  AT_CHECK([ovn-nbctl --timeout=3 --wait=sb sync], [0], [ignore])
>
> +# hv1 should be in 'ref_chassis' of the ha_chasssi_group as logical
> +# switch 'foo' can reach the router 'R1' (which has gw router port)
> +# via foo1 -> foo -> R0 -> join -> R1
> +hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$hv1_ch_uuid" = "$ref_ch_list"])
> +
>  # Allow some time for ovn-northd and ovn-controller to catch up.
>  # XXX This should be more systematic.
>  sleep 2
> @@ -9754,6 +9764,32 @@ echo "------ Port_Binding chassisredirect -------"
>  ovn-sbctl find Port_Binding type=chassisredirect
>  echo "-------------------------------------------"
>
> +# There should be one ha_chassis_group with the name "outside"
> +ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
> +ha_chassis_group name="outside"`
> +
> +AT_CHECK([test $ha_chassi_grp_name = outside])
> +
> +# There should be 2 ha_chassis rows in SB DB.
> +AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
> +| grep '-' | wc -l ], [0], [2
> +])
> +
> +ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
> +# Trim the spaces.
> +ha_ch=`echo $ha_ch | sed 's/ //g'`
> +
> +ha_ch_list=''
> +for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
> +do
> +    ha_ch_list="$ha_ch_list $i"
> +done
> +
> +# Trim the spaces.
> +ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
> +
> +AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
> +
>  for chassis in gw1 gw2 hv1 hv2; do
>      as $chassis
>      echo "------ $chassis dump ----------"
> @@ -9798,33 +9834,56 @@ as hv2 ovs-ofctl dump-flows br-int table=32
>  gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
>
> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport | wc -l], [0], [1
> +OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
> +grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport \
> +| wc -l], [0], [1
>  ])
>
> -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport | wc -l], [0], [1
> +OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
> +grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport \
> +| wc -l], [0], [1
>  ])
>
> -sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
> -
>  # make sure that flows for handling the outside router port reside on gw1
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +
> +OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # make sure ARP responder flows for outside router port reside on gw1 too
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
> +OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=9 | \
> +grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
> +OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
>  ]])
>
> -
> -
>  # check that the chassis redirect port has been claimed by the gw1 chassis
> -AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
> -         [0],[[1
> +OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
> +logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
>  ]])
>
> +hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
> +hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"`
> +
> +exp_ref_ch_list=''
> +for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
> +do
> +    if test $i = $hv1_ch_uuid; then
> +        exp_ref_ch_list="${exp_ref_ch_list}$i"
> +    elif test $i = $hv2_ch_uuid; then
> +        exp_ref_ch_list="${exp_ref_ch_list}$i"
> +    fi
> +done
> +
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
>
>  # at this point, we invert the priority of the gw chassis between gw1 and gw2
>
> @@ -9839,15 +9898,19 @@ ovn-nbctl --id=@gc0 create Gateway_Chassis \
>  ovn-nbctl --wait=hv --timeout=3 sync
>
>  # we make sure that the hypervisors noticed, and inverted the slave ports
> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport | wc -l], [0], [1
> +OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
> +grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport \
> +| wc -l], [0], [1
>  ])
>
> -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport | wc -l], [0], [1
> +OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
> +grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport \
> +| wc -l], [0], [1
>  ])
>
>  # check that the chassis redirect port has been reclaimed by the gw2 chassis
> -AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw2_chassis | wc -l],
> -         [0],[[1
> +OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
> +logical_port=cr-outside | grep $gw2_chassis | wc -l], [0],[[1
>  ]])
>
>  # check BFD enablement on tunnel ports from gw1 #########
> @@ -9896,31 +9959,32 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
>           [[
>  ]])
>
> -sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
> -
>  # make sure that flows for handling the outside router port reside on gw2 now
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # disconnect GW2 from the network, GW1 should take over
>  as gw2
>  port=${sandbox}_br-phys
>  as main ovs-vsctl del-port n1 $port
> -sleep 4
>
>  bfd_dump
>
>  # make sure that flows for handling the outside router port reside on gw2 now
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
> +grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # check that the chassis redirect port has been reclaimed by the gw1 chassis
> -AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
> -         [0],[[1
> +OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
> +logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
>  ]])
>
>  ovn-nbctl --wait=hv set NB_Global . options:"bfd-min-rx"=2000
> @@ -9950,6 +10014,37 @@ for chassis in gw1 hv1 hv2; do
>  ])
>  done
>
> +# Delete the inside1 vif. The ref_chassis in ha_chassis_group shouldn't have
> +# reference to hv1.
> +as hv1 ovs-vsctl del-port hv1-vif1
> +
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     test "$hv2_ch_uuid" = "$ref_ch_list"])
> +
> +# Delete the inside2 vif.
> +ovn-sbctl show
> +
> +echo "Deleting hv2-vif1"
> +as hv2 ovs-vsctl del-port hv2-vif1
> +
> +# ref_chassis of ha_chassis_group should be empty
> +OVS_WAIT_UNTIL(
> +    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
> +     # Trim the spaces.
> +     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
> +     exp_ref_ch_list=""
> +     test "$exp_ref_ch_list" = "$ref_ch_list"])
> +
> +# Delete the Gateway_Chassis for lrp - outside
> +ovn-nbctl clear Logical_Router_Port outside gateway_chassis
> +
> +# There shoud be no ha_chassis_group rows in SB DB.
> +OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
> +OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
> +
>  OVN_CLEANUP([gw1],[gw2],[hv1],[hv2])
>
>  AT_CLEANUP
> @@ -10186,10 +10281,7 @@ ovn-nbctl --wait=hv --timeout=3 sync
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
>  ovn-sbctl destroy Chassis $gw2_chassis
>
> -# Ensure ovn-controller has processed latest sbdb update
> -# ovn-nbctl --wait=hv sync
> -
> -AT_CHECK([grep "Releasing lport" gw1/ovn-controller.log], [1], [])
> +OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
>
>  OVN_CLEANUP([gw1],[gw2],[hv1])
>
> --
> 2.20.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Han Zhou March 5, 2019, 6:03 p.m. | #3
> > +
> > +static void
> > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> > +                           const struct sbrec_chassis *chassis)
> > +{
> > +    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> > +        if (ref_ch_info->ref_chassis[j] == chassis) {
> > +           return;
> > +        }
> > +    }
> > +
> > +    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> > +                                        sizeof *ref_ch_info->ref_chassis *
> > +                                        (++ref_ch_info->n_ref_chassis));
>
> This may be inefficient, considering the amount of ref chassises to be
> added for each HA group. It is better to xrealloc for original_size *
> 2 every time and expand only when more space is needed.
>
> > +    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> > +        CONST_CAST(struct sbrec_chassis *, chassis);
> > +}
> > +
> > +static void
> > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> > +{
> > +    struct shash_node *node, *next;
> > +    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> > +        struct ha_ref_chassis_info *ha_ref_info = node->data;
> > +        sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> > +                                               ha_ref_info->ref_chassis,
> > +                                               ha_ref_info->n_ref_chassis);
> > +        free(ha_ref_info->ref_chassis);
> > +        free(ha_ref_info);
> > +        shash_delete(ha_ref_chassis_map, node);
> > +    }
> > +}
> > +
> > +/* This function returns logical router datapath (with a distributed
> > + * gateway port) to which 'od' is connected to - either directly
> > + * or indirectly (via transit logical switches).
> > + * Returns NULL if no logical router with gw port found.
> > + */
> > +static struct ovn_datapath *
> > +get_router_dp_with_gw_port(struct hmap *ports,
> > +                           struct ovn_datapath *od,
> > +                           struct ovn_datapath *peer_od)
> > +{
> > +    if (!od) {
> > +        return NULL;
> > +    }
> > +
> > +    if (od->nbs) {
> > +        /* It's a logical switch datapath. */
> > +        if (peer_od) {
> > +            /* If peer datapath is not logical router, then
> > +             * something is wrong. */
> > +            ovs_assert(peer_od->nbr);
> > +        }
> > +
> > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> > +            if (!od->router_ports[i]->peer) {
> > +                /* If there is no peer port connecting to the
> > +                 * router datapath, ignore it. */
> > +                continue;
> > +            }
> > +
> > +            struct ovn_datapath *router_dp = od->router_ports[i]->peer->od;
> > +            if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp) {
> > +                /* Router datapath has a distributed gateway router port. */
> > +                return router_dp;
>
> I think we can't return when just one router_dp is found. There can be
> more than one connected router that has gateway router ports. So the
> return value of this function should be a set.
>
> > +            }
> > +        }
> > +
> > +        /* The logical switch datapath is not connected to any
> > +         * logical router with a distributed gateway port. Check if it
> > +         * is indirectly connected to a logical router with a gw port. */
> > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> > +            if (!od->router_ports[i]->peer) {
> > +                continue;
> > +            }
> > +
> > +            struct ovn_datapath *router_dp =
> > +                od->router_ports[i]->peer->od;
> > +
> > +            /* If we don't check this, we will be in an infinite loop. */
> > +            if (router_dp != peer_od) {
>
> peer_od should also be a set, it should skip checking any datapath
> that has already been checked. Consider the case LR1 is connected with
> LS1, LS2 and LS3.
>
> > +                router_dp = get_router_dp_with_gw_port(ports, router_dp,
> > +                                                       od);
> > +                if (router_dp) {
> > +                    /* Found a logical router with gw port indirectly connected
> > +                     * to 'od'. */
> > +                    return router_dp;
> > +                }
> > +            }
> > +        }
> > +    } else if (od->nbr) {
> > +        /* It's a logical router datapath. */
> > +        if (peer_od) {
> > +            /* If peer datapath is not logical switch, then
> > +             * something is wrong. */
> > +            ovs_assert(peer_od->nbs);
>
> A router port can be peered with another router port directly, so this
> assert is not true.
>
> > +        }
> > +
> > +        /* Check if this logical router datapath is indirectly connected
> > +         * to another logical router via a transit logical switch(es). */
> > +        for (size_t i = 0; i < od->nbr->n_ports; i++) {
> > +            struct ovn_port *router_port =
> > +                ovn_port_find(ports, od->nbr->ports[i]->name);
> > +
> > +            if (!router_port || !router_port->peer) {
> > +                continue;
> > +            }
> > +            /* If we don't check this, we will be in an infinite loop. */
> > +            if (router_port->peer->od != peer_od) {
> > +                struct ovn_datapath *router_dp;
> > +                /* router_port->peer->od points a logical switch datapath. */
> > +                router_dp = get_router_dp_with_gw_port(ports,
> > +                                                       router_port->peer->od,
> > +                                                       od);
> > +                if (router_dp) {
> > +                    /* Found a logical router with gw port indirectly connected
> > +                    * to 'od'. */
> > +                    return router_dp;
> > +                }
> > +            }
> > +        }
> > +    }
> > +
> > +    return NULL;
> > +}
>
> In general, it may refer to the flood-fill approach implemented in
> ovn-controller for populating local datapaths in binding.c, which is
> similar to the purpose here. However, we should consider an
> optimization here since the flood-fill cost would apply for every
> port-binding. It can be optimized with a cache, which maps between
> each datapath and its related router_dps with gw ports, to avoid
> repeated traversing. (In ovn-controller it can be optimized, too, but
> the gain is not as big as here since only local port-bindings are
> checked in ovn-controller.)
>
I forgot to mention, this implementation is for finding out the
required BFD sessions, however, it can be reused for a more generic
purpose - to figure out whether a tunnel is needed between each pair
of chassises. There happens to be a related discussion ongoing here:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-March/048281.html

Thanks,
Han
Numan Siddique March 7, 2019, 4:37 p.m. | #4
On Tue, Mar 5, 2019 at 11:34 PM Han Zhou <zhouhan@gmail.com> wrote:

> > > +
> > > +static void
> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> > > +                           const struct sbrec_chassis *chassis)
> > > +{
> > > +    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> > > +        if (ref_ch_info->ref_chassis[j] == chassis) {
> > > +           return;
> > > +        }
> > > +    }
> > > +
> > > +    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> > > +                                        sizeof
> *ref_ch_info->ref_chassis *
> > > +
> (++ref_ch_info->n_ref_chassis));
> >
> > This may be inefficient, considering the amount of ref chassises to be
> > added for each HA group. It is better to xrealloc for original_size *
> > 2 every time and expand only when more space is needed.
> >
> > > +    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> > > +        CONST_CAST(struct sbrec_chassis *, chassis);
> > > +}
> > > +
> > > +static void
> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> > > +{
> > > +    struct shash_node *node, *next;
> > > +    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> > > +        struct ha_ref_chassis_info *ha_ref_info = node->data;
> > > +
> sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> > > +
>  ha_ref_info->ref_chassis,
> > > +
>  ha_ref_info->n_ref_chassis);
> > > +        free(ha_ref_info->ref_chassis);
> > > +        free(ha_ref_info);
> > > +        shash_delete(ha_ref_chassis_map, node);
> > > +    }
> > > +}
> > > +
> > > +/* This function returns logical router datapath (with a distributed
> > > + * gateway port) to which 'od' is connected to - either directly
> > > + * or indirectly (via transit logical switches).
> > > + * Returns NULL if no logical router with gw port found.
> > > + */
> > > +static struct ovn_datapath *
> > > +get_router_dp_with_gw_port(struct hmap *ports,
> > > +                           struct ovn_datapath *od,
> > > +                           struct ovn_datapath *peer_od)
> > > +{
> > > +    if (!od) {
> > > +        return NULL;
> > > +    }
> > > +
> > > +    if (od->nbs) {
> > > +        /* It's a logical switch datapath. */
> > > +        if (peer_od) {
> > > +            /* If peer datapath is not logical router, then
> > > +             * something is wrong. */
> > > +            ovs_assert(peer_od->nbr);
> > > +        }
> > > +
> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> > > +            if (!od->router_ports[i]->peer) {
> > > +                /* If there is no peer port connecting to the
> > > +                 * router datapath, ignore it. */
> > > +                continue;
> > > +            }
> > > +
> > > +            struct ovn_datapath *router_dp =
> od->router_ports[i]->peer->od;
> > > +            if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp)
> {
> > > +                /* Router datapath has a distributed gateway router
> port. */
> > > +                return router_dp;
> >
> > I think we can't return when just one router_dp is found. There can be
> > more than one connected router that has gateway router ports. So the
> > return value of this function should be a set.
> >
> > > +            }
> > > +        }
> > > +
> > > +        /* The logical switch datapath is not connected to any
> > > +         * logical router with a distributed gateway port. Check if it
> > > +         * is indirectly connected to a logical router with a gw
> port. */
> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> > > +            if (!od->router_ports[i]->peer) {
> > > +                continue;
> > > +            }
> > > +
> > > +            struct ovn_datapath *router_dp =
> > > +                od->router_ports[i]->peer->od;
> > > +
> > > +            /* If we don't check this, we will be in an infinite
> loop. */
> > > +            if (router_dp != peer_od) {
> >
> > peer_od should also be a set, it should skip checking any datapath
> > that has already been checked. Consider the case LR1 is connected with
> > LS1, LS2 and LS3.
> >
> > > +                router_dp = get_router_dp_with_gw_port(ports,
> router_dp,
> > > +                                                       od);
> > > +                if (router_dp) {
> > > +                    /* Found a logical router with gw port indirectly
> connected
> > > +                     * to 'od'. */
> > > +                    return router_dp;
> > > +                }
> > > +            }
> > > +        }
> > > +    } else if (od->nbr) {
> > > +        /* It's a logical router datapath. */
> > > +        if (peer_od) {
> > > +            /* If peer datapath is not logical switch, then
> > > +             * something is wrong. */
> > > +            ovs_assert(peer_od->nbs);
> >
> > A router port can be peered with another router port directly, so this
> > assert is not true.
> >
> > > +        }
> > > +
> > > +        /* Check if this logical router datapath is indirectly
> connected
> > > +         * to another logical router via a transit logical
> switch(es). */
> > > +        for (size_t i = 0; i < od->nbr->n_ports; i++) {
> > > +            struct ovn_port *router_port =
> > > +                ovn_port_find(ports, od->nbr->ports[i]->name);
> > > +
> > > +            if (!router_port || !router_port->peer) {
> > > +                continue;
> > > +            }
> > > +            /* If we don't check this, we will be in an infinite
> loop. */
> > > +            if (router_port->peer->od != peer_od) {
> > > +                struct ovn_datapath *router_dp;
> > > +                /* router_port->peer->od points a logical switch
> datapath. */
> > > +                router_dp = get_router_dp_with_gw_port(ports,
> > > +
>  router_port->peer->od,
> > > +                                                       od);
> > > +                if (router_dp) {
> > > +                    /* Found a logical router with gw port indirectly
> connected
> > > +                    * to 'od'. */
> > > +                    return router_dp;
> > > +                }
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return NULL;
> > > +}
> >
> > In general, it may refer to the flood-fill approach implemented in
> > ovn-controller for populating local datapaths in binding.c, which is
> > similar to the purpose here. However, we should consider an
> > optimization here since the flood-fill cost would apply for every
> > port-binding. It can be optimized with a cache, which maps between
> > each datapath and its related router_dps with gw ports, to avoid
> > repeated traversing. (In ovn-controller it can be optimized, too, but
> > the gain is not as big as here since only local port-bindings are
> > checked in ovn-controller.)
> >
>

Thanks for the comments Han. I will work on them.
I haven't thought of an optimization yet. I will try to explore based
on your suggestions. At the same time, I personally think the code
shouldn't get complicated because of the optimization. (I feel the present
code in ovn-controller [1] is a bit complicated -
[1] -
https://github.com/openvswitch/ovs/blob/master/ovn/controller/bfd.c#L98
)

I think it is uncommon to have indirect logical router connections which
eventually
connect to a router with a gateway router port.
May be I am wrong. Do you think it's common to have such setups ?


I forgot to mention, this implementation is for finding out the
> required BFD sessions, however, it can be reused for a more generic
> purpose - to figure out whether a tunnel is needed between each pair
> of chassises. There happens to be a related discussion ongoing here:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-March/048281.html
>
>
I have seen thread. I will try to make it generic.
If not I think we can revisit the code later and make it generic and more
optimized.
As a first step I think it should be reasonable to aim to have the solution
for the HA_Chassis_Group's ref_chassis calculation.
Does this sound good ?

Thanks
Numan

Thanks,
> Han
>
Han Zhou March 7, 2019, 5:15 p.m. | #5
On Thu, Mar 7, 2019 at 8:37 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Tue, Mar 5, 2019 at 11:34 PM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> > > +
>> > > +static void
>> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
>> > > +                           const struct sbrec_chassis *chassis)
>> > > +{
>> > > +    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
>> > > +        if (ref_ch_info->ref_chassis[j] == chassis) {
>> > > +           return;
>> > > +        }
>> > > +    }
>> > > +
>> > > +    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
>> > > +                                        sizeof *ref_ch_info->ref_chassis *
>> > > +                                        (++ref_ch_info->n_ref_chassis));
>> >
>> > This may be inefficient, considering the amount of ref chassises to be
>> > added for each HA group. It is better to xrealloc for original_size *
>> > 2 every time and expand only when more space is needed.
>> >
>> > > +    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
>> > > +        CONST_CAST(struct sbrec_chassis *, chassis);
>> > > +}
>> > > +
>> > > +static void
>> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
>> > > +{
>> > > +    struct shash_node *node, *next;
>> > > +    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
>> > > +        struct ha_ref_chassis_info *ha_ref_info = node->data;
>> > > +        sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
>> > > +                                               ha_ref_info->ref_chassis,
>> > > +                                               ha_ref_info->n_ref_chassis);
>> > > +        free(ha_ref_info->ref_chassis);
>> > > +        free(ha_ref_info);
>> > > +        shash_delete(ha_ref_chassis_map, node);
>> > > +    }
>> > > +}
>> > > +
>> > > +/* This function returns logical router datapath (with a distributed
>> > > + * gateway port) to which 'od' is connected to - either directly
>> > > + * or indirectly (via transit logical switches).
>> > > + * Returns NULL if no logical router with gw port found.
>> > > + */
>> > > +static struct ovn_datapath *
>> > > +get_router_dp_with_gw_port(struct hmap *ports,
>> > > +                           struct ovn_datapath *od,
>> > > +                           struct ovn_datapath *peer_od)
>> > > +{
>> > > +    if (!od) {
>> > > +        return NULL;
>> > > +    }
>> > > +
>> > > +    if (od->nbs) {
>> > > +        /* It's a logical switch datapath. */
>> > > +        if (peer_od) {
>> > > +            /* If peer datapath is not logical router, then
>> > > +             * something is wrong. */
>> > > +            ovs_assert(peer_od->nbr);
>> > > +        }
>> > > +
>> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
>> > > +            if (!od->router_ports[i]->peer) {
>> > > +                /* If there is no peer port connecting to the
>> > > +                 * router datapath, ignore it. */
>> > > +                continue;
>> > > +            }
>> > > +
>> > > +            struct ovn_datapath *router_dp = od->router_ports[i]->peer->od;
>> > > +            if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp) {
>> > > +                /* Router datapath has a distributed gateway router port. */
>> > > +                return router_dp;
>> >
>> > I think we can't return when just one router_dp is found. There can be
>> > more than one connected router that has gateway router ports. So the
>> > return value of this function should be a set.
>> >
>> > > +            }
>> > > +        }
>> > > +
>> > > +        /* The logical switch datapath is not connected to any
>> > > +         * logical router with a distributed gateway port. Check if it
>> > > +         * is indirectly connected to a logical router with a gw port. */
>> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
>> > > +            if (!od->router_ports[i]->peer) {
>> > > +                continue;
>> > > +            }
>> > > +
>> > > +            struct ovn_datapath *router_dp =
>> > > +                od->router_ports[i]->peer->od;
>> > > +
>> > > +            /* If we don't check this, we will be in an infinite loop. */
>> > > +            if (router_dp != peer_od) {
>> >
>> > peer_od should also be a set, it should skip checking any datapath
>> > that has already been checked. Consider the case LR1 is connected with
>> > LS1, LS2 and LS3.
>> >
>> > > +                router_dp = get_router_dp_with_gw_port(ports, router_dp,
>> > > +                                                       od);
>> > > +                if (router_dp) {
>> > > +                    /* Found a logical router with gw port indirectly connected
>> > > +                     * to 'od'. */
>> > > +                    return router_dp;
>> > > +                }
>> > > +            }
>> > > +        }
>> > > +    } else if (od->nbr) {
>> > > +        /* It's a logical router datapath. */
>> > > +        if (peer_od) {
>> > > +            /* If peer datapath is not logical switch, then
>> > > +             * something is wrong. */
>> > > +            ovs_assert(peer_od->nbs);
>> >
>> > A router port can be peered with another router port directly, so this
>> > assert is not true.
>> >
>> > > +        }
>> > > +
>> > > +        /* Check if this logical router datapath is indirectly connected
>> > > +         * to another logical router via a transit logical switch(es). */
>> > > +        for (size_t i = 0; i < od->nbr->n_ports; i++) {
>> > > +            struct ovn_port *router_port =
>> > > +                ovn_port_find(ports, od->nbr->ports[i]->name);
>> > > +
>> > > +            if (!router_port || !router_port->peer) {
>> > > +                continue;
>> > > +            }
>> > > +            /* If we don't check this, we will be in an infinite loop. */
>> > > +            if (router_port->peer->od != peer_od) {
>> > > +                struct ovn_datapath *router_dp;
>> > > +                /* router_port->peer->od points a logical switch datapath. */
>> > > +                router_dp = get_router_dp_with_gw_port(ports,
>> > > +                                                       router_port->peer->od,
>> > > +                                                       od);
>> > > +                if (router_dp) {
>> > > +                    /* Found a logical router with gw port indirectly connected
>> > > +                    * to 'od'. */
>> > > +                    return router_dp;
>> > > +                }
>> > > +            }
>> > > +        }
>> > > +    }
>> > > +
>> > > +    return NULL;
>> > > +}
>> >
>> > In general, it may refer to the flood-fill approach implemented in
>> > ovn-controller for populating local datapaths in binding.c, which is
>> > similar to the purpose here. However, we should consider an
>> > optimization here since the flood-fill cost would apply for every
>> > port-binding. It can be optimized with a cache, which maps between
>> > each datapath and its related router_dps with gw ports, to avoid
>> > repeated traversing. (In ovn-controller it can be optimized, too, but
>> > the gain is not as big as here since only local port-bindings are
>> > checked in ovn-controller.)
>> >
>
>
> Thanks for the comments Han. I will work on them.
> I haven't thought of an optimization yet. I will try to explore based
> on your suggestions. At the same time, I personally think the code
> shouldn't get complicated because of the optimization. (I feel the present
> code in ovn-controller [1] is a bit complicated -
> [1] - https://github.com/openvswitch/ovs/blob/master/ovn/controller/bfd.c#L98
> )

Sorry, I should point out the location more clearly. I was talking
about this code:
https://github.com/openvswitch/ovs/blob/master/ovn/controller/binding.c#L113
It is about the algorithm that calculates related datapaths. I think
we need similar logic in northd, but since northd handles all port
bindings, it is worth to optimize with a cache to avoid repeat this
calculation for every port-binding.

>
> I think it is uncommon to have indirect logical router connections which eventually
> connect to a router with a gateway router port.
> May be I am wrong. Do you think it's common to have such setups ?
>

It may be uncommon, but I think it is important to make it correct in
these situations.

>
>> I forgot to mention, this implementation is for finding out the
>> required BFD sessions, however, it can be reused for a more generic
>> purpose - to figure out whether a tunnel is needed between each pair
>> of chassises. There happens to be a related discussion ongoing here:
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-March/048281.html
>>
>
> I have seen thread. I will try to make it generic.
> If not I think we can revisit the code later and make it generic and more optimized.
> As a first step I think it should be reasonable to aim to have the solution for the HA_Chassis_Group's ref_chassis calculation.
> Does this sound good ?
>

I think it is fair to have separate patches to address the
optimizations. For current patch, if it is only about making HA group
generic for both GW and external port, I think we can simply put all
chassises as ref_chassis for all GW HA-groups, and enable bfd for
them, to make it work just like how it behaves today. Further
optimizations can be added separately.

Thanks,
Han
Numan Siddique March 7, 2019, 5:45 p.m. | #6
On Thu, Mar 7, 2019, 10:46 PM Han Zhou <zhouhan@gmail.com> wrote:

> On Thu, Mar 7, 2019 at 8:37 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >
> >
> >
> > On Tue, Mar 5, 2019 at 11:34 PM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> > > +
> >> > > +static void
> >> > > +add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
> >> > > +                           const struct sbrec_chassis *chassis)
> >> > > +{
> >> > > +    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
> >> > > +        if (ref_ch_info->ref_chassis[j] == chassis) {
> >> > > +           return;
> >> > > +        }
> >> > > +    }
> >> > > +
> >> > > +    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
> >> > > +                                        sizeof
> *ref_ch_info->ref_chassis *
> >> > > +
> (++ref_ch_info->n_ref_chassis));
> >> >
> >> > This may be inefficient, considering the amount of ref chassises to be
> >> > added for each HA group. It is better to xrealloc for original_size *
> >> > 2 every time and expand only when more space is needed.
> >> >
> >> > > +    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
> >> > > +        CONST_CAST(struct sbrec_chassis *, chassis);
> >> > > +}
> >> > > +
> >> > > +static void
> >> > > +update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
> >> > > +{
> >> > > +    struct shash_node *node, *next;
> >> > > +    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
> >> > > +        struct ha_ref_chassis_info *ha_ref_info = node->data;
> >> > > +
> sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
> >> > > +
>  ha_ref_info->ref_chassis,
> >> > > +
>  ha_ref_info->n_ref_chassis);
> >> > > +        free(ha_ref_info->ref_chassis);
> >> > > +        free(ha_ref_info);
> >> > > +        shash_delete(ha_ref_chassis_map, node);
> >> > > +    }
> >> > > +}
> >> > > +
> >> > > +/* This function returns logical router datapath (with a
> distributed
> >> > > + * gateway port) to which 'od' is connected to - either directly
> >> > > + * or indirectly (via transit logical switches).
> >> > > + * Returns NULL if no logical router with gw port found.
> >> > > + */
> >> > > +static struct ovn_datapath *
> >> > > +get_router_dp_with_gw_port(struct hmap *ports,
> >> > > +                           struct ovn_datapath *od,
> >> > > +                           struct ovn_datapath *peer_od)
> >> > > +{
> >> > > +    if (!od) {
> >> > > +        return NULL;
> >> > > +    }
> >> > > +
> >> > > +    if (od->nbs) {
> >> > > +        /* It's a logical switch datapath. */
> >> > > +        if (peer_od) {
> >> > > +            /* If peer datapath is not logical router, then
> >> > > +             * something is wrong. */
> >> > > +            ovs_assert(peer_od->nbr);
> >> > > +        }
> >> > > +
> >> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> >> > > +            if (!od->router_ports[i]->peer) {
> >> > > +                /* If there is no peer port connecting to the
> >> > > +                 * router datapath, ignore it. */
> >> > > +                continue;
> >> > > +            }
> >> > > +
> >> > > +            struct ovn_datapath *router_dp =
> od->router_ports[i]->peer->od;
> >> > > +            if (router_dp->l3dgw_port &&
> router_dp->l3dgw_port->nbrp) {
> >> > > +                /* Router datapath has a distributed gateway
> router port. */
> >> > > +                return router_dp;
> >> >
> >> > I think we can't return when just one router_dp is found. There can be
> >> > more than one connected router that has gateway router ports. So the
> >> > return value of this function should be a set.
> >> >
> >> > > +            }
> >> > > +        }
> >> > > +
> >> > > +        /* The logical switch datapath is not connected to any
> >> > > +         * logical router with a distributed gateway port. Check
> if it
> >> > > +         * is indirectly connected to a logical router with a gw
> port. */
> >> > > +        for (size_t i = 0; i < od->n_router_ports; i++) {
> >> > > +            if (!od->router_ports[i]->peer) {
> >> > > +                continue;
> >> > > +            }
> >> > > +
> >> > > +            struct ovn_datapath *router_dp =
> >> > > +                od->router_ports[i]->peer->od;
> >> > > +
> >> > > +            /* If we don't check this, we will be in an infinite
> loop. */
> >> > > +            if (router_dp != peer_od) {
> >> >
> >> > peer_od should also be a set, it should skip checking any datapath
> >> > that has already been checked. Consider the case LR1 is connected with
> >> > LS1, LS2 and LS3.
> >> >
> >> > > +                router_dp = get_router_dp_with_gw_port(ports,
> router_dp,
> >> > > +                                                       od);
> >> > > +                if (router_dp) {
> >> > > +                    /* Found a logical router with gw port
> indirectly connected
> >> > > +                     * to 'od'. */
> >> > > +                    return router_dp;
> >> > > +                }
> >> > > +            }
> >> > > +        }
> >> > > +    } else if (od->nbr) {
> >> > > +        /* It's a logical router datapath. */
> >> > > +        if (peer_od) {
> >> > > +            /* If peer datapath is not logical switch, then
> >> > > +             * something is wrong. */
> >> > > +            ovs_assert(peer_od->nbs);
> >> >
> >> > A router port can be peered with another router port directly, so this
> >> > assert is not true.
> >> >
> >> > > +        }
> >> > > +
> >> > > +        /* Check if this logical router datapath is indirectly
> connected
> >> > > +         * to another logical router via a transit logical
> switch(es). */
> >> > > +        for (size_t i = 0; i < od->nbr->n_ports; i++) {
> >> > > +            struct ovn_port *router_port =
> >> > > +                ovn_port_find(ports, od->nbr->ports[i]->name);
> >> > > +
> >> > > +            if (!router_port || !router_port->peer) {
> >> > > +                continue;
> >> > > +            }
> >> > > +            /* If we don't check this, we will be in an infinite
> loop. */
> >> > > +            if (router_port->peer->od != peer_od) {
> >> > > +                struct ovn_datapath *router_dp;
> >> > > +                /* router_port->peer->od points a logical switch
> datapath. */
> >> > > +                router_dp = get_router_dp_with_gw_port(ports,
> >> > > +
>  router_port->peer->od,
> >> > > +                                                       od);
> >> > > +                if (router_dp) {
> >> > > +                    /* Found a logical router with gw port
> indirectly connected
> >> > > +                    * to 'od'. */
> >> > > +                    return router_dp;
> >> > > +                }
> >> > > +            }
> >> > > +        }
> >> > > +    }
> >> > > +
> >> > > +    return NULL;
> >> > > +}
> >> >
> >> > In general, it may refer to the flood-fill approach implemented in
> >> > ovn-controller for populating local datapaths in binding.c, which is
> >> > similar to the purpose here. However, we should consider an
> >> > optimization here since the flood-fill cost would apply for every
> >> > port-binding. It can be optimized with a cache, which maps between
> >> > each datapath and its related router_dps with gw ports, to avoid
> >> > repeated traversing. (In ovn-controller it can be optimized, too, but
> >> > the gain is not as big as here since only local port-bindings are
> >> > checked in ovn-controller.)
> >> >
> >
> >
> > Thanks for the comments Han. I will work on them.
> > I haven't thought of an optimization yet. I will try to explore based
> > on your suggestions. At the same time, I personally think the code
> > shouldn't get complicated because of the optimization. (I feel the
> present
> > code in ovn-controller [1] is a bit complicated -
> > [1] -
> https://github.com/openvswitch/ovs/blob/master/ovn/controller/bfd.c#L98
> > )
>
> Sorry, I should point out the location more clearly. I was talking
> about this code:
>
> https://github.com/openvswitch/ovs/blob/master/ovn/controller/binding.c#L113
> It is about the algorithm that calculates related datapaths. I think
> we need similar logic in northd, but since northd handles all port
> bindings, it is worth to optimize with a cache to avoid repeat this
> calculation for every port-binding.
>

Thanks. Makes sense. I will work on it.



> >
> > I think it is uncommon to have indirect logical router connections which
> eventually
> > connect to a router with a gateway router port.
> > May be I am wrong. Do you think it's common to have such setups ?
> >
>
> It may be uncommon, but I think it is important to make it correct in
> these situations.
>

Agree.

Thanks
Numan


> >
> >> I forgot to mention, this implementation is for finding out the
> >> required BFD sessions, however, it can be reused for a more generic
> >> purpose - to figure out whether a tunnel is needed between each pair
> >> of chassises. There happens to be a related discussion ongoing here:
> >>
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-March/048281.html
> >>
> >
> > I have seen thread. I will try to make it generic.
> > If not I think we can revisit the code later and make it generic and
> more optimized.
> > As a first step I think it should be reasonable to aim to have the
> solution for the HA_Chassis_Group's ref_chassis calculation.
> > Does this sound good ?
> >
>
> I think it is fair to have separate patches to address the
> optimizations. For current patch, if it is only about making HA group
> generic for both GW and external port, I think we can simply put all
> chassises as ref_chassis for all GW HA-groups, and enable bfd for
> them, to make it work just like how it behaves today. Further
> optimizations can be added separately.
>
> Thanks,
> Han
>

Patch

diff --git a/ovn/lib/chassis-index.c b/ovn/lib/chassis-index.c
index a5dbf4ace..34d4a31eb 100644
--- a/ovn/lib/chassis-index.c
+++ b/ovn/lib/chassis-index.c
@@ -39,3 +39,29 @@  chassis_lookup_by_name(struct ovsdb_idl_index *sbrec_chassis_by_name,
 
     return retval;
 }
+
+struct ovsdb_idl_index *
+ha_chassis_group_index_create(struct ovsdb_idl *idl)
+{
+    return ovsdb_idl_index_create1(idl, &sbrec_ha_chassis_group_col_name);
+}
+
+/* Finds and returns the HA chassis group with the given 'name', or NULL
+ * if no such HA chassis group exists. */
+const struct sbrec_ha_chassis_group *
+ha_chassis_group_lookup_by_name(
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name,
+    const char *name)
+{
+    struct sbrec_ha_chassis_group *target =
+        sbrec_ha_chassis_group_index_init_row(sbrec_ha_chassis_grp_by_name);
+    sbrec_ha_chassis_group_set_name(target, name);
+
+    struct sbrec_ha_chassis_group *retval =
+        sbrec_ha_chassis_group_index_find(sbrec_ha_chassis_grp_by_name,
+                                          target);
+
+    sbrec_ha_chassis_group_index_destroy_row(target);
+
+    return retval;
+}
diff --git a/ovn/lib/chassis-index.h b/ovn/lib/chassis-index.h
index d5e5df926..9bc610ad2 100644
--- a/ovn/lib/chassis-index.h
+++ b/ovn/lib/chassis-index.h
@@ -23,4 +23,8 @@  struct ovsdb_idl_index *chassis_index_create(struct ovsdb_idl *);
 const struct sbrec_chassis *chassis_lookup_by_name(
     struct ovsdb_idl_index *sbrec_chassis_by_name, const char *name);
 
+struct ovsdb_idl_index *ha_chassis_group_index_create(struct ovsdb_idl *idl);
+const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const char *name);
+
 #endif /* ovn/lib/chassis-index.h */
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 838251dd1..f4fc5be2f 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -56,6 +56,7 @@  struct northd_context {
     struct ovsdb_idl *ovnsb_idl;
     struct ovsdb_idl_txn *ovnnb_txn;
     struct ovsdb_idl_txn *ovnsb_txn;
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
 };
 
 static const char *ovnnb_db;
@@ -1991,6 +1992,13 @@  sbpb_gw_chassis_needs_update(
         return false;
     }
 
+    if (lrp->n_gateway_chassis && !port_binding->ha_chassis_group) {
+        /* If there are gateway chassis in the NB DB, but there is
+         * no corresponding HA chassis group in SB DB we need to
+         * create the HA chassis group in SB DB for this lrp. */
+        return true;
+    }
+
     /* These arrays are used to collect valid Gateway_Chassis and valid
      * Chassis records from the Logical_Router_Port Gateway_Chassis list,
      * we ignore the ones we can't match on the SBDB */
@@ -2060,31 +2068,63 @@  sbpb_gw_chassis_needs_update(
     return false;
 }
 
+static struct sbrec_ha_chassis *
+create_sb_ha_chassis(struct northd_context *ctx,
+                     const struct sbrec_chassis *chassis, int priority)
+{
+    struct sbrec_ha_chassis *sb_ha_chassis =
+        sbrec_ha_chassis_insert(ctx->ovnsb_txn);
+    sbrec_ha_chassis_set_chassis(sb_ha_chassis, chassis);
+    sbrec_ha_chassis_set_priority(sb_ha_chassis, priority);
+    return sb_ha_chassis;
+}
+
 /* This functions translates the gw chassis on the nb database
  * to sb database entries, the only difference is that SB database
  * Gateway_Chassis table references the chassis directly instead
- * of using the name */
+ * of using the name.
+ *
+ * This function also creates a HA Chassis group in SB DB for
+ * the gateway chassis associated to a distributed gateway
+ * router port in the NB DB.
+ *
+ * An upcoming patch will delete the code to create the Gateway chassis
+ * in SB DB.*/
 static void
 copy_gw_chassis_from_nbrp_to_sbpb(
         struct northd_context *ctx,
         struct ovsdb_idl_index *sbrec_chassis_by_name,
         const struct nbrec_logical_router_port *lrp,
-        const struct sbrec_port_binding *port_binding) {
-
-    if (!lrp || !port_binding || !lrp->n_gateway_chassis) {
-        return;
-    }
-
+        const struct sbrec_port_binding *port_binding)
+{
     struct sbrec_gateway_chassis **gw_chassis = NULL;
     int n_gwc = 0;
     int n;
 
+    /* Make use of the new HA chassis group table to support HA
+     * for the distributed gateway router port. We can delete
+     * the old gateway_chassis code once ovn-controller supports
+     * HA chassis group. */
+    const struct sbrec_ha_chassis_group *sb_ha_chassis_group =
+        ha_chassis_group_lookup_by_name(
+            ctx->sbrec_ha_chassis_grp_by_name, lrp->name);
+    if (!sb_ha_chassis_group) {
+        sb_ha_chassis_group = sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
+        sbrec_ha_chassis_group_set_name(sb_ha_chassis_group, lrp->name);
+    }
+
+    struct sbrec_ha_chassis **sb_ha_chassis = xcalloc(lrp->n_gateway_chassis,
+                                                      sizeof *sb_ha_chassis);
+    size_t n_sb_ha_ch = 0;
     /* XXX: This can be improved. This code will generate a set of new
      * Gateway_Chassis and push them all in a single transaction, instead
      * this would be more optimal if we just add/update/remove the rows in
      * the southbound db that need to change. We don't expect lots of
      * changes to the Gateway_Chassis table, but if that proves to be wrong
-     * we should optimize this. */
+     * we should optimize this.
+     *
+     * Note: Remove the below code to add gateway_chassis row in OVN
+     * Southbound db once ovn-controller supports HA chassis group. */
     for (n = 0; n < lrp->n_gateway_chassis; n++) {
         struct nbrec_gateway_chassis *lrp_gwc = lrp->gateway_chassis[n];
         if (!lrp_gwc->chassis_name) {
@@ -2097,6 +2137,8 @@  copy_gw_chassis_from_nbrp_to_sbpb(
 
         gw_chassis = xrealloc(gw_chassis, (n_gwc + 1) * sizeof *gw_chassis);
 
+        /* This code to create gateway_chassis in SB DB needs to be deleted
+         * once ovn-controller supports making use of HA chassis groups. */
         struct sbrec_gateway_chassis *pb_gwc =
             sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
 
@@ -2107,16 +2149,26 @@  copy_gw_chassis_from_nbrp_to_sbpb(
         sbrec_gateway_chassis_set_external_ids(pb_gwc, &lrp_gwc->external_ids);
 
         gw_chassis[n_gwc++] = pb_gwc;
+
+        sb_ha_chassis[n_sb_ha_ch] =
+            create_sb_ha_chassis(ctx, chassis, lrp_gwc->priority);
+        n_sb_ha_ch++;
     }
     sbrec_port_binding_set_gateway_chassis(port_binding, gw_chassis, n_gwc);
     free(gw_chassis);
+
+    sbrec_ha_chassis_group_set_ha_chassis(sb_ha_chassis_group,
+                                          sb_ha_chassis, n_sb_ha_ch);
+    sbrec_port_binding_set_ha_chassis_group(port_binding, sb_ha_chassis_group);
+    free(sb_ha_chassis);
 }
 
 static void
 ovn_port_update_sbrec(struct northd_context *ctx,
                       struct ovsdb_idl_index *sbrec_chassis_by_name,
                       const struct ovn_port *op,
-                      struct hmap *chassis_qdisc_queues)
+                      struct hmap *chassis_qdisc_queues,
+                      struct sset *active_ha_chassis_grps)
 {
     sbrec_port_binding_set_datapath(op->sb, op->od->sb);
     if (op->nbrp) {
@@ -2153,6 +2205,7 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                                                       op->nbrp, op->sb);
                 }
 
+                sset_add(active_ha_chassis_grps, op->nbrp->name);
             } else if (redirect_chassis) {
                 /* Handle ports that had redirect-chassis option attached
                  * to them, and for backwards compatibility convert them
@@ -2163,20 +2216,23 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                 if (chassis) {
                     /* If we found the chassis, and the gw chassis on record
                      * differs from what we expect go ahead and update */
+                    char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
+                                               chassis->name);
                     if (op->sb->n_gateway_chassis != 1
                         || !op->sb->gateway_chassis[0]->chassis
                         || strcmp(op->sb->gateway_chassis[0]->chassis->name,
                                   chassis->name)
                         || op->sb->gateway_chassis[0]->priority != 0) {
+                        /* This code to create gateway_chassis in SB DB needs
+                         * to be deleted once ovn-controller supports making
+                         * use of HA chassis groups. */
+
                         /* Construct a single Gateway_Chassis entry on the
                          * Port_Binding attached to the redirect_chassis
                          * name */
                         struct sbrec_gateway_chassis *gw_chassis =
                             sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
 
-                        char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
-                                chassis->name);
-
                         /* XXX: Again, here, we could just update an existing
                          * Gateway_Chassis, instead of creating a new one
                          * and replacing it */
@@ -2187,8 +2243,31 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                                 &op->nbrp->external_ids);
                         sbrec_port_binding_set_gateway_chassis(op->sb,
                                                                &gw_chassis, 1);
-                        free(gwc_name);
                     }
+
+                    /* Create HA chassis group in SB DB for the
+                     * redirect-chassis option. */
+                    const struct sbrec_ha_chassis_group *sb_ha_ch_grp;
+                    sb_ha_ch_grp = ha_chassis_group_lookup_by_name(
+                        ctx->sbrec_ha_chassis_grp_by_name, gwc_name);
+                    if (!sb_ha_ch_grp) {
+                        sb_ha_ch_grp =
+                            sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
+                        sbrec_ha_chassis_group_set_name(sb_ha_ch_grp,
+                                                        gwc_name);
+                    }
+
+                    if (sb_ha_ch_grp->n_ha_chassis != 1) {
+                        struct sbrec_ha_chassis **sb_ha_ch =
+                            xcalloc(1, sizeof *sb_ha_ch);
+                        sb_ha_ch[0] = create_sb_ha_chassis(ctx, chassis, 0);
+                        sbrec_ha_chassis_group_set_ha_chassis(sb_ha_ch_grp,
+                                                              sb_ha_ch, 1);
+                    }
+                    sbrec_port_binding_set_ha_chassis_group(op->sb,
+                                                            sb_ha_ch_grp);
+                    sset_add(active_ha_chassis_grps, gwc_name);
+                    free(gwc_name);
                 } else {
                     VLOG_WARN("chassis name '%s' from redirect from logical "
                               " router port '%s' redirect-chassis not found",
@@ -2359,6 +2438,18 @@  cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
     }
 }
 
+static void
+cleanup_sb_ha_chassis_groups(struct northd_context *ctx,
+                             struct sset *active_ha_chassis_groups)
+{
+    const struct sbrec_ha_chassis_group *b, *n;
+    SBREC_HA_CHASSIS_GROUP_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (!sset_contains(active_ha_chassis_groups, b->name)) {
+            sbrec_ha_chassis_group_delete(b);
+        }
+    }
+}
+
 /* Updates the southbound Port_Binding table so that it contains the logical
  * switch ports specified by the northbound database.
  *
@@ -2374,6 +2465,10 @@  build_ports(struct northd_context *ctx,
     struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
     struct hmap chassis_qdisc_queues = HMAP_INITIALIZER(&chassis_qdisc_queues);
 
+    /* sset which stores the set of ha chassis group names used. */
+    struct sset active_ha_chassis_grps =
+        SSET_INITIALIZER(&active_ha_chassis_grps);
+
     join_logical_ports(ctx, datapaths, ports, &chassis_qdisc_queues,
                        &tag_alloc_table, &sb_only, &nb_only, &both);
 
@@ -2387,8 +2482,8 @@  build_ports(struct northd_context *ctx,
             tag_alloc_create_new_tag(&tag_alloc_table, op->nbsp);
         }
         ovn_port_update_sbrec(ctx, sbrec_chassis_by_name,
-                              op, &chassis_qdisc_queues);
-
+                              op, &chassis_qdisc_queues,
+                              &active_ha_chassis_grps);
         add_tnlid(&op->od->port_tnlids, op->sb->tunnel_key);
         if (op->sb->tunnel_key > op->od->port_key_hint) {
             op->od->port_key_hint = op->sb->tunnel_key;
@@ -2404,8 +2499,8 @@  build_ports(struct northd_context *ctx,
 
         op->sb = sbrec_port_binding_insert(ctx->ovnsb_txn);
         ovn_port_update_sbrec(ctx, sbrec_chassis_by_name, op,
-                              &chassis_qdisc_queues);
-
+                              &chassis_qdisc_queues,
+                              &active_ha_chassis_grps);
         sbrec_port_binding_set_logical_port(op->sb, op->key);
         sbrec_port_binding_set_tunnel_key(op->sb, tunnel_key);
     }
@@ -2427,6 +2522,8 @@  build_ports(struct northd_context *ctx,
 
     tag_alloc_destroy(&tag_alloc_table);
     destroy_chassis_queues(&chassis_qdisc_queues);
+    cleanup_sb_ha_chassis_groups(ctx, &active_ha_chassis_grps);
+    sset_destroy(&active_ha_chassis_grps);
 }
 
 #define OVN_MIN_MULTICAST 32768
@@ -7306,13 +7403,216 @@  ovnnb_db_run(struct northd_context *ctx,
     cleanup_macam(&macam);
 }
 
+/* Stores the list of chassis which references an ha_chassis_group.
+ */
+struct ha_ref_chassis_info {
+    const struct sbrec_ha_chassis_group *ha_chassis_group;
+    struct sbrec_chassis **ref_chassis;
+    size_t n_ref_chassis;
+};
+
+static void
+add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
+                           const struct sbrec_chassis *chassis)
+{
+    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
+        if (ref_ch_info->ref_chassis[j] == chassis) {
+           return;
+        }
+    }
+
+    ref_ch_info->ref_chassis = xrealloc(ref_ch_info->ref_chassis,
+                                        sizeof *ref_ch_info->ref_chassis *
+                                        (++ref_ch_info->n_ref_chassis));
+    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis - 1] =
+        CONST_CAST(struct sbrec_chassis *, chassis);
+}
+
+static void
+update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
+{
+    struct shash_node *node, *next;
+    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
+        struct ha_ref_chassis_info *ha_ref_info = node->data;
+        sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
+                                               ha_ref_info->ref_chassis,
+                                               ha_ref_info->n_ref_chassis);
+        free(ha_ref_info->ref_chassis);
+        free(ha_ref_info);
+        shash_delete(ha_ref_chassis_map, node);
+    }
+}
+
+/* This function returns logical router datapath (with a distributed
+ * gateway port) to which 'od' is connected to - either directly
+ * or indirectly (via transit logical switches).
+ * Returns NULL if no logical router with gw port found.
+ */
+static struct ovn_datapath *
+get_router_dp_with_gw_port(struct hmap *ports,
+                           struct ovn_datapath *od,
+                           struct ovn_datapath *peer_od)
+{
+    if (!od) {
+        return NULL;
+    }
+
+    if (od->nbs) {
+        /* It's a logical switch datapath. */
+        if (peer_od) {
+            /* If peer datapath is not logical router, then
+             * something is wrong. */
+            ovs_assert(peer_od->nbr);
+        }
+
+        for (size_t i = 0; i < od->n_router_ports; i++) {
+            if (!od->router_ports[i]->peer) {
+                /* If there is no peer port connecting to the
+                 * router datapath, ignore it. */
+                continue;
+            }
+
+            struct ovn_datapath *router_dp = od->router_ports[i]->peer->od;
+            if (router_dp->l3dgw_port && router_dp->l3dgw_port->nbrp) {
+                /* Router datapath has a distributed gateway router port. */
+                return router_dp;
+            }
+        }
+
+        /* The logical switch datapath is not connected to any
+         * logical router with a distributed gateway port. Check if it
+         * is indirectly connected to a logical router with a gw port. */
+        for (size_t i = 0; i < od->n_router_ports; i++) {
+            if (!od->router_ports[i]->peer) {
+                continue;
+            }
+
+            struct ovn_datapath *router_dp =
+                od->router_ports[i]->peer->od;
+
+            /* If we don't check this, we will be in an infinite loop. */
+            if (router_dp != peer_od) {
+                router_dp = get_router_dp_with_gw_port(ports, router_dp,
+                                                       od);
+                if (router_dp) {
+                    /* Found a logical router with gw port indirectly connected
+                     * to 'od'. */
+                    return router_dp;
+                }
+            }
+        }
+    } else if (od->nbr) {
+        /* It's a logical router datapath. */
+        if (peer_od) {
+            /* If peer datapath is not logical switch, then
+             * something is wrong. */
+            ovs_assert(peer_od->nbs);
+        }
+
+        /* Check if this logical router datapath is indirectly connected
+         * to another logical router via a transit logical switch(es). */
+        for (size_t i = 0; i < od->nbr->n_ports; i++) {
+            struct ovn_port *router_port =
+                ovn_port_find(ports, od->nbr->ports[i]->name);
+
+            if (!router_port || !router_port->peer) {
+                continue;
+            }
+            /* If we don't check this, we will be in an infinite loop. */
+            if (router_port->peer->od != peer_od) {
+                struct ovn_datapath *router_dp;
+                /* router_port->peer->od points a logical switch datapath. */
+                router_dp = get_router_dp_with_gw_port(ports,
+                                                       router_port->peer->od,
+                                                       od);
+                if (router_dp) {
+                    /* Found a logical router with gw port indirectly connected
+                    * to 'od'. */
+                    return router_dp;
+                }
+            }
+        }
+    }
+
+    return NULL;
+}
+
+/* This function checks if the port binding 'sb' references
+ * a HA chassis group.
+ * Eg. Suppose a distributed logical router port - lr0-public
+ * uses an HA chassis group - hagrp1 and if hagrp1 has 3 ha
+ * chassis - gw1, gw2 and gw3.
+ * Or
+ * If the distributed logical router port - lr0-public has
+ * 3 gateway chassis - gw1, gw2 and gw3.
+ * ovn-northd creates ha chassis group - hagrp1 in SB DB
+ * and adds gw1, gw2 and gw3 to its ha_chassis list.
+ *
+ * If port binding 'sb' represents a logical switch port 'p1'
+ * and its logical switch is connected to the logical router
+ * 'lr0' directly or indirectly (i.e p1's logical switch is
+ *  connected to a router 'lr1' and 'lr1' has a path to lr0 via
+ *  transit logical switches) and 'sb' is claimed by chassis - 'c1' then
+ * this function adds c1 to the list of the reference chassis
+ *  - 'ref_chassis' of hagrp1.
+ */
+static void
+build_ha_chassis_group_ref_chassis(struct northd_context *ctx,
+                                   const struct sbrec_port_binding *sb,
+                                   struct ovn_port *op,
+                                   struct hmap *ports,
+                                   struct shash *ha_ref_chassis_map)
+{
+    struct ovn_datapath *router_dp =
+        get_router_dp_with_gw_port(ports, op->od, NULL);
+    if (!router_dp) {
+        return;
+    }
+
+    const struct sbrec_ha_chassis_group *sb_ha_chassis_grp = NULL;
+    /* Get the HA Chassis group associated with the lrp. */
+    struct nbrec_ha_chassis_group *nb_chassis_grp =
+        router_dp->l3dgw_port->nbrp->ha_chassis_group;
+
+    if (!nb_chassis_grp) {
+        /* Check if the legacy gateway chassis is used. */
+        if (router_dp->l3redirect_port && router_dp->l3redirect_port->sb) {
+            sb_ha_chassis_grp =
+                router_dp->l3redirect_port->sb->ha_chassis_group;
+        }
+    } else {
+        /* Get the HA chassis group row in the SB DB. */
+        sb_ha_chassis_grp = ha_chassis_group_lookup_by_name(
+            ctx->sbrec_ha_chassis_grp_by_name, nb_chassis_grp->name);
+    }
+
+    if (sb_ha_chassis_grp) {
+        struct ha_ref_chassis_info *ref_ch_info =
+            shash_find_data(ha_ref_chassis_map, sb_ha_chassis_grp->name);
+        ovs_assert(ref_ch_info);
+        add_to_ha_ref_chassis_info(ref_ch_info, sb->chassis);
+    }
+}
+
 /* Handle changes to the 'chassis' column of the 'Port_Binding' table.  When
  * this column is not empty, it means we need to set the corresponding logical
  * port as 'up' in the northbound DB. */
 static void
-update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
+handle_port_binding_changes(struct northd_context *ctx, struct hmap *ports,
+                            struct shash *ha_ref_chassis_map)
 {
     const struct sbrec_port_binding *sb;
+    bool build_ha_chassis_ref = false;
+    if (ctx->ovnsb_txn) {
+        const struct sbrec_ha_chassis_group *ha_ch_grp;
+        SBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->ovnsb_idl) {
+            struct ha_ref_chassis_info *ref_ch_info =
+                xzalloc(sizeof *ref_ch_info);
+            ref_ch_info->ha_chassis_group = ha_ch_grp;
+            build_ha_chassis_ref = true;
+            shash_add(ha_ref_chassis_map, ha_ch_grp->name, ref_ch_info);
+        }
+    }
 
     SBREC_PORT_BINDING_FOR_EACH(sb, ctx->ovnsb_idl) {
         struct ovn_port *op = ovn_port_find(ports, sb->logical_port);
@@ -7328,6 +7628,13 @@  update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
         if (!op->nbsp->up || *op->nbsp->up != up) {
             nbrec_logical_switch_port_set_up(op->nbsp, &up, 1);
         }
+
+        if (build_ha_chassis_ref && ctx->ovnsb_txn && sb->chassis) {
+            /* Check and add the chassis which has claimed this 'sb'
+             * to the ha chassis group's ref_chassis if required. */
+            build_ha_chassis_group_ref_chassis(ctx, sb, op, ports,
+                                               ha_ref_chassis_map);
+        }
     }
 }
 
@@ -7655,8 +7962,13 @@  ovnsb_db_run(struct northd_context *ctx, struct ovsdb_idl_loop *sb_loop,
         return;
     }
 
-    update_logical_port_status(ctx, ports);
+    struct shash ha_ref_chassis_map = SHASH_INITIALIZER(&ha_ref_chassis_map);
+    handle_port_binding_changes(ctx, ports, &ha_ref_chassis_map);
     update_northbound_cfg(ctx, sb_loop);
+    if (ctx->ovnsb_txn) {
+        update_sb_ha_group_ref_chassis(&ha_ref_chassis_map);
+    }
+    shash_destroy(&ha_ref_chassis_map);
 }
 
 static void
@@ -7835,6 +8147,8 @@  main(int argc, char *argv[])
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_port_binding_col_chassis);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl,
                          &sbrec_port_binding_col_gateway_chassis);
+    ovsdb_idl_add_column(ovnsb_idl_loop.idl,
+                         &sbrec_port_binding_col_ha_chassis_group);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl,
                          &sbrec_gateway_chassis_col_chassis);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_gateway_chassis_col_name);
@@ -7899,9 +8213,30 @@  main(int argc, char *argv[])
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_name);
 
+    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_priority);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_external_ids);
+
+    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis_group);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_name);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_ha_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_external_ids);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_ref_chassis);
+
     struct ovsdb_idl_index *sbrec_chassis_by_name
         = chassis_index_create(ovnsb_idl_loop.idl);
 
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name
+        = ha_chassis_group_index_create(ovnsb_idl_loop.idl);
+
     /* Ensure that only a single ovn-northd is active in the deployment by
      * acquiring a lock called "ovn_northd" on the southbound database
      * and then only performing DB transactions if the lock is held. */
@@ -7916,6 +8251,7 @@  main(int argc, char *argv[])
             .ovnnb_txn = ovsdb_idl_loop_run(&ovnnb_idl_loop),
             .ovnsb_idl = ovnsb_idl_loop.idl,
             .ovnsb_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop),
+            .sbrec_ha_chassis_grp_by_name = sbrec_ha_chassis_grp_by_name,
         };
 
         if (!had_lock && ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index 10a59649a..48d27b960 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "5.14.1",
-    "cksum": "3758097843 20509",
+    "version": "5.15.0",
+    "cksum": "1078795414 21917",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -271,6 +271,12 @@ 
                                      "refType": "strong"},
                              "min": 0,
                              "max": "unlimited"}},
+                "ha_chassis_group": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis_Group",
+                                     "refType": "weak"},
+                             "min": 0,
+                             "max": 1}},
                 "options": {
                     "type": {"key": "string",
                              "value": "string",
@@ -392,5 +398,29 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "indexes": [["name"]],
-            "isRoot": false}}
+            "isRoot": false},
+        "HA_Chassis": {
+            "columns": {
+                "chassis_name": {"type": "string"},
+                "priority": {"type": {"key": {"type": "integer",
+                                              "minInteger": 0,
+                                              "maxInteger": 32767}}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "isRoot": false},
+        "HA_Chassis_Group": {
+            "columns": {
+                "name": {"type": "string"},
+                "ha_chassis": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis",
+                                     "refType": "strong"},
+                             "min": 0,
+                             "max": "unlimited"}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "indexes": [["name"]],
+            "isRoot": true}}
     }
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 18396507d..70d5a4689 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1522,6 +1522,12 @@ 
     </column>
 
     <column name="gateway_chassis">
+      <p>
+        This column is ignored if the column
+        <ref column="ha_chassis_group" table="Logical_Router_Port"/>.
+        is set.
+      </p>
+
       <p>
         If set, this indicates that this logical router port represents
         a distributed gateway port that connects this router to a logical
@@ -1549,6 +1555,24 @@ 
       </p>
     </column>
 
+    <column name="ha_chassis_group">
+      <p>
+        If set, this indicates that this logical router port represents
+        a distributed gateway port that connects this router to a logical
+        switch with a localnet port.  There may be at most one such
+        logical router port on each logical router. The HA chassis which
+        are part of the HA chassis group will provide the gateway high
+        availability. Please see the <ref table="HA_Chassis_Group"/> for
+        more details.
+      </p>
+
+      <p>
+        When this column is set, the column
+        <ref column="gateway_chassis" table="Logical_Router_Port"/> will
+        be ignored.
+      </p>
+    </column>
+
     <column name="networks">
       <p>
         The IP addresses and netmasks of the router.  For example,
@@ -2605,4 +2629,55 @@ 
     </group>
   </table>
 
+  <table name="HA_Chassis_Group">
+    <p>
+      Table representing a group of chassis which can provide High availability
+      services. Each chassis in the group is represented by the table
+      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
+      be the master of this group. If the master chassis failover is detected,
+      the HA chassis with the next higher priority takes over the
+      responsibility of providing the HA. If a distributed gateway router port
+      references a row in this table, then the master HA chassis in this group
+      provides the gateway functionality.
+    </p>
+
+    <column name="name">
+      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
+    </column>
+
+    <column name="ha_chassis">
+      A list of HA chassis which belongs to this group.
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
+
+  <table name="HA_Chassis">
+    <column name="chassis_name">
+      <p>
+        Name of the chassis which is part of the HA chassis group.
+        The value must match the
+        <ref db="OVN_Southbound" table="Chassis" column="name"/> column
+        of the <ref db="OVN_Southbound" table="Chassis"/> table in the
+        <ref db="OVN_Southbound"/> database.
+      </p>
+    </column>
+
+    <column name="priority">
+      <p>
+        Priority of the chassis. Chassis with highest priority will be
+        the master.
+      </p>
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
 </database>
diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
index cc8c771a7..e05f964b0 100644
--- a/ovn/ovn-sb.ovsschema
+++ b/ovn/ovn-sb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Southbound",
-    "version": "2.1.0",
-    "cksum": "3806083220 15332",
+    "version": "2.2.0",
+    "cksum": "2001312516 17220",
     "tables": {
         "SB_Global": {
             "columns": {
@@ -147,6 +147,12 @@ 
                                      "refType": "strong"},
                              "min": 0,
                              "max": "unlimited"}},
+                "ha_chassis_group": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis_Group",
+                                     "refType": "weak"},
+                             "min": 0,
+                             "max": 1}},
                 "options": {
                      "type": {"key": "string",
                               "value": "string",
@@ -309,4 +315,35 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "indexes": [["name"]],
-            "isRoot": false}}}
+            "isRoot": false},
+        "HA_Chassis": {
+            "columns": {
+                "chassis": {"type": {"key": {"type": "uuid",
+                                             "refTable": "Chassis",
+                                             "refType": "weak"},
+                                     "min": 0, "max": 1}},
+                "priority": {"type": {"key": {"type": "integer",
+                                              "minInteger": 0,
+                                              "maxInteger": 32767}}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "isRoot": false},
+        "HA_Chassis_Group": {
+            "columns": {
+                "name": {"type": "string"},
+                "ha_chassis": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis",
+                                     "refType": "strong"},
+                             "min": 0,
+                             "max": "unlimited"}},
+                "ref_chassis": {"type": {"key": {"type": "uuid",
+                                                 "refTable": "Chassis",
+                                                 "refType": "weak"},
+                                         "min": 0, "max": "unlimited"}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "indexes": [["name"]],
+            "isRoot": true}}}
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 4e080abff..5c4a852a5 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -2246,6 +2246,16 @@  tcp.flags = RST;
         </p>
       </column>
 
+      <column name="ha_chassis_group">
+        <p>
+          This should only be populated for ports with
+          <ref column="type"/> set to <code>chassisredirect</code>.
+          This column defines the HA chassis group with a list of
+          HA chassis used as gateways where traffic will be redirected
+          through.
+        </p>
+      </column>
+
       <column name="tunnel_key">
         <p>
           A number that represents the logical port in the key (e.g. STT key or
@@ -3339,4 +3349,57 @@  tcp.flags = RST;
       <column name="external_ids"/>
     </group>
   </table>
+
+  <table name="HA_Chassis">
+    <column name="chassis">
+      <p>
+        The <ref table="Chassis"/> which provides the HA functionality.
+        </p>
+    </column>
+
+    <column name="priority">
+      <p>
+        Priority of the HA chassis. Chassis with highest priority will be
+        the master in the HA chassis group.
+      </p>
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
+
+  <table name="HA_Chassis_Group">
+    <p>
+      Table representing a group of chassis which can provide High availability
+      services. Each chassis in the group is represented by the table
+      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
+      be the master of this group. If the master chassis failover is detected,
+      the HA chassis with the next higher priority takes over the
+      responsibility of providing the HA. If <ref db="OVN_Southbound"
+      table="Port_Binding" column="ha_chassis_group"/> column of the table
+      <ref db="OVN_Southbound" table="Port_Binding"/> references this table,
+      then this HA chassis group provides the gateway functionality and
+      redirects the gateway traffic to the master of this group.
+    </p>
+    <column name="name">
+      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
+    </column>
+
+    <column name="ha_chassis">
+      A list of <ref table="HA_Chassis"/> which belongs to this group.
+    </column>
+
+    <column name="ref_chassis">
+      A list of <ref table="chassis"/> which references this HA chassis group.
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
 </database>
diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index a7a9c2701..483602970 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -908,6 +908,47 @@ 
       </dd>
     </dl>
 
+    <h1> HA Chassis Group commands</h1>
+
+    <dl>
+      <dt><code>ha-chassis-group-add</code> <var>group</var></dt>
+      <dd>
+        Creates a new HA chassis group in the <code>HA_Chassis_Group</code>
+        table named <code>group</code>.
+      </dd>
+
+      <dt><code>ha-chassis-group-del</code> <var>group</var></dt>
+      <dd>
+        Deletes the HA chassis group <code>group</code>. It is an error if
+        <code>group</code> does not exist.
+      </dd>
+
+      <dt><code>ha-chassis-group-list</code></dt>
+      <dd>
+        Lists the HA chassis group <code>group</code> along with the
+        <code>HA chassis</code> if any associated with it.
+      </dd>
+
+      <dt><code>ha-chassis-group-add-chassis</code> <var>group</var>
+      <var>chassis</var> <var>priority</var></dt>
+      <dd>
+        Adds a new HA chassis <code>chassis</code> to the
+        HA Chassis group <code>group</code> with the specified priority.
+        If the <code>chassis</code> already exists, then the
+        <code>priority</code> is updated.
+        The <code>chassis</code> should be the name of the chassis in the
+        <code>OVN_Southbound</code>.
+      </dd>
+
+      <dt><code>ha-chassis-group-remove-chassis</code> <var>group</var>
+      <var>chassis</var></dt>
+      <dd>
+        Removes the HA chassis <code>chassis</code> from the HA chassis
+        group <code>group</code>. It is an error if <code>chassis</code> does
+        not exist.
+      </dd>
+    </dl>
+
     <h1>Database Commands</h1>
     <p>These commands query and modify the contents of <code>ovsdb</code> tables.
     They are a slight abstraction of the <code>ovsdb</code> interface and
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index 8a72e9574..8f4e96d6e 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -694,6 +694,15 @@  Port group commands:\n\
   pg-set-ports PG PORTS       Set PORTS on port group PG\n\
   pg-del PG                   Delete port group PG\n\
 \n\
+HA chassis group commands:\n\
+  ha-chassis-group-add GRP  Create an HA chassis group GRP\n\
+  ha-chassis-group-del GRP  Delete the HA chassis group GRP\n\
+  ha-chassis-group-list     List the HA chassis groups\n\
+  ha-chassis-group-add-chassis GRP CHASSIS [PRIORITY] Adds an HA\
+chassis with optional PRIORITY to the HA chassis group GRP\n\
+  ha-chassis-group-del-chassis GRP CHASSIS Deletes the HA chassis\
+CHASSIS from the HA chassis group GRP\n\
+\n\
 %s\
 %s\
 \n\
@@ -4756,6 +4765,201 @@  cmd_pg_del(struct ctl_context *ctx)
     nbrec_port_group_delete(pg);
 }
 
+static const struct nbrec_ha_chassis_group*
+ha_chassis_group_by_name_or_uuid(struct ctl_context *ctx, const char *id,
+                                 bool must_exist)
+{
+    struct uuid ch_grp_uuid;
+    const struct nbrec_ha_chassis_group *ha_ch_grp = NULL;
+    bool is_uuid = uuid_from_string(&ch_grp_uuid, id);
+    if (is_uuid) {
+        ha_ch_grp = nbrec_ha_chassis_group_get_for_uuid(ctx->idl,
+                                                        &ch_grp_uuid);
+    }
+
+    if (!ha_ch_grp) {
+        const struct nbrec_ha_chassis_group *iter;
+        NBREC_HA_CHASSIS_GROUP_FOR_EACH (iter, ctx->idl) {
+            if (!strcmp(iter->name, id)) {
+                ha_ch_grp = iter;
+                break;
+            }
+        }
+    }
+
+    if (!ha_ch_grp && must_exist) {
+        ctx->error = xasprintf("%s: ha_chassi_group %s not found",
+                               id, is_uuid ? "UUID" : "name");
+    }
+
+    return ha_ch_grp;
+}
+
+static void
+cmd_ha_ch_grp_add(struct ctl_context *ctx)
+{
+    const char *name = ctx->argv[1];
+    struct nbrec_ha_chassis_group *ha_ch_grp =
+        nbrec_ha_chassis_group_insert(ctx->txn);
+    nbrec_ha_chassis_group_set_name(ha_ch_grp, name);
+}
+
+static void
+cmd_ha_ch_grp_del(struct ctl_context *ctx)
+{
+    const char *name_or_id = ctx->argv[1];
+
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, name_or_id, true);
+
+    if (ha_ch_grp) {
+        nbrec_ha_chassis_group_delete(ha_ch_grp);
+    }
+}
+
+static void
+cmd_ha_ch_grp_list(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp;
+
+    NBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->idl) {
+        ds_put_format(&ctx->output, UUID_FMT " (%s)\n",
+                      UUID_ARGS(&ha_ch_grp->header_.uuid), ha_ch_grp->name);
+        const struct nbrec_ha_chassis *ha_ch;
+        for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+            ha_ch = ha_ch_grp->ha_chassis[i];
+            ds_put_format(&ctx->output,
+                          "    "UUID_FMT " (%s)\n"
+                          "    priority %lu\n\n",
+                          UUID_ARGS(&ha_ch->header_.uuid), ha_ch->chassis_name,
+                          ha_ch->priority);
+        }
+        ds_put_cstr(&ctx->output, "\n");
+    }
+}
+
+static void
+cmd_ha_ch_grp_add_chassis(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    int64_t priority;
+    char *error = parse_priority(ctx->argv[3], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            break;
+        }
+    }
+
+    if (ha_chassis) {
+        nbrec_ha_chassis_set_priority(ha_chassis, priority);
+        return;
+    }
+
+    ha_chassis = nbrec_ha_chassis_insert(ctx->txn);
+    nbrec_ha_chassis_set_chassis_name(ha_chassis, chassis_name);
+    nbrec_ha_chassis_set_priority(ha_chassis, priority);
+
+    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
+
+    struct nbrec_ha_chassis **new_ha_chs =
+        xmalloc(sizeof *new_ha_chs * (ha_ch_grp->n_ha_chassis + 1));
+    nullable_memcpy(new_ha_chs, ha_ch_grp->ha_chassis,
+                    sizeof *new_ha_chs * ha_ch_grp->n_ha_chassis);
+    new_ha_chs[ha_ch_grp->n_ha_chassis] =
+        CONST_CAST(struct nbrec_ha_chassis *, ha_chassis);
+    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_chs,
+                                          ha_ch_grp->n_ha_chassis + 1);
+    free(new_ha_chs);
+}
+
+static void
+cmd_ha_ch_grp_remove_chassis(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+    size_t idx = 0;
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            idx = i;
+            break;
+        }
+    }
+
+    if (!ha_chassis) {
+        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
+                               "chassis group", chassis_name, ctx->argv[1]);
+        return;
+    }
+
+    struct nbrec_ha_chassis **new_ha_ch
+        = xmemdup(ha_ch_grp->ha_chassis,
+                  sizeof *new_ha_ch * ha_ch_grp->n_ha_chassis);
+    new_ha_ch[idx] = new_ha_ch[ha_ch_grp->n_ha_chassis - 1];
+    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
+    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_ch,
+                                          ha_ch_grp->n_ha_chassis - 1);
+    free(new_ha_ch);
+    nbrec_ha_chassis_delete(ha_chassis);
+}
+
+static void
+cmd_ha_ch_grp_set_chassis_prio(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    int64_t priority;
+    char *error = parse_priority(ctx->argv[3], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            break;
+        }
+    }
+
+    if (!ha_chassis) {
+        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
+                               "chassis group", chassis_name, ctx->argv[1]);
+        return;
+    }
+
+    nbrec_ha_chassis_set_priority(ha_chassis, priority);
+}
+
 static const struct ctl_table_class tables[NBREC_N_TABLES] = {
     [NBREC_TABLE_DHCP_OPTIONS].row_ids
     = {{&nbrec_logical_switch_port_col_name, NULL,
@@ -4790,6 +4994,9 @@  static const struct ctl_table_class tables[NBREC_N_TABLES] = {
     = {&nbrec_port_group_col_name, NULL, NULL},
 
     [NBREC_TABLE_ACL].row_ids[0] = {&nbrec_acl_col_name, NULL, NULL},
+
+    [NBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
+    = {&nbrec_ha_chassis_group_col_name, NULL, NULL},
 };
 
 static char *
@@ -5230,6 +5437,20 @@  static const struct ctl_command_syntax nbctl_commands[] = {
     {"pg-set-ports", 2, INT_MAX, "", NULL, cmd_pg_set_ports, NULL, "", RW },
     {"pg-del", 1, 1, "", NULL, cmd_pg_del, NULL, "", RW },
 
+    /* HA chassis group commands. */
+    {"ha-chassis-group-add", 1, 1, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_add, NULL, "", RW },
+    {"ha-chassis-group-del", 1, 1, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_del, NULL, "", RW },
+    {"ha-chassis-group-list", 0, 0, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_list, NULL, "", RO },
+    {"ha-chassis-group-add-chassis", 3, 3, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_add_chassis, NULL, "", RW },
+    {"ha-chassis-group-remove-chassis", 2, 2, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_remove_chassis, NULL, "", RW },
+    {"ha-chassis-group-set-chassis-prio", 3, 3, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_set_chassis_prio, NULL, "", RW },
+
     {NULL, 0, 0, NULL, NULL, NULL, NULL, "", RO},
 };
 
diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
index ee97a4710..c5ff9318f 100644
--- a/ovn/utilities/ovn-sbctl.c
+++ b/ovn/utilities/ovn-sbctl.c
@@ -1189,6 +1189,12 @@  static const struct ctl_table_class tables[SBREC_N_TABLES] = {
 
     [SBREC_TABLE_ADDRESS_SET].row_ids[0]
     = {&sbrec_address_set_col_name, NULL, NULL},
+
+    [SBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
+    = {&sbrec_ha_chassis_group_col_name, NULL, NULL},
+
+    [SBREC_TABLE_HA_CHASSIS].row_ids[0]
+    = {&sbrec_ha_chassis_col_chassis, NULL, NULL},
 };
 
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 1878eb2df..b4e8995be 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -34,6 +34,33 @@  AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
 AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_port="cr-alice" | grep $gwc2_uuid | wc -l], [0], [1
 ])
 
+# There should be one ha_chassis_group with the name "alice"
+ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
+ha_chassis_group name="alice"`
+
+AT_CHECK([test $ha_chassi_grp_name = alice])
+
+ha_chgrp_uuid=`ovn-sbctl --bare --columns _uuid find ha_chassis_group name=alice`
+
+AT_CHECK([ovn-sbctl --bare --columns ha_chassis_group find port_binding \
+logical_port="cr-alice" | grep $ha_chgrp_uuid | wc -l], [0], [1
+])
+
+ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
+# Trim the spaces.
+ha_ch=`echo $ha_ch | sed 's/ //g'`
+
+ha_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
+do
+    ha_ch_list="$ha_ch_list $i"
+done
+
+# Trim the spaces.
+ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
+
+AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
+
 # delete the 2nd Gateway_Chassis on NBDB for alice port
 
 ovn-nbctl --wait=sb set Logical_Router_Port alice gateway_chassis=${nb_gwc1_uuid}
@@ -45,6 +72,10 @@  AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
 
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
 
+# There should be only 1 row in ha_chassis SB DB table.
+AT_CHECK([ovn-sbctl --bare --columns _uuid list ha_chassis | wc -l], [0], [1
+])
+
 # delete all the gateway_chassis on NBDB for alice port
 
 ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
@@ -54,6 +85,12 @@  ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw1], [0], [])
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
 
+# expect that the ha_chassis doesn't exist anymore
+AT_CHECK([ovn-sbctl list ha_chassis | wc -l], [0], [0
+])
+AT_CHECK([ovn-sbctl list ha_chassis_group | wc -l], [0], [0
+])
+
 AT_CLEANUP
 
 AT_SETUP([ovn -- check Gateway_Chassis propagation from NBDB to SBDB backwards compatibility])
@@ -301,3 +338,228 @@  as northd
 OVS_APP_EXIT_AND_WAIT([ovn-northd])
 
 AT_CLEANUP
+
+AT_SETUP([ovn -- check HA_Chassis_Group propagation from NBDB to SBDB])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+ovn-nbctl --wait=sb ha-chassis-group-add hagrp1
+
+# ovn-northd should not create HA chassis group and HA chassis rows
+# unless the HA chassis group in OVN NB DB is associated to
+# a logical router port. ovn-northd still doesn't support
+# associating a HA chassis group to a logical router port.
+AT_CHECK([ovn-sbctl --bare --columns name find ha_chassis_group name="hagrp1" \
+| wc -l], [0], [0
+])
+
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch1 30
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch2 20
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch3 10
+
+# There should be no HA_Chassis rows in SB DB.
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep -v '-' | wc -l ], [0], [0
+])
+
+# Add chassis ch1.
+ovn-sbctl chassis-add ch1 geneve 127.0.0.2
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl list chassis | grep ch1 | wc -l`])
+
+# There should be no HA_Chassis rows
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep -v '-' | wc -l ], [0], [0
+])
+
+ovn-nbctl ha-chassis-group-del hagrp1
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
+
+# Create a logical router port and attach Gateway chassis.
+# ovn-northd should create HA chassis group rows in SB DB.
+ovn-nbctl lr-add lr0
+
+ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
+ovn-nbctl lrp-set-gateway-chassis lr0-public ch1 20
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns name find ha_chassis_group \
+name="lr0-public" | wc -l`])
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns _uuid \
+find ha_chassis | wc -l`])
+
+ovn-nbctl lrp-set-gateway-chassis lr0-public ch2 10
+
+ovn-sbctl --bare --columns _uuid find ha_chassis
+OVS_WAIT_UNTIL([test 2 = `ovn-sbctl list ha_chassis | grep chassis | wc -l`])
+
+# Test if 'ref_chassis' column is properly set or not in
+# SB DB ha_chassis_group.
+ovn-nbctl ls-add sw0
+ovn-nbctl lsp-add sw0 sw0-p1
+
+ovn-sbctl chassis-add ch2 geneve 127.0.0.3
+ovn-sbctl chassis-add ch3 geneve 127.0.0.4
+ovn-sbctl chassis-add comp1 geneve 127.0.0.5
+ovn-sbctl chassis-add comp2 geneve 127.0.0.6
+
+ovn-nbctl lrp-add lr0 lr0-sw0 00:00:20:20:12:14 10.0.0.1/24
+ovn-nbctl lsp-add sw0 sw0-lr0
+ovn-nbctl lsp-set-type sw0-lr0 router
+ovn-nbctl lsp-set-addresses sw0-lr0 router
+ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
+
+ovn-sbctl lsp-bind sw0-p1 comp1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xup])
+
+comp1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
+comp2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp2"`
+ch2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
+
+echo "comp1_ch_uuid = $comp1_ch_uuid"
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp1_ch_uuid" = "$ref_ch_list"])
+
+# unbind sw0-p1
+ovn-sbctl lsp-unbind sw0-p1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "" = "$ref_ch_list"])
+
+# Bind sw0-p1 in comp2
+ovn-sbctl lsp-bind sw0-p1 comp2
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+ovn-nbctl ls-add sw1
+ovn-nbctl lsp-add sw1 sw1-p1
+ovn-nbctl lr-add lr1
+ovn-nbctl lrp-add lr1 lr1-sw1 00:00:20:20:12:15 20.0.0.1/24
+ovn-nbctl lsp-add sw1 sw1-lr1
+ovn-nbctl lsp-set-type sw1-lr1 router
+ovn-nbctl lsp-set-addresses sw1-lr1 router
+ovn-nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1
+
+# Bind sw1-p1 in comp1.
+ovn-sbctl lsp-bind sw1-p1 comp1
+# Wait until sw1-p1 is up
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xup])
+
+# sw1-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Now attach sw0 to lr1
+ovn-nbctl lrp-add lr1 lr1-sw0 00:00:20:20:12:16 10.0.0.10/24
+ovn-nbctl lsp-add sw0 sw0-lr1
+ovn-nbctl lsp-set-type sw0-lr1 router
+ovn-nbctl lsp-set-addresses sw0-lr1 router
+ovn-nbctl lsp-set-options sw0-lr1 router-port=lr1-sw0
+
+# Both comp1 and comp2 should be in 'ref_chassis' as sw1 is indirectly
+# connected to lr0
+exp_ref_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
+do
+    if test $i = $comp1_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    elif test $i = $comp2_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    fi
+done
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Unind sw1-p1. comp2 should not be in the ref_chassis.
+ovn-sbctl lsp-unbind sw1-p1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Create sw2 and attach it to lr2
+ovn-nbctl ls-add sw2
+ovn-nbctl lsp-add sw2 sw2-p1
+ovn-nbctl lr-add lr2
+ovn-nbctl lrp-add lr2 lr2-sw2 00:00:20:20:12:17 30.0.0.1/24
+ovn-nbctl lsp-add sw2 sw2-lr2
+ovn-nbctl lsp-set-type sw2-lr2 router
+ovn-nbctl lsp-set-addresses sw2-lr2 router
+ovn-nbctl lsp-set-options sw2-lr2 router-port=lr2-sw2
+
+# Bind sw2-p1 to comp1
+ovn-sbctl lsp-bind sw2-p1 comp1
+# Wait until sw2-p1 is up
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw2-p1` = xup])
+
+# sw2-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Now attach sw1 to lr2. With this sw2-p1 is indirectly connected to lr0.
+ovn-nbctl lrp-add lr2 lr2-sw1 00:00:20:20:12:18 20.0.0.10/24
+ovn-nbctl lsp-add sw1 sw1-lr2
+ovn-nbctl lsp-set-type sw1-lr2 router
+ovn-nbctl lsp-set-addresses sw1-lr2 router
+ovn-nbctl lsp-set-options sw1-lr2 router-port=lr2-sw1
+
+# sw2-p1 is indirectly connected to lr0. So comp1 (and comp2) should be in
+# 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Create sw0-p2 and bind it to comp1
+ovn-nbctl lsp-add sw0 sw0-p2
+ovn-sbctl lsp-bind sw0-p2 comp1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xup])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# unbind sw0-p2
+ovn-sbctl lsp-unbind sw0-p2
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Delete lr1-sw0. comp1 should be deleted from ref_chassis as there is no link
+# from sw1 and sw2 to lr0.
+ovn-nbctl lrp-del lr1-sw0
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+AT_CLEANUP
diff --git a/tests/ovn.at b/tests/ovn.at
index ec79651bd..dedbadd1b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8386,6 +8386,16 @@  as ext1 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
 
 AT_CHECK([ovn-nbctl --timeout=3 --wait=sb sync], [0], [ignore])
 
+# hv1 should be in 'ref_chassis' of the ha_chasssi_group as logical
+# switch 'foo' can reach the router 'R1' (which has gw router port)
+# via foo1 -> foo -> R0 -> join -> R1
+hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$hv1_ch_uuid" = "$ref_ch_list"])
+
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
 sleep 2
@@ -9754,6 +9764,32 @@  echo "------ Port_Binding chassisredirect -------"
 ovn-sbctl find Port_Binding type=chassisredirect
 echo "-------------------------------------------"
 
+# There should be one ha_chassis_group with the name "outside"
+ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
+ha_chassis_group name="outside"`
+
+AT_CHECK([test $ha_chassi_grp_name = outside])
+
+# There should be 2 ha_chassis rows in SB DB.
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep '-' | wc -l ], [0], [2
+])
+
+ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
+# Trim the spaces.
+ha_ch=`echo $ha_ch | sed 's/ //g'`
+
+ha_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
+do
+    ha_ch_list="$ha_ch_list $i"
+done
+
+# Trim the spaces.
+ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
+
+AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
+
 for chassis in gw1 gw2 hv1 hv2; do
     as $chassis
     echo "------ $chassis dump ----------"
@@ -9798,33 +9834,56 @@  as hv2 ovs-ofctl dump-flows br-int table=32
 gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport \
+| wc -l], [0], [1
 ])
 
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport \
+| wc -l], [0], [1
 ])
 
-sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
-
 # make sure that flows for handling the outside router port reside on gw1
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # make sure ARP responder flows for outside router port reside on gw1 too
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=9 | \
+grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
 ]])
 
-
-
 # check that the chassis redirect port has been claimed by the gw1 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
 ]])
 
+hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
+hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"`
+
+exp_ref_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
+do
+    if test $i = $hv1_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    elif test $i = $hv2_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    fi
+done
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
 
 # at this point, we invert the priority of the gw chassis between gw1 and gw2
 
@@ -9839,15 +9898,19 @@  ovn-nbctl --id=@gc0 create Gateway_Chassis \
 ovn-nbctl --wait=hv --timeout=3 sync
 
 # we make sure that the hypervisors noticed, and inverted the slave ports
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport \
+| wc -l], [0], [1
 ])
 
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport \
+| wc -l], [0], [1
 ])
 
 # check that the chassis redirect port has been reclaimed by the gw2 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw2_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw2_chassis | wc -l], [0],[[1
 ]])
 
 # check BFD enablement on tunnel ports from gw1 #########
@@ -9896,31 +9959,32 @@  AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
          [[
 ]])
 
-sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
-
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # disconnect GW2 from the network, GW1 should take over
 as gw2
 port=${sandbox}_br-phys
 as main ovs-vsctl del-port n1 $port
-sleep 4
 
 bfd_dump
 
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # check that the chassis redirect port has been reclaimed by the gw1 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
 ]])
 
 ovn-nbctl --wait=hv set NB_Global . options:"bfd-min-rx"=2000
@@ -9950,6 +10014,37 @@  for chassis in gw1 hv1 hv2; do
 ])
 done
 
+# Delete the inside1 vif. The ref_chassis in ha_chassis_group shouldn't have
+# reference to hv1.
+as hv1 ovs-vsctl del-port hv1-vif1
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$hv2_ch_uuid" = "$ref_ch_list"])
+
+# Delete the inside2 vif.
+ovn-sbctl show
+
+echo "Deleting hv2-vif1"
+as hv2 ovs-vsctl del-port hv2-vif1
+
+# ref_chassis of ha_chassis_group should be empty
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     exp_ref_ch_list=""
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Delete the Gateway_Chassis for lrp - outside
+ovn-nbctl clear Logical_Router_Port outside gateway_chassis
+
+# There shoud be no ha_chassis_group rows in SB DB.
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
+
 OVN_CLEANUP([gw1],[gw2],[hv1],[hv2])
 
 AT_CLEANUP
@@ -10186,10 +10281,7 @@  ovn-nbctl --wait=hv --timeout=3 sync
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 ovn-sbctl destroy Chassis $gw2_chassis
 
-# Ensure ovn-controller has processed latest sbdb update
-# ovn-nbctl --wait=hv sync
-
-AT_CHECK([grep "Releasing lport" gw1/ovn-controller.log], [1], [])
+OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
 
 OVN_CLEANUP([gw1],[gw2],[hv1])