[ovs-dev,v4,2/5] ovn: Add generic HA chassis group
diff mbox series

Message ID 20190314193147.24444-1-nusiddiq@redhat.com
State Superseded
Headers show
Series
  • ovn: Add HA chassis group and 'external' port
Related show

Commit Message

Numan Siddique March 14, 2019, 7:31 p.m. UTC
From: Numan Siddique <nusiddiq@redhat.com>

This patch adds the tables - 'HA_Chassis_Group' and 'HA_Chassis' in
both OVN Northbound and Southbound DBs to support generic HA Chassis
groups in OVN. CMS can create a group of HA chassis with the priorities
assigned to each chassis in the group. An HA chassis group can be associated to
a distributed logical router port. An upcoming patch will make
use of it while supporting  'external'* logical ports.

HA chassis group is similar to the existing gateway chassis support in
OVN which is used by the distributed gateway router ports.
This patch tries to abstract this so that, the HA chassis support
can be leveraged by not just distributed gateway router ports.

If a logical router port has a set of gateway chassis associated to
it, ovn-northd will create HA chassis group in Southbound
DB and add these gateway chassis to this group. ovn-northd would still create
gateway chassis in Southbound DB as ovn-controller still doesn't support
using the HA chassis group.

Next patch in the series will add the support in ovn-controller to
make use of HA chassis group instead of gateway chassis. The patch following
that will delete creation of gateway chassis in Southbound DB.

HA_Chasss_Group table in Southbound DB has a column - 'ref_chassis'.
This column is used to store the list of chassis which references the
HA chassis group. This information will be used by ovn-controller in an
upcoming patch to establish BFD sessions with the required chassis.

Suppose if there is an HA chassis group - 'hagrp1' in the Southbound
DB and it has HA chasiss list - ha1, ha2 and ha3 and this HA chassis
group is used by a distributed logical router port, then ovn-northd
will update the 'ref_chassis' with the list of chassis which has claimed
all the logical switch ports which are connected to the logical router
which has this distributed logical router port.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---
 ovn/lib/chassis-index.c       |  26 ++
 ovn/lib/chassis-index.h       |   4 +
 ovn/northd/ovn-northd.c       | 449 ++++++++++++++++++++++++++++++++--
 ovn/ovn-nb.ovsschema          |  36 ++-
 ovn/ovn-nb.xml                |  75 ++++++
 ovn/ovn-sb.ovsschema          |  43 +++-
 ovn/ovn-sb.xml                |  63 +++++
 ovn/utilities/ovn-nbctl.8.xml |  41 ++++
 ovn/utilities/ovn-nbctl.c     | 221 +++++++++++++++++
 ovn/utilities/ovn-sbctl.c     |   6 +
 tests/ovn-northd.at           | 262 ++++++++++++++++++++
 tests/ovn.at                  | 150 +++++++++---
 12 files changed, 1314 insertions(+), 62 deletions(-)

Comments

0-day Robot March 14, 2019, 8:24 p.m. UTC | #1
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
#1180 FILE: ovn/utilities/ovn-nbctl.c:698:
  ha-chassis-group-add GRP  Create an HA chassis group GRP\n\

WARNING: Line lacks whitespace around operator
#1181 FILE: ovn/utilities/ovn-nbctl.c:699:
  ha-chassis-group-del GRP  Delete the HA chassis group GRP\n\

WARNING: Line lacks whitespace around operator
#1182 FILE: ovn/utilities/ovn-nbctl.c:700:
  ha-chassis-group-list     List the HA chassis groups\n\

WARNING: Line lacks whitespace around operator
#1183 FILE: ovn/utilities/ovn-nbctl.c:701:
  ha-chassis-group-add-chassis GRP CHASSIS [PRIORITY] Adds an HA\

WARNING: Line lacks whitespace around operator
#1185 FILE: ovn/utilities/ovn-nbctl.c:703:
  ha-chassis-group-del-chassis GRP CHASSIS Deletes the HA chassis\

Lines checked: 1974, Warnings: 5, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
Han Zhou March 26, 2019, 2:37 a.m. UTC | #2
Mostly looks good to me, just minor comments.

On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
>
> +static void
> +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
> +{
> +    ovs_assert((od && od->nbr && od->lr_group));
> +
> +    if (od->l3dgw_port && od->l3redirect_port) {
> +        /* It's a logical router with gateway port. If it
> +         * has HA_Chassis_Group associated to it in SB DB, then store the
> +         * ha chassis group name. */
> +        if (od->l3redirect_port->sb->ha_chassis_group) {
> +            sset_add(&od->lr_group->ha_chassis_groups,
> +                     od->l3redirect_port->sb->ha_chassis_group->name);
> +        }
> +    }
> +
> +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
> +        struct ovn_port *router_port =
> +            ovn_port_find(ports, od->nbr->ports[i]->name);
> +
> +        if (!router_port || !router_port->peer) {
> +            continue;
> +        }
> +
> +        /* Get the peer logical switch/logical router datapath. */
> +        struct ovn_datapath *peer_dp = router_port->peer->od;
> +        if (peer_dp->nbr) {
> +            if (!peer_dp->lr_group) {
> +                peer_dp->lr_group = od->lr_group;
> +                od->lr_group->router_dps[od->lr_group->n_router_dps++]
> +                    = peer_dp;
> +                build_lrouter_groups__(ports, peer_dp);
> +            }
> +        } else {
> +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
> +                if (!peer_dp->router_ports[j]->peer) {
> +                    /* If there is no peer port connecting to the
> +                    * router datapath, ignore it. */

s/router datapath/router port

> +                    continue;
> +                }
> +

> +static void
> +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
> +{
> +    struct ovn_datapath *od;
> +    size_t n_router_dps = ovs_list_size(lr_list);
> +
> +    LIST_FOR_EACH (od, lr_list, lr_list) {
> +        if (!od->lr_group) {
> +            od->lr_group = xzalloc(sizeof *od->lr_group);
> +            /* Each logical router group can have max
> +             * 'n_router_dps'. So allocate enough memory. */
> +            od->lr_group->router_dps = xcalloc(sizeof *od, n_router_dps);
> +            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;

Here it is more clear to be: od->lr_group->router_dps[0] = od;

> +            od->lr_group->n_router_dps = 1;
> +            sset_init(&od->lr_group->ha_chassis_groups);
> +            build_lrouter_groups__(ports, od);
> +        }
> +    }
> +}
> +

>  static void
> -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
> +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
> +                            struct ovs_list *lr_list)
>  {
> +    struct ovn_datapath *router_dp;
> +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
> +        if (router_dp->lr_group) {
> +            struct lrouter_group *lr_group = router_dp->lr_group;
> +
> +            for (size_t i = 0; i < router_dp->lr_group->n_router_dps; i++) {
> +                if (router_dp->lr_group->router_dps[i] != router_dp) {

This if-condition seems not needed.

> +                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
> +                }
> +            }

s/router_dp->lr_group/lr_group in above for-loop.

> +
> +            free(lr_group->router_dps);
> +            sset_destroy(&lr_group->ha_chassis_groups);
> +            free(lr_group);
> +            router_dp->lr_group = NULL;
> +        }
> +    }
> +

> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> index 10a59649a..48d27b960 100644
> --- a/ovn/ovn-nb.ovsschema
> +++ b/ovn/ovn-nb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>      "name": "OVN_Northbound",
> -    "version": "5.14.1",
> -    "cksum": "3758097843 20509",
> +    "version": "5.15.0",
> +    "cksum": "1078795414 21917",
>      "tables": {
>          "NB_Global": {
>              "columns": {
> @@ -271,6 +271,12 @@
>                                       "refType": "strong"},
>                               "min": 0,
>                               "max": "unlimited"}},
> +                "ha_chassis_group": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis_Group",
> +                                     "refType": "weak"},

Shall this be strong ref instead? In normal case logical ports hosted
by a HA-chassis-group should be deleted before the HA-chassis-group is
removed. (same for SB schema)

> +                             "min": 0,
> +                             "max": 1}},
>                  "options": {
>                      "type": {"key": "string",
>                               "value": "string",
> @@ -392,5 +398,29 @@
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "indexes": [["name"]],
> -            "isRoot": false}}
> +            "isRoot": false},
> +        "HA_Chassis": {
> +            "columns": {
> +                "chassis_name": {"type": "string"},
> +                "priority": {"type": {"key": {"type": "integer",
> +                                              "minInteger": 0,
> +                                              "maxInteger": 32767}}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "isRoot": false},
> +        "HA_Chassis_Group": {
> +            "columns": {
> +                "name": {"type": "string"},
> +                "ha_chassis": {
> +                    "type": {"key": {"type": "uuid",
> +                                     "refTable": "HA_Chassis",
> +                                     "refType": "strong"},

Is it better to be weak ref here, considering that multiple a
HA-chassis can belong to multiple groups? (same for SB schema)

> +                             "min": 0,
> +                             "max": "unlimited"}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "indexes": [["name"]],
> +            "isRoot": true}}
>      }

I will get back to 3/5 - 5/5 reviewing asap.
Numan Siddique March 26, 2019, 4:59 p.m. UTC | #3
Thanks Han for the review. I will address them
Please see comments inline

Thanks
Numan


On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:

> Mostly looks good to me, just minor comments.
>
> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
> >
> > +static void
> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
> > +{
> > +    ovs_assert((od && od->nbr && od->lr_group));
> > +
> > +    if (od->l3dgw_port && od->l3redirect_port) {
> > +        /* It's a logical router with gateway port. If it
> > +         * has HA_Chassis_Group associated to it in SB DB, then store
> the
> > +         * ha chassis group name. */
> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
> > +            sset_add(&od->lr_group->ha_chassis_groups,
> > +                     od->l3redirect_port->sb->ha_chassis_group->name);
> > +        }
> > +    }
> > +
> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
> > +        struct ovn_port *router_port =
> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
> > +
> > +        if (!router_port || !router_port->peer) {
> > +            continue;
> > +        }
> > +
> > +        /* Get the peer logical switch/logical router datapath. */
> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
> > +        if (peer_dp->nbr) {
> > +            if (!peer_dp->lr_group) {
> > +                peer_dp->lr_group = od->lr_group;
> > +                od->lr_group->router_dps[od->lr_group->n_router_dps++]
> > +                    = peer_dp;
> > +                build_lrouter_groups__(ports, peer_dp);
> > +            }
> > +        } else {
> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
> > +                if (!peer_dp->router_ports[j]->peer) {
> > +                    /* If there is no peer port connecting to the
> > +                    * router datapath, ignore it. */
>
> s/router datapath/router port
>
> > +                    continue;
> > +                }
> > +
>
> > +static void
> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
> > +{
> > +    struct ovn_datapath *od;
> > +    size_t n_router_dps = ovs_list_size(lr_list);
> > +
> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
> > +        if (!od->lr_group) {
> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
> > +            /* Each logical router group can have max
> > +             * 'n_router_dps'. So allocate enough memory. */
> > +            od->lr_group->router_dps = xcalloc(sizeof *od,
> n_router_dps);
> > +            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
>
> Here it is more clear to be: od->lr_group->router_dps[0] = od;
>
> > +            od->lr_group->n_router_dps = 1;
> > +            sset_init(&od->lr_group->ha_chassis_groups);
> > +            build_lrouter_groups__(ports, od);
> > +        }
> > +    }
> > +}
> > +
>
> >  static void
> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
> > +                            struct ovs_list *lr_list)
> >  {
> > +    struct ovn_datapath *router_dp;
> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
> > +        if (router_dp->lr_group) {
> > +            struct lrouter_group *lr_group = router_dp->lr_group;
> > +
> > +            for (size_t i = 0; i < router_dp->lr_group->n_router_dps;
> i++) {
> > +                if (router_dp->lr_group->router_dps[i] != router_dp) {
>
> This if-condition seems not needed.
>
> > +                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
> > +                }
> > +            }
>
> s/router_dp->lr_group/lr_group in above for-loop.
>
> > +
> > +            free(lr_group->router_dps);
> > +            sset_destroy(&lr_group->ha_chassis_groups);
> > +            free(lr_group);
> > +            router_dp->lr_group = NULL;
> > +        }
> > +    }
> > +
>
> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> > index 10a59649a..48d27b960 100644
> > --- a/ovn/ovn-nb.ovsschema
> > +++ b/ovn/ovn-nb.ovsschema
> > @@ -1,7 +1,7 @@
> >  {
> >      "name": "OVN_Northbound",
> > -    "version": "5.14.1",
> > -    "cksum": "3758097843 20509",
> > +    "version": "5.15.0",
> > +    "cksum": "1078795414 21917",
> >      "tables": {
> >          "NB_Global": {
> >              "columns": {
> > @@ -271,6 +271,12 @@
> >                                       "refType": "strong"},
> >                               "min": 0,
> >                               "max": "unlimited"}},
> > +                "ha_chassis_group": {
> > +                    "type": {"key": {"type": "uuid",
> > +                                     "refTable": "HA_Chassis_Group",
> > +                                     "refType": "weak"},
>
> Shall this be strong ref instead? In normal case logical ports hosted
> by a HA-chassis-group should be deleted before the HA-chassis-group is
> removed. (same for SB schema)
>

I would prefer weak ref because it would be easier for CMS to manage the
references.
It doesn't need to keep track of the ref count before deleting the
HA_Chassis_Group row.



>
> > +                             "min": 0,
> > +                             "max": 1}},
> >                  "options": {
> >                      "type": {"key": "string",
> >                               "value": "string",
> > @@ -392,5 +398,29 @@
> >                      "type": {"key": "string", "value": "string",
> >                               "min": 0, "max": "unlimited"}}},
> >              "indexes": [["name"]],
> > -            "isRoot": false}}
> > +            "isRoot": false},
> > +        "HA_Chassis": {
> > +            "columns": {
> > +                "chassis_name": {"type": "string"},
> > +                "priority": {"type": {"key": {"type": "integer",
> > +                                              "minInteger": 0,
> > +                                              "maxInteger": 32767}}},
> > +                "external_ids": {
> > +                    "type": {"key": "string", "value": "string",
> > +                             "min": 0, "max": "unlimited"}}},
> > +            "isRoot": false},
> > +        "HA_Chassis_Group": {
> > +            "columns": {
> > +                "name": {"type": "string"},
> > +                "ha_chassis": {
> > +                    "type": {"key": {"type": "uuid",
> > +                                     "refTable": "HA_Chassis",
> > +                                     "refType": "strong"},
>
> Is it better to be weak ref here, considering that multiple a
> HA-chassis can belong to multiple groups? (same for SB schema)
>

If you see HA_Chassis is non root. and hence I think it should be strong
ref.
And I think it would be easier for CMS to manage. Deleting HA_Chassis_Group
will delete its
ha chassis list.

I think it would complicate the code in ovn-northd when syncing with the SB
DB if HA_Chassis is
root and it is referenced by multiple HA_Chassis_Group rows.

So I would prefer HA_Chassis to be non root and HA_Chassis_Group has a
strong ref to HA_Chassis table.

Thanks
Numan


>
> > +                             "min": 0,
> > +                             "max": "unlimited"}},
> > +                "external_ids": {
> > +                    "type": {"key": "string", "value": "string",
> > +                             "min": 0, "max": "unlimited"}}},
> > +            "indexes": [["name"]],
> > +            "isRoot": true}}
> >      }
>
> I will get back to 3/5 - 5/5 reviewing asap.
>
Han Zhou March 26, 2019, 5:51 p.m. UTC | #4
On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
> Thanks Han for the review. I will address them
> Please see comments inline
>
> Thanks
> Numan
>
>
> On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> Mostly looks good to me, just minor comments.
>>
>> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
>> >
>> > +static void
>> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
>> > +{
>> > +    ovs_assert((od && od->nbr && od->lr_group));
>> > +
>> > +    if (od->l3dgw_port && od->l3redirect_port) {
>> > +        /* It's a logical router with gateway port. If it
>> > +         * has HA_Chassis_Group associated to it in SB DB, then store the
>> > +         * ha chassis group name. */
>> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
>> > +            sset_add(&od->lr_group->ha_chassis_groups,
>> > +                     od->l3redirect_port->sb->ha_chassis_group->name);
>> > +        }
>> > +    }
>> > +
>> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
>> > +        struct ovn_port *router_port =
>> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
>> > +
>> > +        if (!router_port || !router_port->peer) {
>> > +            continue;
>> > +        }
>> > +
>> > +        /* Get the peer logical switch/logical router datapath. */
>> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
>> > +        if (peer_dp->nbr) {
>> > +            if (!peer_dp->lr_group) {
>> > +                peer_dp->lr_group = od->lr_group;
>> > +                od->lr_group->router_dps[od->lr_group->n_router_dps++]
>> > +                    = peer_dp;
>> > +                build_lrouter_groups__(ports, peer_dp);
>> > +            }
>> > +        } else {
>> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
>> > +                if (!peer_dp->router_ports[j]->peer) {
>> > +                    /* If there is no peer port connecting to the
>> > +                    * router datapath, ignore it. */
>>
>> s/router datapath/router port
>>
>> > +                    continue;
>> > +                }
>> > +
>>
>> > +static void
>> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
>> > +{
>> > +    struct ovn_datapath *od;
>> > +    size_t n_router_dps = ovs_list_size(lr_list);
>> > +
>> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
>> > +        if (!od->lr_group) {
>> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
>> > +            /* Each logical router group can have max
>> > +             * 'n_router_dps'. So allocate enough memory. */
>> > +            od->lr_group->router_dps = xcalloc(sizeof *od, n_router_dps);
>> > +            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
>>
>> Here it is more clear to be: od->lr_group->router_dps[0] = od;
>>
>> > +            od->lr_group->n_router_dps = 1;
>> > +            sset_init(&od->lr_group->ha_chassis_groups);
>> > +            build_lrouter_groups__(ports, od);
>> > +        }
>> > +    }
>> > +}
>> > +
>>
>> >  static void
>> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
>> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
>> > +                            struct ovs_list *lr_list)
>> >  {
>> > +    struct ovn_datapath *router_dp;
>> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
>> > +        if (router_dp->lr_group) {
>> > +            struct lrouter_group *lr_group = router_dp->lr_group;
>> > +
>> > +            for (size_t i = 0; i < router_dp->lr_group->n_router_dps; i++) {
>> > +                if (router_dp->lr_group->router_dps[i] != router_dp) {
>>
>> This if-condition seems not needed.
>>
>> > +                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
>> > +                }
>> > +            }
>>
>> s/router_dp->lr_group/lr_group in above for-loop.
>>
>> > +
>> > +            free(lr_group->router_dps);
>> > +            sset_destroy(&lr_group->ha_chassis_groups);
>> > +            free(lr_group);
>> > +            router_dp->lr_group = NULL;
>> > +        }
>> > +    }
>> > +
>>
>> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>> > index 10a59649a..48d27b960 100644
>> > --- a/ovn/ovn-nb.ovsschema
>> > +++ b/ovn/ovn-nb.ovsschema
>> > @@ -1,7 +1,7 @@
>> >  {
>> >      "name": "OVN_Northbound",
>> > -    "version": "5.14.1",
>> > -    "cksum": "3758097843 20509",
>> > +    "version": "5.15.0",
>> > +    "cksum": "1078795414 21917",
>> >      "tables": {
>> >          "NB_Global": {
>> >              "columns": {
>> > @@ -271,6 +271,12 @@
>> >                                       "refType": "strong"},
>> >                               "min": 0,
>> >                               "max": "unlimited"}},
>> > +                "ha_chassis_group": {
>> > +                    "type": {"key": {"type": "uuid",
>> > +                                     "refTable": "HA_Chassis_Group",
>> > +                                     "refType": "weak"},
>>
>> Shall this be strong ref instead? In normal case logical ports hosted
>> by a HA-chassis-group should be deleted before the HA-chassis-group is
>> removed. (same for SB schema)
>
>
> I would prefer weak ref because it would be easier for CMS to manage the references.
> It doesn't need to keep track of the ref count before deleting the HA_Chassis_Group row.
>

Sorry that I don't understand why it requires ref count. With strong
reference, it just prevents operators to delete HA_Chassis_Group by
mistake (i.e. when they are still used by routers). CMS can just
simply return an error in this case, without any ref counter
maintenance. Otherwise, if it is weak reference, an operator can
delete HA_Chassis_Group even if it is still in use, which causes the
related logical router ports being converted from centralized to
distributed, implicitly, which I think is not good.

>
>>
>>
>> > +                             "min": 0,
>> > +                             "max": 1}},
>> >                  "options": {
>> >                      "type": {"key": "string",
>> >                               "value": "string",
>> > @@ -392,5 +398,29 @@
>> >                      "type": {"key": "string", "value": "string",
>> >                               "min": 0, "max": "unlimited"}}},
>> >              "indexes": [["name"]],
>> > -            "isRoot": false}}
>> > +            "isRoot": false},
>> > +        "HA_Chassis": {
>> > +            "columns": {
>> > +                "chassis_name": {"type": "string"},
>> > +                "priority": {"type": {"key": {"type": "integer",
>> > +                                              "minInteger": 0,
>> > +                                              "maxInteger": 32767}}},
>> > +                "external_ids": {
>> > +                    "type": {"key": "string", "value": "string",
>> > +                             "min": 0, "max": "unlimited"}}},
>> > +            "isRoot": false},
>> > +        "HA_Chassis_Group": {
>> > +            "columns": {
>> > +                "name": {"type": "string"},
>> > +                "ha_chassis": {
>> > +                    "type": {"key": {"type": "uuid",
>> > +                                     "refTable": "HA_Chassis",
>> > +                                     "refType": "strong"},
>>
>> Is it better to be weak ref here, considering that multiple a
>> HA-chassis can belong to multiple groups? (same for SB schema)
>
>
> If you see HA_Chassis is non root. and hence I think it should be strong ref.
> And I think it would be easier for CMS to manage. Deleting HA_Chassis_Group will delete its
> ha chassis list.

"Deleting HA_Chassis_Group will delete its ha chassis list." => This
is the behavior of *weak* reference, right? It seems we have reverse
understanding about strong v.s. weak references :)

>
> I think it would complicate the code in ovn-northd when syncing with the SB DB if HA_Chassis is
> root and it is referenced by multiple HA_Chassis_Group rows.
>
> So I would prefer HA_Chassis to be non root and HA_Chassis_Group has a strong ref to HA_Chassis table.
>
> Thanks
> Numan
>
>>
>>
>> > +                             "min": 0,
>> > +                             "max": "unlimited"}},
>> > +                "external_ids": {
>> > +                    "type": {"key": "string", "value": "string",
>> > +                             "min": 0, "max": "unlimited"}}},
>> > +            "indexes": [["name"]],
>> > +            "isRoot": true}}
>> >      }
>>
>> I will get back to 3/5 - 5/5 reviewing asap.
Numan Siddique March 26, 2019, 6:08 p.m. UTC | #5
On Tue, Mar 26, 2019 at 11:21 PM Han Zhou <zhouhan@gmail.com> wrote:

> On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >
> >
> > Thanks Han for the review. I will address them
> > Please see comments inline
> >
> > Thanks
> > Numan
> >
> >
> > On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> Mostly looks good to me, just minor comments.
> >>
> >> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
> >> >
> >> > +static void
> >> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
> >> > +{
> >> > +    ovs_assert((od && od->nbr && od->lr_group));
> >> > +
> >> > +    if (od->l3dgw_port && od->l3redirect_port) {
> >> > +        /* It's a logical router with gateway port. If it
> >> > +         * has HA_Chassis_Group associated to it in SB DB, then
> store the
> >> > +         * ha chassis group name. */
> >> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
> >> > +            sset_add(&od->lr_group->ha_chassis_groups,
> >> > +
>  od->l3redirect_port->sb->ha_chassis_group->name);
> >> > +        }
> >> > +    }
> >> > +
> >> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
> >> > +        struct ovn_port *router_port =
> >> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
> >> > +
> >> > +        if (!router_port || !router_port->peer) {
> >> > +            continue;
> >> > +        }
> >> > +
> >> > +        /* Get the peer logical switch/logical router datapath. */
> >> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
> >> > +        if (peer_dp->nbr) {
> >> > +            if (!peer_dp->lr_group) {
> >> > +                peer_dp->lr_group = od->lr_group;
> >> > +
> od->lr_group->router_dps[od->lr_group->n_router_dps++]
> >> > +                    = peer_dp;
> >> > +                build_lrouter_groups__(ports, peer_dp);
> >> > +            }
> >> > +        } else {
> >> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
> >> > +                if (!peer_dp->router_ports[j]->peer) {
> >> > +                    /* If there is no peer port connecting to the
> >> > +                    * router datapath, ignore it. */
> >>
> >> s/router datapath/router port
> >>
> >> > +                    continue;
> >> > +                }
> >> > +
> >>
> >> > +static void
> >> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
> >> > +{
> >> > +    struct ovn_datapath *od;
> >> > +    size_t n_router_dps = ovs_list_size(lr_list);
> >> > +
> >> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
> >> > +        if (!od->lr_group) {
> >> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
> >> > +            /* Each logical router group can have max
> >> > +             * 'n_router_dps'. So allocate enough memory. */
> >> > +            od->lr_group->router_dps = xcalloc(sizeof *od,
> n_router_dps);
> >> > +            od->lr_group->router_dps[od->lr_group->n_router_dps] =
> od;
> >>
> >> Here it is more clear to be: od->lr_group->router_dps[0] = od;
> >>
> >> > +            od->lr_group->n_router_dps = 1;
> >> > +            sset_init(&od->lr_group->ha_chassis_groups);
> >> > +            build_lrouter_groups__(ports, od);
> >> > +        }
> >> > +    }
> >> > +}
> >> > +
> >>
> >> >  static void
> >> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap
> *ports)
> >> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap
> *ports,
> >> > +                            struct ovs_list *lr_list)
> >> >  {
> >> > +    struct ovn_datapath *router_dp;
> >> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
> >> > +        if (router_dp->lr_group) {
> >> > +            struct lrouter_group *lr_group = router_dp->lr_group;
> >> > +
> >> > +            for (size_t i = 0; i <
> router_dp->lr_group->n_router_dps; i++) {
> >> > +                if (router_dp->lr_group->router_dps[i] != router_dp)
> {
> >>
> >> This if-condition seems not needed.
> >>
> >> > +                    router_dp->lr_group->router_dps[i]->lr_group =
> NULL;
> >> > +                }
> >> > +            }
> >>
> >> s/router_dp->lr_group/lr_group in above for-loop.
> >>
> >> > +
> >> > +            free(lr_group->router_dps);
> >> > +            sset_destroy(&lr_group->ha_chassis_groups);
> >> > +            free(lr_group);
> >> > +            router_dp->lr_group = NULL;
> >> > +        }
> >> > +    }
> >> > +
> >>
> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> >> > index 10a59649a..48d27b960 100644
> >> > --- a/ovn/ovn-nb.ovsschema
> >> > +++ b/ovn/ovn-nb.ovsschema
> >> > @@ -1,7 +1,7 @@
> >> >  {
> >> >      "name": "OVN_Northbound",
> >> > -    "version": "5.14.1",
> >> > -    "cksum": "3758097843 20509",
> >> > +    "version": "5.15.0",
> >> > +    "cksum": "1078795414 21917",
> >> >      "tables": {
> >> >          "NB_Global": {
> >> >              "columns": {
> >> > @@ -271,6 +271,12 @@
> >> >                                       "refType": "strong"},
> >> >                               "min": 0,
> >> >                               "max": "unlimited"}},
> >> > +                "ha_chassis_group": {
> >> > +                    "type": {"key": {"type": "uuid",
> >> > +                                     "refTable": "HA_Chassis_Group",
> >> > +                                     "refType": "weak"},
> >>
> >> Shall this be strong ref instead? In normal case logical ports hosted
> >> by a HA-chassis-group should be deleted before the HA-chassis-group is
> >> removed. (same for SB schema)
> >
> >
> > I would prefer weak ref because it would be easier for CMS to manage the
> references.
> > It doesn't need to keep track of the ref count before deleting the
> HA_Chassis_Group row.
> >
>
> Sorry that I don't understand why it requires ref count. With strong
> reference, it just prevents operators to delete HA_Chassis_Group by
> mistake (i.e. when they are still used by routers). CMS can just
> simply return an error in this case, without any ref counter
> maintenance. Otherwise, if it is weak reference, an operator can
> delete HA_Chassis_Group even if it is still in use, which causes the
> related logical router ports being converted from centralized to
> distributed, implicitly, which I think is not good.
>
>
Agree that operator can delete HA_Chassis_Group by mistake even if
logical ports/logical router ports are still referencing the
HA_Chassis_Group.

What I mean is, CMS cannot delete the HA_Chassis_Group rows unless
it clears all the references to HA_Chassis_Group.

In a recent commit, we changed the logical switches and logical routers
referencing the Load_Balancer to weak mainly because of this.
 -
https://github.com/openvswitch/ovs/commit/612f80fa8ebf88dad2e204364c6c02b451dca36c

That's why I thought to have a weak reference.
You still think its good to change to strong ?




> >
> >>
> >>
> >> > +                             "min": 0,
> >> > +                             "max": 1}},
> >> >                  "options": {
> >> >                      "type": {"key": "string",
> >> >                               "value": "string",
> >> > @@ -392,5 +398,29 @@
> >> >                      "type": {"key": "string", "value": "string",
> >> >                               "min": 0, "max": "unlimited"}}},
> >> >              "indexes": [["name"]],
> >> > -            "isRoot": false}}
> >> > +            "isRoot": false},
> >> > +        "HA_Chassis": {
> >> > +            "columns": {
> >> > +                "chassis_name": {"type": "string"},
> >> > +                "priority": {"type": {"key": {"type": "integer",
> >> > +                                              "minInteger": 0,
> >> > +                                              "maxInteger": 32767}}},
> >> > +                "external_ids": {
> >> > +                    "type": {"key": "string", "value": "string",
> >> > +                             "min": 0, "max": "unlimited"}}},
> >> > +            "isRoot": false},
> >> > +        "HA_Chassis_Group": {
> >> > +            "columns": {
> >> > +                "name": {"type": "string"},
> >> > +                "ha_chassis": {
> >> > +                    "type": {"key": {"type": "uuid",
> >> > +                                     "refTable": "HA_Chassis",
> >> > +                                     "refType": "strong"},
> >>
> >> Is it better to be weak ref here, considering that multiple a
> >> HA-chassis can belong to multiple groups? (same for SB schema)
> >
> >
> > If you see HA_Chassis is non root. and hence I think it should be strong
> ref.
> > And I think it would be easier for CMS to manage. Deleting
> HA_Chassis_Group will delete its
> > ha chassis list.
>
> "Deleting HA_Chassis_Group will delete its ha chassis list." => This
> is the behavior of *weak* reference, right? It seems we have reverse
> understanding about strong v.s. weak references :)
>

Deleting HA_Chassis_Group will delete the HA_Chassis rows because -
HA_Chassis is "non root".
(Just like how Logical switch references the Logical_Switch_Ports.
Logical_Switch_Port is "non root"
and is strongly referenced by Logical_Switch and deleting Logical_Switch
row, also deletes the Logical_Switch_Ports
of that Logical_Switch).  Same is used here.

As I stated earlier, I think its better that HA_Chassis is not referenced
by multiple HA_Chassis_Group rows.  Otherwise
the code in ovn-northd would get complicated to sync in SB DB.

Thanks
Numan



>
> >
> > I think it would complicate the code in ovn-northd when syncing with the
> SB DB if HA_Chassis is
> > root and it is referenced by multiple HA_Chassis_Group rows.
> >
> > So I would prefer HA_Chassis to be non root and HA_Chassis_Group has a
> strong ref to HA_Chassis table.
> >
> > Thanks
> > Numan
> >
> >>
> >>
> >> > +                             "min": 0,
> >> > +                             "max": "unlimited"}},
> >> > +                "external_ids": {
> >> > +                    "type": {"key": "string", "value": "string",
> >> > +                             "min": 0, "max": "unlimited"}}},
> >> > +            "indexes": [["name"]],
> >> > +            "isRoot": true}}
> >> >      }
> >>
> >> I will get back to 3/5 - 5/5 reviewing asap.
>
Han Zhou March 26, 2019, 7:02 p.m. UTC | #6
On Tue, Mar 26, 2019 at 11:08 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Tue, Mar 26, 2019 at 11:21 PM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >
>> >
>> > Thanks Han for the review. I will address them
>> > Please see comments inline
>> >
>> > Thanks
>> > Numan
>> >
>> >
>> > On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >>
>> >> Mostly looks good to me, just minor comments.
>> >>
>> >> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
>> >> >
>> >> > +static void
>> >> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
>> >> > +{
>> >> > +    ovs_assert((od && od->nbr && od->lr_group));
>> >> > +
>> >> > +    if (od->l3dgw_port && od->l3redirect_port) {
>> >> > +        /* It's a logical router with gateway port. If it
>> >> > +         * has HA_Chassis_Group associated to it in SB DB, then store the
>> >> > +         * ha chassis group name. */
>> >> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
>> >> > +            sset_add(&od->lr_group->ha_chassis_groups,
>> >> > +                     od->l3redirect_port->sb->ha_chassis_group->name);
>> >> > +        }
>> >> > +    }
>> >> > +
>> >> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
>> >> > +        struct ovn_port *router_port =
>> >> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
>> >> > +
>> >> > +        if (!router_port || !router_port->peer) {
>> >> > +            continue;
>> >> > +        }
>> >> > +
>> >> > +        /* Get the peer logical switch/logical router datapath. */
>> >> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
>> >> > +        if (peer_dp->nbr) {
>> >> > +            if (!peer_dp->lr_group) {
>> >> > +                peer_dp->lr_group = od->lr_group;
>> >> > +                od->lr_group->router_dps[od->lr_group->n_router_dps++]
>> >> > +                    = peer_dp;
>> >> > +                build_lrouter_groups__(ports, peer_dp);
>> >> > +            }
>> >> > +        } else {
>> >> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
>> >> > +                if (!peer_dp->router_ports[j]->peer) {
>> >> > +                    /* If there is no peer port connecting to the
>> >> > +                    * router datapath, ignore it. */
>> >>
>> >> s/router datapath/router port
>> >>
>> >> > +                    continue;
>> >> > +                }
>> >> > +
>> >>
>> >> > +static void
>> >> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
>> >> > +{
>> >> > +    struct ovn_datapath *od;
>> >> > +    size_t n_router_dps = ovs_list_size(lr_list);
>> >> > +
>> >> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
>> >> > +        if (!od->lr_group) {
>> >> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
>> >> > +            /* Each logical router group can have max
>> >> > +             * 'n_router_dps'. So allocate enough memory. */
>> >> > +            od->lr_group->router_dps = xcalloc(sizeof *od, n_router_dps);
>> >> > +            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
>> >>
>> >> Here it is more clear to be: od->lr_group->router_dps[0] = od;
>> >>
>> >> > +            od->lr_group->n_router_dps = 1;
>> >> > +            sset_init(&od->lr_group->ha_chassis_groups);
>> >> > +            build_lrouter_groups__(ports, od);
>> >> > +        }
>> >> > +    }
>> >> > +}
>> >> > +
>> >>
>> >> >  static void
>> >> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
>> >> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
>> >> > +                            struct ovs_list *lr_list)
>> >> >  {
>> >> > +    struct ovn_datapath *router_dp;
>> >> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
>> >> > +        if (router_dp->lr_group) {
>> >> > +            struct lrouter_group *lr_group = router_dp->lr_group;
>> >> > +
>> >> > +            for (size_t i = 0; i < router_dp->lr_group->n_router_dps; i++) {
>> >> > +                if (router_dp->lr_group->router_dps[i] != router_dp) {
>> >>
>> >> This if-condition seems not needed.
>> >>
>> >> > +                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
>> >> > +                }
>> >> > +            }
>> >>
>> >> s/router_dp->lr_group/lr_group in above for-loop.
>> >>
>> >> > +
>> >> > +            free(lr_group->router_dps);
>> >> > +            sset_destroy(&lr_group->ha_chassis_groups);
>> >> > +            free(lr_group);
>> >> > +            router_dp->lr_group = NULL;
>> >> > +        }
>> >> > +    }
>> >> > +
>> >>
>> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>> >> > index 10a59649a..48d27b960 100644
>> >> > --- a/ovn/ovn-nb.ovsschema
>> >> > +++ b/ovn/ovn-nb.ovsschema
>> >> > @@ -1,7 +1,7 @@
>> >> >  {
>> >> >      "name": "OVN_Northbound",
>> >> > -    "version": "5.14.1",
>> >> > -    "cksum": "3758097843 20509",
>> >> > +    "version": "5.15.0",
>> >> > +    "cksum": "1078795414 21917",
>> >> >      "tables": {
>> >> >          "NB_Global": {
>> >> >              "columns": {
>> >> > @@ -271,6 +271,12 @@
>> >> >                                       "refType": "strong"},
>> >> >                               "min": 0,
>> >> >                               "max": "unlimited"}},
>> >> > +                "ha_chassis_group": {
>> >> > +                    "type": {"key": {"type": "uuid",
>> >> > +                                     "refTable": "HA_Chassis_Group",
>> >> > +                                     "refType": "weak"},
>> >>
>> >> Shall this be strong ref instead? In normal case logical ports hosted
>> >> by a HA-chassis-group should be deleted before the HA-chassis-group is
>> >> removed. (same for SB schema)
>> >
>> >
>> > I would prefer weak ref because it would be easier for CMS to manage the references.
>> > It doesn't need to keep track of the ref count before deleting the HA_Chassis_Group row.
>> >
>>
>> Sorry that I don't understand why it requires ref count. With strong
>> reference, it just prevents operators to delete HA_Chassis_Group by
>> mistake (i.e. when they are still used by routers). CMS can just
>> simply return an error in this case, without any ref counter
>> maintenance. Otherwise, if it is weak reference, an operator can
>> delete HA_Chassis_Group even if it is still in use, which causes the
>> related logical router ports being converted from centralized to
>> distributed, implicitly, which I think is not good.
>>
>
> Agree that operator can delete HA_Chassis_Group by mistake even if
> logical ports/logical router ports are still referencing the HA_Chassis_Group.
>
> What I mean is, CMS cannot delete the HA_Chassis_Group rows unless
> it clears all the references to HA_Chassis_Group.
>
> In a recent commit, we changed the logical switches and logical routers
> referencing the Load_Balancer to weak mainly because of this.
>  - https://github.com/openvswitch/ovs/commit/612f80fa8ebf88dad2e204364c6c02b451dca36c
>
> That's why I thought to have a weak reference.
> You still think its good to change to strong ?
>

Yes I am well aware of the LB change - in fact we are working on a
workaround in go-ovn project for older version of OVN that still has
strong reference for LB.
However, I think that is a different scenario. For a logical switch, a
LB is something additionally added, and from an operators point of
view it is pretty common to add/remove LB(s) to/from a logical switch.
However, for a logical router port, it is usually pre-defined that it
is a GW port or distributed port, when the deployment model is
decided. Although it is possible to convert a GW port to a distributed
port, it is uncommon. If it is really what the operator wants to do,
it should be handled explicitly by updating the logical router port,
rather than implicitly by removing the HA_Chassis_Group. Does this
make sense?

>
>
>>
>> >
>> >>
>> >>
>> >> > +                             "min": 0,
>> >> > +                             "max": 1}},
>> >> >                  "options": {
>> >> >                      "type": {"key": "string",
>> >> >                               "value": "string",
>> >> > @@ -392,5 +398,29 @@
>> >> >                      "type": {"key": "string", "value": "string",
>> >> >                               "min": 0, "max": "unlimited"}}},
>> >> >              "indexes": [["name"]],
>> >> > -            "isRoot": false}}
>> >> > +            "isRoot": false},
>> >> > +        "HA_Chassis": {
>> >> > +            "columns": {
>> >> > +                "chassis_name": {"type": "string"},
>> >> > +                "priority": {"type": {"key": {"type": "integer",
>> >> > +                                              "minInteger": 0,
>> >> > +                                              "maxInteger": 32767}}},
>> >> > +                "external_ids": {
>> >> > +                    "type": {"key": "string", "value": "string",
>> >> > +                             "min": 0, "max": "unlimited"}}},
>> >> > +            "isRoot": false},
>> >> > +        "HA_Chassis_Group": {
>> >> > +            "columns": {
>> >> > +                "name": {"type": "string"},
>> >> > +                "ha_chassis": {
>> >> > +                    "type": {"key": {"type": "uuid",
>> >> > +                                     "refTable": "HA_Chassis",
>> >> > +                                     "refType": "strong"},
>> >>
>> >> Is it better to be weak ref here, considering that multiple a
>> >> HA-chassis can belong to multiple groups? (same for SB schema)
>> >
>> >
>> > If you see HA_Chassis is non root. and hence I think it should be strong ref.
>> > And I think it would be easier for CMS to manage. Deleting HA_Chassis_Group will delete its
>> > ha chassis list.
>>
>> "Deleting HA_Chassis_Group will delete its ha chassis list." => This
>> is the behavior of *weak* reference, right? It seems we have reverse
>> understanding about strong v.s. weak references :)
>
>
> Deleting HA_Chassis_Group will delete the HA_Chassis rows because - HA_Chassis is "non root".
> (Just like how Logical switch references the Logical_Switch_Ports. Logical_Switch_Port is "non root"
> and is strongly referenced by Logical_Switch and deleting Logical_Switch row, also deletes the Logical_Switch_Ports
> of that Logical_Switch).  Same is used here.
>

You are right. I just went over the rfc7047 again and realized it was
my misunderstanding. Thanks for correcting me! So I agree with you
that to keep HA_Chassis non-root, strong reference is the only choice.
(though I don't quite understand the design in rfc7047 why weak
reference can't prevent a non-root row being deleted - only strong
reference prevents that.)

> As I stated earlier, I think its better that HA_Chassis is not referenced by multiple HA_Chassis_Group rows.  Otherwise
> the code in ovn-northd would get complicated to sync in SB DB.

Also, forget about my statement "a HA-chassis can belong to multiple
groups". My brain wasn't clear :(


>
> Thanks
> Numan
>
>
>>
>>
>> >
>> > I think it would complicate the code in ovn-northd when syncing with the SB DB if HA_Chassis is
>> > root and it is referenced by multiple HA_Chassis_Group rows.
>> >
>> > So I would prefer HA_Chassis to be non root and HA_Chassis_Group has a strong ref to HA_Chassis table.
>> >
>> > Thanks
>> > Numan
>> >
>> >>
>> >>
>> >> > +                             "min": 0,
>> >> > +                             "max": "unlimited"}},
>> >> > +                "external_ids": {
>> >> > +                    "type": {"key": "string", "value": "string",
>> >> > +                             "min": 0, "max": "unlimited"}}},
>> >> > +            "indexes": [["name"]],
>> >> > +            "isRoot": true}}
>> >> >      }
>> >>
>> >> I will get back to 3/5 - 5/5 reviewing asap.
Numan Siddique March 27, 2019, 10 a.m. UTC | #7
On Wed, Mar 27, 2019 at 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:

> On Tue, Mar 26, 2019 at 11:08 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >
> >
> >
> > On Tue, Mar 26, 2019 at 11:21 PM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >> >
> >> >
> >> > Thanks Han for the review. I will address them
> >> > Please see comments inline
> >> >
> >> > Thanks
> >> > Numan
> >> >
> >> >
> >> > On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:
> >> >>
> >> >> Mostly looks good to me, just minor comments.
> >> >>
> >> >> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
> >> >> >
> >> >> > +static void
> >> >> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath
> *od)
> >> >> > +{
> >> >> > +    ovs_assert((od && od->nbr && od->lr_group));
> >> >> > +
> >> >> > +    if (od->l3dgw_port && od->l3redirect_port) {
> >> >> > +        /* It's a logical router with gateway port. If it
> >> >> > +         * has HA_Chassis_Group associated to it in SB DB, then
> store the
> >> >> > +         * ha chassis group name. */
> >> >> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
> >> >> > +            sset_add(&od->lr_group->ha_chassis_groups,
> >> >> > +
>  od->l3redirect_port->sb->ha_chassis_group->name);
> >> >> > +        }
> >> >> > +    }
> >> >> > +
> >> >> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
> >> >> > +        struct ovn_port *router_port =
> >> >> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
> >> >> > +
> >> >> > +        if (!router_port || !router_port->peer) {
> >> >> > +            continue;
> >> >> > +        }
> >> >> > +
> >> >> > +        /* Get the peer logical switch/logical router datapath. */
> >> >> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
> >> >> > +        if (peer_dp->nbr) {
> >> >> > +            if (!peer_dp->lr_group) {
> >> >> > +                peer_dp->lr_group = od->lr_group;
> >> >> > +
> od->lr_group->router_dps[od->lr_group->n_router_dps++]
> >> >> > +                    = peer_dp;
> >> >> > +                build_lrouter_groups__(ports, peer_dp);
> >> >> > +            }
> >> >> > +        } else {
> >> >> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
> >> >> > +                if (!peer_dp->router_ports[j]->peer) {
> >> >> > +                    /* If there is no peer port connecting to the
> >> >> > +                    * router datapath, ignore it. */
> >> >>
> >> >> s/router datapath/router port
> >> >>
> >> >> > +                    continue;
> >> >> > +                }
> >> >> > +
> >> >>
> >> >> > +static void
> >> >> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
> >> >> > +{
> >> >> > +    struct ovn_datapath *od;
> >> >> > +    size_t n_router_dps = ovs_list_size(lr_list);
> >> >> > +
> >> >> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
> >> >> > +        if (!od->lr_group) {
> >> >> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
> >> >> > +            /* Each logical router group can have max
> >> >> > +             * 'n_router_dps'. So allocate enough memory. */
> >> >> > +            od->lr_group->router_dps = xcalloc(sizeof *od,
> n_router_dps);
> >> >> > +            od->lr_group->router_dps[od->lr_group->n_router_dps]
> = od;
> >> >>
> >> >> Here it is more clear to be: od->lr_group->router_dps[0] = od;
> >> >>
> >> >> > +            od->lr_group->n_router_dps = 1;
> >> >> > +            sset_init(&od->lr_group->ha_chassis_groups);
> >> >> > +            build_lrouter_groups__(ports, od);
> >> >> > +        }
> >> >> > +    }
> >> >> > +}
> >> >> > +
> >> >>
> >> >> >  static void
> >> >> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap
> *ports)
> >> >> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap
> *ports,
> >> >> > +                            struct ovs_list *lr_list)
> >> >> >  {
> >> >> > +    struct ovn_datapath *router_dp;
> >> >> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
> >> >> > +        if (router_dp->lr_group) {
> >> >> > +            struct lrouter_group *lr_group = router_dp->lr_group;
> >> >> > +
> >> >> > +            for (size_t i = 0; i <
> router_dp->lr_group->n_router_dps; i++) {
> >> >> > +                if (router_dp->lr_group->router_dps[i] !=
> router_dp) {
> >> >>
> >> >> This if-condition seems not needed.
>

Actually it is needed. I first thought the same. But there were crashes
because of NULL pointer deference.
If we don't have this check, then the check "i <
router_dp->lr_group->n_router_dps" in the for loop would
result in the NULL pointer dereference.



> >> >>
> >> >> > +                    router_dp->lr_group->router_dps[i]->lr_group
> = NULL;
> >> >> > +                }
> >> >> > +            }
> >> >>
> >> >> s/router_dp->lr_group/lr_group in above for-loop.
> >> >>
> >> >> > +
> >> >> > +            free(lr_group->router_dps);
> >> >> > +            sset_destroy(&lr_group->ha_chassis_groups);
> >> >> > +            free(lr_group);
> >> >> > +            router_dp->lr_group = NULL;
> >> >> > +        }
> >> >> > +    }
> >> >> > +
> >> >>
> >> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> >> >> > index 10a59649a..48d27b960 100644
> >> >> > --- a/ovn/ovn-nb.ovsschema
> >> >> > +++ b/ovn/ovn-nb.ovsschema
> >> >> > @@ -1,7 +1,7 @@
> >> >> >  {
> >> >> >      "name": "OVN_Northbound",
> >> >> > -    "version": "5.14.1",
> >> >> > -    "cksum": "3758097843 20509",
> >> >> > +    "version": "5.15.0",
> >> >> > +    "cksum": "1078795414 21917",
> >> >> >      "tables": {
> >> >> >          "NB_Global": {
> >> >> >              "columns": {
> >> >> > @@ -271,6 +271,12 @@
> >> >> >                                       "refType": "strong"},
> >> >> >                               "min": 0,
> >> >> >                               "max": "unlimited"}},
> >> >> > +                "ha_chassis_group": {
> >> >> > +                    "type": {"key": {"type": "uuid",
> >> >> > +                                     "refTable":
> "HA_Chassis_Group",
> >> >> > +                                     "refType": "weak"},
> >> >>
> >> >> Shall this be strong ref instead? In normal case logical ports hosted
> >> >> by a HA-chassis-group should be deleted before the HA-chassis-group
> is
> >> >> removed. (same for SB schema)
> >> >
> >> >
> >> > I would prefer weak ref because it would be easier for CMS to manage
> the references.
> >> > It doesn't need to keep track of the ref count before deleting the
> HA_Chassis_Group row.
> >> >
> >>
> >> Sorry that I don't understand why it requires ref count. With strong
> >> reference, it just prevents operators to delete HA_Chassis_Group by
> >> mistake (i.e. when they are still used by routers). CMS can just
> >> simply return an error in this case, without any ref counter
> >> maintenance. Otherwise, if it is weak reference, an operator can
> >> delete HA_Chassis_Group even if it is still in use, which causes the
> >> related logical router ports being converted from centralized to
> >> distributed, implicitly, which I think is not good.
> >>
> >
> > Agree that operator can delete HA_Chassis_Group by mistake even if
> > logical ports/logical router ports are still referencing the
> HA_Chassis_Group.
> >
> > What I mean is, CMS cannot delete the HA_Chassis_Group rows unless
> > it clears all the references to HA_Chassis_Group.
> >
> > In a recent commit, we changed the logical switches and logical routers
> > referencing the Load_Balancer to weak mainly because of this.
> >  -
> https://github.com/openvswitch/ovs/commit/612f80fa8ebf88dad2e204364c6c02b451dca36c
> >
> > That's why I thought to have a weak reference.
> > You still think its good to change to strong ?
> >
>
> Yes I am well aware of the LB change - in fact we are working on a
> workaround in go-ovn project for older version of OVN that still has
> strong reference for LB.
> However, I think that is a different scenario. For a logical switch, a
> LB is something additionally added, and from an operators point of
> view it is pretty common to add/remove LB(s) to/from a logical switch.
> However, for a logical router port, it is usually pre-defined that it
> is a GW port or distributed port, when the deployment model is
> decided. Although it is possible to convert a GW port to a distributed
> port, it is uncommon. If it is really what the operator wants to do,
> it should be handled explicitly by updating the logical router port,
> rather than implicitly by removing the HA_Chassis_Group. Does this
> make sense?
>
>
CMS could use an HA_Chassis_Group both for associating it with multiple
logical router ports and also with multiple "external" ports.

 I am still little concerned from CMS point of view.

Eg.
ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch1 30
ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch2 20
ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch3 10

ovn-nbctl set logical_router_port lr0-public ha_chassis_group=<HAGRP1_UUID>
ovn-nbctl set logical_router_port lr1-public ha_chassis_group=<HAGRP1_UUID>
ovn-nbctl set logical_router_port lr2-public ha_chassis_group=<HAGRP1_UUID>

ovn-nbctl set logical_switch_port ext-p1 ha_chasssi_group=<HAGRP1_UUID>
ovn-nbctl set logical_switch_port ext-p2 ha_chasssi_group=<HAGRP1_UUID>
ovn-nbctl set logical_switch_port ext-p3 ha_chasssi_group=<HAGRP1_UUID>
..
ovn-nbctl set logical_switch_port ext-pn ha_chasssi_group=<HAGRP1_UUID>

In the above case, CMS should know the list of logical router ports/logical
switch ports
and can delete the "hagrp1" only when all the references are cleared.

But I agree with your point that unlike load balancer which can be
independently created
or deleted that's not the case with HA_Chassis_Group.

I will go ahead and change it to "strong". We can probably revert it to
"false" later if
it is hard for CMS to manage the references.

Thanks


>
> >
> >>
> >> >
> >> >>
> >> >>
> >> >> > +                             "min": 0,
> >> >> > +                             "max": 1}},
> >> >> >                  "options": {
> >> >> >                      "type": {"key": "string",
> >> >> >                               "value": "string",
> >> >> > @@ -392,5 +398,29 @@
> >> >> >                      "type": {"key": "string", "value": "string",
> >> >> >                               "min": 0, "max": "unlimited"}}},
> >> >> >              "indexes": [["name"]],
> >> >> > -            "isRoot": false}}
> >> >> > +            "isRoot": false},
> >> >> > +        "HA_Chassis": {
> >> >> > +            "columns": {
> >> >> > +                "chassis_name": {"type": "string"},
> >> >> > +                "priority": {"type": {"key": {"type": "integer",
> >> >> > +                                              "minInteger": 0,
> >> >> > +                                              "maxInteger":
> 32767}}},
> >> >> > +                "external_ids": {
> >> >> > +                    "type": {"key": "string", "value": "string",
> >> >> > +                             "min": 0, "max": "unlimited"}}},
> >> >> > +            "isRoot": false},
> >> >> > +        "HA_Chassis_Group": {
> >> >> > +            "columns": {
> >> >> > +                "name": {"type": "string"},
> >> >> > +                "ha_chassis": {
> >> >> > +                    "type": {"key": {"type": "uuid",
> >> >> > +                                     "refTable": "HA_Chassis",
> >> >> > +                                     "refType": "strong"},
> >> >>
> >> >> Is it better to be weak ref here, considering that multiple a
> >> >> HA-chassis can belong to multiple groups? (same for SB schema)
> >> >
> >> >
> >> > If you see HA_Chassis is non root. and hence I think it should be
> strong ref.
> >> > And I think it would be easier for CMS to manage. Deleting
> HA_Chassis_Group will delete its
> >> > ha chassis list.
> >>
> >> "Deleting HA_Chassis_Group will delete its ha chassis list." => This
> >> is the behavior of *weak* reference, right? It seems we have reverse
> >> understanding about strong v.s. weak references :)
> >
> >
> > Deleting HA_Chassis_Group will delete the HA_Chassis rows because -
> HA_Chassis is "non root".
> > (Just like how Logical switch references the Logical_Switch_Ports.
> Logical_Switch_Port is "non root"
> > and is strongly referenced by Logical_Switch and deleting Logical_Switch
> row, also deletes the Logical_Switch_Ports
> > of that Logical_Switch).  Same is used here.
> >
>
> You are right. I just went over the rfc7047 again and realized it was
> my misunderstanding. Thanks for correcting me! So I agree with you
> that to keep HA_Chassis non-root, strong reference is the only choice.
> (though I don't quite understand the design in rfc7047 why weak
> reference can't prevent a non-root row being deleted - only strong
> reference prevents that.)
>
> > As I stated earlier, I think its better that HA_Chassis is not
> referenced by multiple HA_Chassis_Group rows.  Otherwise
> > the code in ovn-northd would get complicated to sync in SB DB.
>
> Also, forget about my statement "a HA-chassis can belong to multiple
> groups". My brain wasn't clear :(
>
>
No worries :).

Thanks for all the comments and feedback.

 Numan


> >
> > Thanks
> > Numan
> >
> >
> >>
> >>
> >> >
> >> > I think it would complicate the code in ovn-northd when syncing with
> the SB DB if HA_Chassis is
> >> > root and it is referenced by multiple HA_Chassis_Group rows.
> >> >
> >> > So I would prefer HA_Chassis to be non root and HA_Chassis_Group has
> a strong ref to HA_Chassis table.
> >> >
> >> > Thanks
> >> > Numan
> >> >
> >> >>
> >> >>
> >> >> > +                             "min": 0,
> >> >> > +                             "max": "unlimited"}},
> >> >> > +                "external_ids": {
> >> >> > +                    "type": {"key": "string", "value": "string",
> >> >> > +                             "min": 0, "max": "unlimited"}}},
> >> >> > +            "indexes": [["name"]],
> >> >> > +            "isRoot": true}}
> >> >> >      }
> >> >>
> >> >> I will get back to 3/5 - 5/5 reviewing asap.
>
Han Zhou March 27, 2019, 5:54 p.m. UTC | #8
On Wed, Mar 27, 2019 at 3:00 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Wed, Mar 27, 2019 at 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> On Tue, Mar 26, 2019 at 11:08 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >
>> >
>> >
>> > On Tue, Mar 26, 2019 at 11:21 PM Han Zhou <zhouhan@gmail.com> wrote:
>> >>
>> >> On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >> >
>> >> >
>> >> > Thanks Han for the review. I will address them
>> >> > Please see comments inline
>> >> >
>> >> > Thanks
>> >> > Numan
>> >> >
>> >> >
>> >> > On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >> >>
>> >> >> Mostly looks good to me, just minor comments.
>> >> >>
>> >> >> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
>> >> >> >
>> >> >> > +static void
>> >> >> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
>> >> >> > +{
>> >> >> > +    ovs_assert((od && od->nbr && od->lr_group));
>> >> >> > +
>> >> >> > +    if (od->l3dgw_port && od->l3redirect_port) {
>> >> >> > +        /* It's a logical router with gateway port. If it
>> >> >> > +         * has HA_Chassis_Group associated to it in SB DB, then store the
>> >> >> > +         * ha chassis group name. */
>> >> >> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
>> >> >> > +            sset_add(&od->lr_group->ha_chassis_groups,
>> >> >> > +                     od->l3redirect_port->sb->ha_chassis_group->name);
>> >> >> > +        }
>> >> >> > +    }
>> >> >> > +
>> >> >> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
>> >> >> > +        struct ovn_port *router_port =
>> >> >> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
>> >> >> > +
>> >> >> > +        if (!router_port || !router_port->peer) {
>> >> >> > +            continue;
>> >> >> > +        }
>> >> >> > +
>> >> >> > +        /* Get the peer logical switch/logical router datapath. */
>> >> >> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
>> >> >> > +        if (peer_dp->nbr) {
>> >> >> > +            if (!peer_dp->lr_group) {
>> >> >> > +                peer_dp->lr_group = od->lr_group;
>> >> >> > +                od->lr_group->router_dps[od->lr_group->n_router_dps++]
>> >> >> > +                    = peer_dp;
>> >> >> > +                build_lrouter_groups__(ports, peer_dp);
>> >> >> > +            }
>> >> >> > +        } else {
>> >> >> > +            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
>> >> >> > +                if (!peer_dp->router_ports[j]->peer) {
>> >> >> > +                    /* If there is no peer port connecting to the
>> >> >> > +                    * router datapath, ignore it. */
>> >> >>
>> >> >> s/router datapath/router port
>> >> >>
>> >> >> > +                    continue;
>> >> >> > +                }
>> >> >> > +
>> >> >>
>> >> >> > +static void
>> >> >> > +build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
>> >> >> > +{
>> >> >> > +    struct ovn_datapath *od;
>> >> >> > +    size_t n_router_dps = ovs_list_size(lr_list);
>> >> >> > +
>> >> >> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
>> >> >> > +        if (!od->lr_group) {
>> >> >> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
>> >> >> > +            /* Each logical router group can have max
>> >> >> > +             * 'n_router_dps'. So allocate enough memory. */
>> >> >> > +            od->lr_group->router_dps = xcalloc(sizeof *od, n_router_dps);
>> >> >> > +            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
>> >> >>
>> >> >> Here it is more clear to be: od->lr_group->router_dps[0] = od;
>> >> >>
>> >> >> > +            od->lr_group->n_router_dps = 1;
>> >> >> > +            sset_init(&od->lr_group->ha_chassis_groups);
>> >> >> > +            build_lrouter_groups__(ports, od);
>> >> >> > +        }
>> >> >> > +    }
>> >> >> > +}
>> >> >> > +
>> >> >>
>> >> >> >  static void
>> >> >> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
>> >> >> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
>> >> >> > +                            struct ovs_list *lr_list)
>> >> >> >  {
>> >> >> > +    struct ovn_datapath *router_dp;
>> >> >> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
>> >> >> > +        if (router_dp->lr_group) {
>> >> >> > +            struct lrouter_group *lr_group = router_dp->lr_group;
>> >> >> > +
>> >> >> > +            for (size_t i = 0; i < router_dp->lr_group->n_router_dps; i++) {
>> >> >> > +                if (router_dp->lr_group->router_dps[i] != router_dp) {
>> >> >>
>> >> >> This if-condition seems not needed.
>
>
> Actually it is needed. I first thought the same. But there were crashes because of NULL pointer deference.
> If we don't have this check, then the check "i < router_dp->lr_group->n_router_dps" in the for loop would
> result in the NULL pointer dereference.
>

It seems there shouldn't be NULL pointer deference any more after
address the below comment, because you already have the local var
lr_group saved the pointer.

>
>>
>> >> >>
>> >> >> > +                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
>> >> >> > +                }
>> >> >> > +            }
>> >> >>
>> >> >> s/router_dp->lr_group/lr_group in above for-loop.
>> >> >>
>> >> >> > +
>> >> >> > +            free(lr_group->router_dps);
>> >> >> > +            sset_destroy(&lr_group->ha_chassis_groups);
>> >> >> > +            free(lr_group);
>> >> >> > +            router_dp->lr_group = NULL;
>> >> >> > +        }
>> >> >> > +    }
>> >> >> > +



>> >> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>> >> >> > index 10a59649a..48d27b960 100644
>> >> >> > --- a/ovn/ovn-nb.ovsschema
>> >> >> > +++ b/ovn/ovn-nb.ovsschema
>> >> >> > @@ -1,7 +1,7 @@
>> >> >> >  {
>> >> >> >      "name": "OVN_Northbound",
>> >> >> > -    "version": "5.14.1",
>> >> >> > -    "cksum": "3758097843 20509",
>> >> >> > +    "version": "5.15.0",
>> >> >> > +    "cksum": "1078795414 21917",
>> >> >> >      "tables": {
>> >> >> >          "NB_Global": {
>> >> >> >              "columns": {
>> >> >> > @@ -271,6 +271,12 @@
>> >> >> >                                       "refType": "strong"},
>> >> >> >                               "min": 0,
>> >> >> >                               "max": "unlimited"}},
>> >> >> > +                "ha_chassis_group": {
>> >> >> > +                    "type": {"key": {"type": "uuid",
>> >> >> > +                                     "refTable": "HA_Chassis_Group",
>> >> >> > +                                     "refType": "weak"},
>> >> >>
>> >> >> Shall this be strong ref instead? In normal case logical ports hosted
>> >> >> by a HA-chassis-group should be deleted before the HA-chassis-group is
>> >> >> removed. (same for SB schema)
>> >> >
>> >> >
>> >> > I would prefer weak ref because it would be easier for CMS to manage the references.
>> >> > It doesn't need to keep track of the ref count before deleting the HA_Chassis_Group row.
>> >> >
>> >>
>> >> Sorry that I don't understand why it requires ref count. With strong
>> >> reference, it just prevents operators to delete HA_Chassis_Group by
>> >> mistake (i.e. when they are still used by routers). CMS can just
>> >> simply return an error in this case, without any ref counter
>> >> maintenance. Otherwise, if it is weak reference, an operator can
>> >> delete HA_Chassis_Group even if it is still in use, which causes the
>> >> related logical router ports being converted from centralized to
>> >> distributed, implicitly, which I think is not good.
>> >>
>> >
>> > Agree that operator can delete HA_Chassis_Group by mistake even if
>> > logical ports/logical router ports are still referencing the HA_Chassis_Group.
>> >
>> > What I mean is, CMS cannot delete the HA_Chassis_Group rows unless
>> > it clears all the references to HA_Chassis_Group.
>> >
>> > In a recent commit, we changed the logical switches and logical routers
>> > referencing the Load_Balancer to weak mainly because of this.
>> >  - https://github.com/openvswitch/ovs/commit/612f80fa8ebf88dad2e204364c6c02b451dca36c
>> >
>> > That's why I thought to have a weak reference.
>> > You still think its good to change to strong ?
>> >
>>
>> Yes I am well aware of the LB change - in fact we are working on a
>> workaround in go-ovn project for older version of OVN that still has
>> strong reference for LB.
>> However, I think that is a different scenario. For a logical switch, a
>> LB is something additionally added, and from an operators point of
>> view it is pretty common to add/remove LB(s) to/from a logical switch.
>> However, for a logical router port, it is usually pre-defined that it
>> is a GW port or distributed port, when the deployment model is
>> decided. Although it is possible to convert a GW port to a distributed
>> port, it is uncommon. If it is really what the operator wants to do,
>> it should be handled explicitly by updating the logical router port,
>> rather than implicitly by removing the HA_Chassis_Group. Does this
>> make sense?
>>
>
> CMS could use an HA_Chassis_Group both for associating it with multiple
> logical router ports and also with multiple "external" ports.
>
>  I am still little concerned from CMS point of view.
>
> Eg.
> ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch1 30
> ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch2 20
> ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch3 10
>
> ovn-nbctl set logical_router_port lr0-public ha_chassis_group=<HAGRP1_UUID>
> ovn-nbctl set logical_router_port lr1-public ha_chassis_group=<HAGRP1_UUID>
> ovn-nbctl set logical_router_port lr2-public ha_chassis_group=<HAGRP1_UUID>
>
> ovn-nbctl set logical_switch_port ext-p1 ha_chasssi_group=<HAGRP1_UUID>
> ovn-nbctl set logical_switch_port ext-p2 ha_chasssi_group=<HAGRP1_UUID>
> ovn-nbctl set logical_switch_port ext-p3 ha_chasssi_group=<HAGRP1_UUID>
> ..
> ovn-nbctl set logical_switch_port ext-pn ha_chasssi_group=<HAGRP1_UUID>
>
> In the above case, CMS should know the list of logical router ports/logical switch ports
> and can delete the "hagrp1" only when all the references are cleared.
>
> But I agree with your point that unlike load balancer which can be independently created
> or deleted that's not the case with HA_Chassis_Group.
>
> I will go ahead and change it to "strong". We can probably revert it to "false" later if
> it is hard for CMS to manage the references.
>

Thanks Numan. I'd like to clarify more about it. For external ports,
missing HA_Chassis_Group is an error situation, because an external
port without HA_Chassis_Group cannot work. It is not something we
want. With weak reference, it can happen if HA_Chassis_Group is
deleted (by mistake) before deleting/updating the external ports. To
prevent this, CMS would have to do some validation, so that it can
warn the user and prevent this error situation to happen. To do such
validation, CMS would have to keep track of the references by itself.
However, with strong reference, the burden of the validation is moved
to OVSDB integrity check, which is free, and CMS simply return error,
without keeping tracking anything. I don't think there is a normal
scenario that *requires* deleting a HA_Chassis_Group before
deleting/updating the ports that references it. So I think strong
reference actually ensures correctness and it is also more convenient
from CMS point of view.

Thanks,
Han
Numan Siddique March 27, 2019, 6:14 p.m. UTC | #9
On Wed, Mar 27, 2019 at 11:24 PM Han Zhou <zhouhan@gmail.com> wrote:

> On Wed, Mar 27, 2019 at 3:00 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >
> >
> >
> > On Wed, Mar 27, 2019 at 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> On Tue, Mar 26, 2019 at 11:08 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >> >
> >> >
> >> >
> >> > On Tue, Mar 26, 2019 at 11:21 PM Han Zhou <zhouhan@gmail.com> wrote:
> >> >>
> >> >> On Tue, Mar 26, 2019 at 9:59 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> >> >> >
> >> >> >
> >> >> > Thanks Han for the review. I will address them
> >> >> > Please see comments inline
> >> >> >
> >> >> > Thanks
> >> >> > Numan
> >> >> >
> >> >> >
> >> >> > On Tue, Mar 26, 2019 at 8:07 AM Han Zhou <zhouhan@gmail.com>
> wrote:
> >> >> >>
> >> >> >> Mostly looks good to me, just minor comments.
> >> >> >>
> >> >> >> On Thu, Mar 14, 2019 at 12:32 PM <nusiddiq@redhat.com> wrote:
> >> >> >> >
> >> >> >> > +static void
> >> >> >> > +build_lrouter_groups__(struct hmap *ports, struct ovn_datapath
> *od)
> >> >> >> > +{
> >> >> >> > +    ovs_assert((od && od->nbr && od->lr_group));
> >> >> >> > +
> >> >> >> > +    if (od->l3dgw_port && od->l3redirect_port) {
> >> >> >> > +        /* It's a logical router with gateway port. If it
> >> >> >> > +         * has HA_Chassis_Group associated to it in SB DB,
> then store the
> >> >> >> > +         * ha chassis group name. */
> >> >> >> > +        if (od->l3redirect_port->sb->ha_chassis_group) {
> >> >> >> > +            sset_add(&od->lr_group->ha_chassis_groups,
> >> >> >> > +
>  od->l3redirect_port->sb->ha_chassis_group->name);
> >> >> >> > +        }
> >> >> >> > +    }
> >> >> >> > +
> >> >> >> > +    for (size_t i = 0; i < od->nbr->n_ports; i++) {
> >> >> >> > +        struct ovn_port *router_port =
> >> >> >> > +            ovn_port_find(ports, od->nbr->ports[i]->name);
> >> >> >> > +
> >> >> >> > +        if (!router_port || !router_port->peer) {
> >> >> >> > +            continue;
> >> >> >> > +        }
> >> >> >> > +
> >> >> >> > +        /* Get the peer logical switch/logical router
> datapath. */
> >> >> >> > +        struct ovn_datapath *peer_dp = router_port->peer->od;
> >> >> >> > +        if (peer_dp->nbr) {
> >> >> >> > +            if (!peer_dp->lr_group) {
> >> >> >> > +                peer_dp->lr_group = od->lr_group;
> >> >> >> > +
> od->lr_group->router_dps[od->lr_group->n_router_dps++]
> >> >> >> > +                    = peer_dp;
> >> >> >> > +                build_lrouter_groups__(ports, peer_dp);
> >> >> >> > +            }
> >> >> >> > +        } else {
> >> >> >> > +            for (size_t j = 0; j < peer_dp->n_router_ports;
> j++) {
> >> >> >> > +                if (!peer_dp->router_ports[j]->peer) {
> >> >> >> > +                    /* If there is no peer port connecting to
> the
> >> >> >> > +                    * router datapath, ignore it. */
> >> >> >>
> >> >> >> s/router datapath/router port
> >> >> >>
> >> >> >> > +                    continue;
> >> >> >> > +                }
> >> >> >> > +
> >> >> >>
> >> >> >> > +static void
> >> >> >> > +build_lrouter_groups(struct hmap *ports, struct ovs_list
> *lr_list)
> >> >> >> > +{
> >> >> >> > +    struct ovn_datapath *od;
> >> >> >> > +    size_t n_router_dps = ovs_list_size(lr_list);
> >> >> >> > +
> >> >> >> > +    LIST_FOR_EACH (od, lr_list, lr_list) {
> >> >> >> > +        if (!od->lr_group) {
> >> >> >> > +            od->lr_group = xzalloc(sizeof *od->lr_group);
> >> >> >> > +            /* Each logical router group can have max
> >> >> >> > +             * 'n_router_dps'. So allocate enough memory. */
> >> >> >> > +            od->lr_group->router_dps = xcalloc(sizeof *od,
> n_router_dps);
> >> >> >> > +
> od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
> >> >> >>
> >> >> >> Here it is more clear to be: od->lr_group->router_dps[0] = od;
> >> >> >>
> >> >> >> > +            od->lr_group->n_router_dps = 1;
> >> >> >> > +            sset_init(&od->lr_group->ha_chassis_groups);
> >> >> >> > +            build_lrouter_groups__(ports, od);
> >> >> >> > +        }
> >> >> >> > +    }
> >> >> >> > +}
> >> >> >> > +
> >> >> >>
> >> >> >> >  static void
> >> >> >> > -destroy_datapaths_and_ports(struct hmap *datapaths, struct
> hmap *ports)
> >> >> >> > +destroy_datapaths_and_ports(struct hmap *datapaths, struct
> hmap *ports,
> >> >> >> > +                            struct ovs_list *lr_list)
> >> >> >> >  {
> >> >> >> > +    struct ovn_datapath *router_dp;
> >> >> >> > +    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
> >> >> >> > +        if (router_dp->lr_group) {
> >> >> >> > +            struct lrouter_group *lr_group =
> router_dp->lr_group;
> >> >> >> > +
> >> >> >> > +            for (size_t i = 0; i <
> router_dp->lr_group->n_router_dps; i++) {
> >> >> >> > +                if (router_dp->lr_group->router_dps[i] !=
> router_dp) {
> >> >> >>
> >> >> >> This if-condition seems not needed.
> >
> >
> > Actually it is needed. I first thought the same. But there were crashes
> because of NULL pointer deference.
> > If we don't have this check, then the check "i <
> router_dp->lr_group->n_router_dps" in the for loop would
> > result in the NULL pointer dereference.

>
>
> It seems there shouldn't be NULL pointer deference any more after
> address the below comment, because you already have the local var
> lr_group saved the pointer.
>

Thanks. I will try this out. I think it should work now.


>
> >
> >>
> >> >> >>
> >> >> >> > +
> router_dp->lr_group->router_dps[i]->lr_group = NULL;
> >> >> >> > +                }
> >> >> >> > +            }
> >> >> >>
> >> >> >> s/router_dp->lr_group/lr_group in above for-loop.
> >> >> >>
> >> >> >> > +
> >> >> >> > +            free(lr_group->router_dps);
> >> >> >> > +            sset_destroy(&lr_group->ha_chassis_groups);
> >> >> >> > +            free(lr_group);
> >> >> >> > +            router_dp->lr_group = NULL;
> >> >> >> > +        }
> >> >> >> > +    }
> >> >> >> > +
>
>
>
> >> >> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> >> >> >> > index 10a59649a..48d27b960 100644
> >> >> >> > --- a/ovn/ovn-nb.ovsschema
> >> >> >> > +++ b/ovn/ovn-nb.ovsschema
> >> >> >> > @@ -1,7 +1,7 @@
> >> >> >> >  {
> >> >> >> >      "name": "OVN_Northbound",
> >> >> >> > -    "version": "5.14.1",
> >> >> >> > -    "cksum": "3758097843 20509",
> >> >> >> > +    "version": "5.15.0",
> >> >> >> > +    "cksum": "1078795414 21917",
> >> >> >> >      "tables": {
> >> >> >> >          "NB_Global": {
> >> >> >> >              "columns": {
> >> >> >> > @@ -271,6 +271,12 @@
> >> >> >> >                                       "refType": "strong"},
> >> >> >> >                               "min": 0,
> >> >> >> >                               "max": "unlimited"}},
> >> >> >> > +                "ha_chassis_group": {
> >> >> >> > +                    "type": {"key": {"type": "uuid",
> >> >> >> > +                                     "refTable":
> "HA_Chassis_Group",
> >> >> >> > +                                     "refType": "weak"},
> >> >> >>
> >> >> >> Shall this be strong ref instead? In normal case logical ports
> hosted
> >> >> >> by a HA-chassis-group should be deleted before the
> HA-chassis-group is
> >> >> >> removed. (same for SB schema)
> >> >> >
> >> >> >
> >> >> > I would prefer weak ref because it would be easier for CMS to
> manage the references.
> >> >> > It doesn't need to keep track of the ref count before deleting the
> HA_Chassis_Group row.
> >> >> >
> >> >>
> >> >> Sorry that I don't understand why it requires ref count. With strong
> >> >> reference, it just prevents operators to delete HA_Chassis_Group by
> >> >> mistake (i.e. when they are still used by routers). CMS can just
> >> >> simply return an error in this case, without any ref counter
> >> >> maintenance. Otherwise, if it is weak reference, an operator can
> >> >> delete HA_Chassis_Group even if it is still in use, which causes the
> >> >> related logical router ports being converted from centralized to
> >> >> distributed, implicitly, which I think is not good.
> >> >>
> >> >
> >> > Agree that operator can delete HA_Chassis_Group by mistake even if
> >> > logical ports/logical router ports are still referencing the
> HA_Chassis_Group.
> >> >
> >> > What I mean is, CMS cannot delete the HA_Chassis_Group rows unless
> >> > it clears all the references to HA_Chassis_Group.
> >> >
> >> > In a recent commit, we changed the logical switches and logical
> routers
> >> > referencing the Load_Balancer to weak mainly because of this.
> >> >  -
> https://github.com/openvswitch/ovs/commit/612f80fa8ebf88dad2e204364c6c02b451dca36c
> >> >
> >> > That's why I thought to have a weak reference.
> >> > You still think its good to change to strong ?
> >> >
> >>
> >> Yes I am well aware of the LB change - in fact we are working on a
> >> workaround in go-ovn project for older version of OVN that still has
> >> strong reference for LB.
> >> However, I think that is a different scenario. For a logical switch, a
> >> LB is something additionally added, and from an operators point of
> >> view it is pretty common to add/remove LB(s) to/from a logical switch.
> >> However, for a logical router port, it is usually pre-defined that it
> >> is a GW port or distributed port, when the deployment model is
> >> decided. Although it is possible to convert a GW port to a distributed
> >> port, it is uncommon. If it is really what the operator wants to do,
> >> it should be handled explicitly by updating the logical router port,
> >> rather than implicitly by removing the HA_Chassis_Group. Does this
> >> make sense?
> >>
> >
> > CMS could use an HA_Chassis_Group both for associating it with multiple
> > logical router ports and also with multiple "external" ports.
> >
> >  I am still little concerned from CMS point of view.
> >
> > Eg.
> > ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch1 30
> > ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch2 20
> > ovn-nbctl ha-chassis-group-add-chassis hagrp1 ch3 10
> >
> > ovn-nbctl set logical_router_port lr0-public
> ha_chassis_group=<HAGRP1_UUID>
> > ovn-nbctl set logical_router_port lr1-public
> ha_chassis_group=<HAGRP1_UUID>
> > ovn-nbctl set logical_router_port lr2-public
> ha_chassis_group=<HAGRP1_UUID>
> >
> > ovn-nbctl set logical_switch_port ext-p1 ha_chasssi_group=<HAGRP1_UUID>
> > ovn-nbctl set logical_switch_port ext-p2 ha_chasssi_group=<HAGRP1_UUID>
> > ovn-nbctl set logical_switch_port ext-p3 ha_chasssi_group=<HAGRP1_UUID>
> > ..
> > ovn-nbctl set logical_switch_port ext-pn ha_chasssi_group=<HAGRP1_UUID>
> >
> > In the above case, CMS should know the list of logical router
> ports/logical switch ports
> > and can delete the "hagrp1" only when all the references are cleared.
> >
> > But I agree with your point that unlike load balancer which can be
> independently created
> > or deleted that's not the case with HA_Chassis_Group.
> >
> > I will go ahead and change it to "strong". We can probably revert it to
> "false" later if
> > it is hard for CMS to manage the references.
> >
>
> Thanks Numan. I'd like to clarify more about it. For external ports,
> missing HA_Chassis_Group is an error situation, because an external
> port without HA_Chassis_Group cannot work. It is not something we
> want. With weak reference, it can happen if HA_Chassis_Group is
> deleted (by mistake) before deleting/updating the external ports. To
> prevent this, CMS would have to do some validation, so that it can
> warn the user and prevent this error situation to happen. To do such
> validation, CMS would have to keep track of the references by itself.
> However, with strong reference, the burden of the validation is moved
> to OVSDB integrity check, which is free, and CMS simply return error,
> without keeping tracking anything. I don't think there is a normal
> scenario that *requires* deleting a HA_Chassis_Group before
> deleting/updating the ports that references it. So I think strong
> reference actually ensures correctness and it is also more convenient
> from CMS point of view.
>

Thanks for the detailed explanation. The p5 of this series is setting
"weak" reference.
I will change it to strong and submit v6.

Thanks
Numan


>
> Thanks,
> Han
>

Patch
diff mbox series

diff --git a/ovn/lib/chassis-index.c b/ovn/lib/chassis-index.c
index a5dbf4ace..34d4a31eb 100644
--- a/ovn/lib/chassis-index.c
+++ b/ovn/lib/chassis-index.c
@@ -39,3 +39,29 @@  chassis_lookup_by_name(struct ovsdb_idl_index *sbrec_chassis_by_name,
 
     return retval;
 }
+
+struct ovsdb_idl_index *
+ha_chassis_group_index_create(struct ovsdb_idl *idl)
+{
+    return ovsdb_idl_index_create1(idl, &sbrec_ha_chassis_group_col_name);
+}
+
+/* Finds and returns the HA chassis group with the given 'name', or NULL
+ * if no such HA chassis group exists. */
+const struct sbrec_ha_chassis_group *
+ha_chassis_group_lookup_by_name(
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name,
+    const char *name)
+{
+    struct sbrec_ha_chassis_group *target =
+        sbrec_ha_chassis_group_index_init_row(sbrec_ha_chassis_grp_by_name);
+    sbrec_ha_chassis_group_set_name(target, name);
+
+    struct sbrec_ha_chassis_group *retval =
+        sbrec_ha_chassis_group_index_find(sbrec_ha_chassis_grp_by_name,
+                                          target);
+
+    sbrec_ha_chassis_group_index_destroy_row(target);
+
+    return retval;
+}
diff --git a/ovn/lib/chassis-index.h b/ovn/lib/chassis-index.h
index d5e5df926..9bc610ad2 100644
--- a/ovn/lib/chassis-index.h
+++ b/ovn/lib/chassis-index.h
@@ -23,4 +23,8 @@  struct ovsdb_idl_index *chassis_index_create(struct ovsdb_idl *);
 const struct sbrec_chassis *chassis_lookup_by_name(
     struct ovsdb_idl_index *sbrec_chassis_by_name, const char *name);
 
+struct ovsdb_idl_index *ha_chassis_group_index_create(struct ovsdb_idl *idl);
+const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const char *name);
+
 #endif /* ovn/lib/chassis-index.h */
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 63fc0f13b..48092d8f5 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -56,6 +56,7 @@  struct northd_context {
     struct ovsdb_idl *ovnsb_idl;
     struct ovsdb_idl_txn *ovnnb_txn;
     struct ovsdb_idl_txn *ovnsb_txn;
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name;
 };
 
 static const char *ovnnb_db;
@@ -450,10 +451,25 @@  struct ovn_datapath {
     struct ovn_port *l3redirect_port;
     struct ovn_port *localnet_port;
 
+    struct ovs_list lr_list; /* In list of logical router datapaths. */
+    /* The logical router group to which this datapath belongs.
+     * Valid only if it is logical router datapath. NULL otherwise. */
+    struct lrouter_group *lr_group;
+
     /* Port groups related to the datapath, used only when nbs is NOT NULL. */
     struct hmap nb_pgs;
 };
 
+/* A group of logical router datapaths which are connected - either
+ * directly or indirectly.
+ * Each logical router can belong to only one group. */
+struct lrouter_group {
+    struct ovn_datapath **router_dps;
+    int n_router_dps;
+    /* Set of ha_chassis_groups which are associated with the router dps. */
+    struct sset ha_chassis_groups;
+};
+
 struct macam_node {
     struct hmap_node hmap_node;
     struct eth_addr mac_addr; /* Allocated MAC address. */
@@ -483,6 +499,7 @@  ovn_datapath_create(struct hmap *datapaths, const struct uuid *key,
     hmap_init(&od->nb_pgs);
     od->port_key_hint = 0;
     hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
+    od->lr_group = NULL;
     return od;
 }
 
@@ -665,7 +682,7 @@  ovn_datapath_update_external_ids(struct ovn_datapath *od)
 static void
 join_datapaths(struct northd_context *ctx, struct hmap *datapaths,
                struct ovs_list *sb_only, struct ovs_list *nb_only,
-               struct ovs_list *both)
+               struct ovs_list *both, struct ovs_list *lr_list)
 {
     ovs_list_init(sb_only);
     ovs_list_init(nb_only);
@@ -746,6 +763,7 @@  join_datapaths(struct northd_context *ctx, struct hmap *datapaths,
                                      NULL, nbr, NULL);
             ovs_list_push_back(nb_only, &od->list);
         }
+        ovs_list_push_back(lr_list, &od->lr_list);
     }
 }
 
@@ -762,11 +780,12 @@  ovn_datapath_allocate_key(struct hmap *dp_tnlids)
  * Initializes 'datapaths' to contain a "struct ovn_datapath" for every logical
  * switch and router. */
 static void
-build_datapaths(struct northd_context *ctx, struct hmap *datapaths)
+build_datapaths(struct northd_context *ctx, struct hmap *datapaths,
+                struct ovs_list *lr_list)
 {
     struct ovs_list sb_only, nb_only, both;
 
-    join_datapaths(ctx, datapaths, &sb_only, &nb_only, &both);
+    join_datapaths(ctx, datapaths, &sb_only, &nb_only, &both, lr_list);
 
     if (!ovs_list_is_empty(&nb_only)) {
         /* First index the in-use datapath tunnel IDs. */
@@ -1980,6 +1999,13 @@  sbpb_gw_chassis_needs_update(
         return false;
     }
 
+    if (lrp->n_gateway_chassis && !port_binding->ha_chassis_group) {
+        /* If there are gateway chassis in the NB DB, but there is
+         * no corresponding HA chassis group in SB DB we need to
+         * create the HA chassis group in SB DB for this lrp. */
+        return true;
+    }
+
     /* These arrays are used to collect valid Gateway_Chassis and valid
      * Chassis records from the Logical_Router_Port Gateway_Chassis list,
      * we ignore the ones we can't match on the SBDB */
@@ -2049,31 +2075,63 @@  sbpb_gw_chassis_needs_update(
     return false;
 }
 
+static struct sbrec_ha_chassis *
+create_sb_ha_chassis(struct northd_context *ctx,
+                     const struct sbrec_chassis *chassis, int priority)
+{
+    struct sbrec_ha_chassis *sb_ha_chassis =
+        sbrec_ha_chassis_insert(ctx->ovnsb_txn);
+    sbrec_ha_chassis_set_chassis(sb_ha_chassis, chassis);
+    sbrec_ha_chassis_set_priority(sb_ha_chassis, priority);
+    return sb_ha_chassis;
+}
+
 /* This functions translates the gw chassis on the nb database
  * to sb database entries, the only difference is that SB database
  * Gateway_Chassis table references the chassis directly instead
- * of using the name */
+ * of using the name.
+ *
+ * This function also creates a HA Chassis group in SB DB for
+ * the gateway chassis associated to a distributed gateway
+ * router port in the NB DB.
+ *
+ * An upcoming patch will delete the code to create the Gateway chassis
+ * in SB DB.*/
 static void
 copy_gw_chassis_from_nbrp_to_sbpb(
         struct northd_context *ctx,
         struct ovsdb_idl_index *sbrec_chassis_by_name,
         const struct nbrec_logical_router_port *lrp,
-        const struct sbrec_port_binding *port_binding) {
-
-    if (!lrp || !port_binding || !lrp->n_gateway_chassis) {
-        return;
-    }
-
+        const struct sbrec_port_binding *port_binding)
+{
     struct sbrec_gateway_chassis **gw_chassis = NULL;
     int n_gwc = 0;
     int n;
 
+    /* Make use of the new HA chassis group table to support HA
+     * for the distributed gateway router port. We can delete
+     * the old gateway_chassis code once ovn-controller supports
+     * HA chassis group. */
+    const struct sbrec_ha_chassis_group *sb_ha_chassis_group =
+        ha_chassis_group_lookup_by_name(
+            ctx->sbrec_ha_chassis_grp_by_name, lrp->name);
+    if (!sb_ha_chassis_group) {
+        sb_ha_chassis_group = sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
+        sbrec_ha_chassis_group_set_name(sb_ha_chassis_group, lrp->name);
+    }
+
+    struct sbrec_ha_chassis **sb_ha_chassis = xcalloc(lrp->n_gateway_chassis,
+                                                      sizeof *sb_ha_chassis);
+    size_t n_sb_ha_ch = 0;
     /* XXX: This can be improved. This code will generate a set of new
      * Gateway_Chassis and push them all in a single transaction, instead
      * this would be more optimal if we just add/update/remove the rows in
      * the southbound db that need to change. We don't expect lots of
      * changes to the Gateway_Chassis table, but if that proves to be wrong
-     * we should optimize this. */
+     * we should optimize this.
+     *
+     * Note: Remove the below code to add gateway_chassis row in OVN
+     * Southbound db once ovn-controller supports HA chassis group. */
     for (n = 0; n < lrp->n_gateway_chassis; n++) {
         struct nbrec_gateway_chassis *lrp_gwc = lrp->gateway_chassis[n];
         if (!lrp_gwc->chassis_name) {
@@ -2086,6 +2144,8 @@  copy_gw_chassis_from_nbrp_to_sbpb(
 
         gw_chassis = xrealloc(gw_chassis, (n_gwc + 1) * sizeof *gw_chassis);
 
+        /* This code to create gateway_chassis in SB DB needs to be deleted
+         * once ovn-controller supports making use of HA chassis groups. */
         struct sbrec_gateway_chassis *pb_gwc =
             sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
 
@@ -2096,16 +2156,26 @@  copy_gw_chassis_from_nbrp_to_sbpb(
         sbrec_gateway_chassis_set_external_ids(pb_gwc, &lrp_gwc->external_ids);
 
         gw_chassis[n_gwc++] = pb_gwc;
+
+        sb_ha_chassis[n_sb_ha_ch] =
+            create_sb_ha_chassis(ctx, chassis, lrp_gwc->priority);
+        n_sb_ha_ch++;
     }
     sbrec_port_binding_set_gateway_chassis(port_binding, gw_chassis, n_gwc);
     free(gw_chassis);
+
+    sbrec_ha_chassis_group_set_ha_chassis(sb_ha_chassis_group,
+                                          sb_ha_chassis, n_sb_ha_ch);
+    sbrec_port_binding_set_ha_chassis_group(port_binding, sb_ha_chassis_group);
+    free(sb_ha_chassis);
 }
 
 static void
 ovn_port_update_sbrec(struct northd_context *ctx,
                       struct ovsdb_idl_index *sbrec_chassis_by_name,
                       const struct ovn_port *op,
-                      struct hmap *chassis_qdisc_queues)
+                      struct hmap *chassis_qdisc_queues,
+                      struct sset *active_ha_chassis_grps)
 {
     sbrec_port_binding_set_datapath(op->sb, op->od->sb);
     if (op->nbrp) {
@@ -2142,6 +2212,7 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                                                       op->nbrp, op->sb);
                 }
 
+                sset_add(active_ha_chassis_grps, op->nbrp->name);
             } else if (redirect_chassis) {
                 /* Handle ports that had redirect-chassis option attached
                  * to them, and for backwards compatibility convert them
@@ -2152,20 +2223,23 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                 if (chassis) {
                     /* If we found the chassis, and the gw chassis on record
                      * differs from what we expect go ahead and update */
+                    char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
+                                               chassis->name);
                     if (op->sb->n_gateway_chassis != 1
                         || !op->sb->gateway_chassis[0]->chassis
                         || strcmp(op->sb->gateway_chassis[0]->chassis->name,
                                   chassis->name)
                         || op->sb->gateway_chassis[0]->priority != 0) {
+                        /* This code to create gateway_chassis in SB DB needs
+                         * to be deleted once ovn-controller supports making
+                         * use of HA chassis groups. */
+
                         /* Construct a single Gateway_Chassis entry on the
                          * Port_Binding attached to the redirect_chassis
                          * name */
                         struct sbrec_gateway_chassis *gw_chassis =
                             sbrec_gateway_chassis_insert(ctx->ovnsb_txn);
 
-                        char *gwc_name = xasprintf("%s_%s", op->nbrp->name,
-                                chassis->name);
-
                         /* XXX: Again, here, we could just update an existing
                          * Gateway_Chassis, instead of creating a new one
                          * and replacing it */
@@ -2176,8 +2250,31 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                                 &op->nbrp->external_ids);
                         sbrec_port_binding_set_gateway_chassis(op->sb,
                                                                &gw_chassis, 1);
-                        free(gwc_name);
                     }
+
+                    /* Create HA chassis group in SB DB for the
+                     * redirect-chassis option. */
+                    const struct sbrec_ha_chassis_group *sb_ha_ch_grp;
+                    sb_ha_ch_grp = ha_chassis_group_lookup_by_name(
+                        ctx->sbrec_ha_chassis_grp_by_name, gwc_name);
+                    if (!sb_ha_ch_grp) {
+                        sb_ha_ch_grp =
+                            sbrec_ha_chassis_group_insert(ctx->ovnsb_txn);
+                        sbrec_ha_chassis_group_set_name(sb_ha_ch_grp,
+                                                        gwc_name);
+                    }
+
+                    if (sb_ha_ch_grp->n_ha_chassis != 1) {
+                        struct sbrec_ha_chassis **sb_ha_ch =
+                            xcalloc(1, sizeof *sb_ha_ch);
+                        sb_ha_ch[0] = create_sb_ha_chassis(ctx, chassis, 0);
+                        sbrec_ha_chassis_group_set_ha_chassis(sb_ha_ch_grp,
+                                                              sb_ha_ch, 1);
+                    }
+                    sbrec_port_binding_set_ha_chassis_group(op->sb,
+                                                            sb_ha_ch_grp);
+                    sset_add(active_ha_chassis_grps, gwc_name);
+                    free(gwc_name);
                 } else {
                     VLOG_WARN("chassis name '%s' from redirect from logical "
                               " router port '%s' redirect-chassis not found",
@@ -2348,6 +2445,18 @@  cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
     }
 }
 
+static void
+cleanup_sb_ha_chassis_groups(struct northd_context *ctx,
+                             struct sset *active_ha_chassis_groups)
+{
+    const struct sbrec_ha_chassis_group *b, *n;
+    SBREC_HA_CHASSIS_GROUP_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (!sset_contains(active_ha_chassis_groups, b->name)) {
+            sbrec_ha_chassis_group_delete(b);
+        }
+    }
+}
+
 /* Updates the southbound Port_Binding table so that it contains the logical
  * switch ports specified by the northbound database.
  *
@@ -2363,6 +2472,10 @@  build_ports(struct northd_context *ctx,
     struct hmap tag_alloc_table = HMAP_INITIALIZER(&tag_alloc_table);
     struct hmap chassis_qdisc_queues = HMAP_INITIALIZER(&chassis_qdisc_queues);
 
+    /* sset which stores the set of ha chassis group names used. */
+    struct sset active_ha_chassis_grps =
+        SSET_INITIALIZER(&active_ha_chassis_grps);
+
     join_logical_ports(ctx, datapaths, ports, &chassis_qdisc_queues,
                        &tag_alloc_table, &sb_only, &nb_only, &both);
 
@@ -2376,8 +2489,8 @@  build_ports(struct northd_context *ctx,
             tag_alloc_create_new_tag(&tag_alloc_table, op->nbsp);
         }
         ovn_port_update_sbrec(ctx, sbrec_chassis_by_name,
-                              op, &chassis_qdisc_queues);
-
+                              op, &chassis_qdisc_queues,
+                              &active_ha_chassis_grps);
         add_tnlid(&op->od->port_tnlids, op->sb->tunnel_key);
         if (op->sb->tunnel_key > op->od->port_key_hint) {
             op->od->port_key_hint = op->sb->tunnel_key;
@@ -2393,8 +2506,8 @@  build_ports(struct northd_context *ctx,
 
         op->sb = sbrec_port_binding_insert(ctx->ovnsb_txn);
         ovn_port_update_sbrec(ctx, sbrec_chassis_by_name, op,
-                              &chassis_qdisc_queues);
-
+                              &chassis_qdisc_queues,
+                              &active_ha_chassis_grps);
         sbrec_port_binding_set_logical_port(op->sb, op->key);
         sbrec_port_binding_set_tunnel_key(op->sb, tunnel_key);
     }
@@ -2416,6 +2529,8 @@  build_ports(struct northd_context *ctx,
 
     tag_alloc_destroy(&tag_alloc_table);
     destroy_chassis_queues(&chassis_qdisc_queues);
+    cleanup_sb_ha_chassis_groups(ctx, &active_ha_chassis_grps);
+    sset_destroy(&active_ha_chassis_grps);
 }
 
 #define OVN_MIN_MULTICAST 32768
@@ -4046,6 +4161,108 @@  build_stateful(struct ovn_datapath *od, struct hmap *lflows)
     }
 }
 
+static void
+build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
+{
+    ovs_assert((od && od->nbr && od->lr_group));
+
+    if (od->l3dgw_port && od->l3redirect_port) {
+        /* It's a logical router with gateway port. If it
+         * has HA_Chassis_Group associated to it in SB DB, then store the
+         * ha chassis group name. */
+        if (od->l3redirect_port->sb->ha_chassis_group) {
+            sset_add(&od->lr_group->ha_chassis_groups,
+                     od->l3redirect_port->sb->ha_chassis_group->name);
+        }
+    }
+
+    for (size_t i = 0; i < od->nbr->n_ports; i++) {
+        struct ovn_port *router_port =
+            ovn_port_find(ports, od->nbr->ports[i]->name);
+
+        if (!router_port || !router_port->peer) {
+            continue;
+        }
+
+        /* Get the peer logical switch/logical router datapath. */
+        struct ovn_datapath *peer_dp = router_port->peer->od;
+        if (peer_dp->nbr) {
+            if (!peer_dp->lr_group) {
+                peer_dp->lr_group = od->lr_group;
+                od->lr_group->router_dps[od->lr_group->n_router_dps++]
+                    = peer_dp;
+                build_lrouter_groups__(ports, peer_dp);
+            }
+        } else {
+            for (size_t j = 0; j < peer_dp->n_router_ports; j++) {
+                if (!peer_dp->router_ports[j]->peer) {
+                    /* If there is no peer port connecting to the
+                    * router datapath, ignore it. */
+                    continue;
+                }
+
+                struct ovn_datapath *router_dp;
+                router_dp = peer_dp->router_ports[j]->peer->od;
+                if (router_dp == od) {
+                    continue;
+                }
+
+                if (router_dp->lr_group == od->lr_group) {
+                    /* 'router_dp' and 'od' already belong to the same
+                    * lrouter group. Nothing to be done. */
+                    continue;
+                }
+
+                router_dp->lr_group = od->lr_group;
+                od->lr_group->router_dps[od->lr_group->n_router_dps++]
+                    = router_dp;
+                build_lrouter_groups__(ports, router_dp);
+            }
+        }
+    }
+}
+
+/* Adds each logical router into a logical router group. All the
+ * logical routers which belong to a group are connected to
+ * each other either directly or indirectly (via transit logical switches
+ * in between).
+ *
+ * Suppose if 'lr_list' has lr0, lr1, lr2, lr3, lr4, lr5
+ * and the topology is like
+ *  sw0 <-> lr0 <-> sw1 <-> lr1 <->sw2 <-> lr2
+ *  sw3 <-> lr3 <-> lr4 <-> sw5
+ *  sw6 <-> lr5 <-> sw7
+ * Then 3 groups are created.
+ * Group 1 -> lr0, lr1 and lr2
+ *            lr0, lr1 and lr2's ovn_datapath->lr_group will point to this
+ *            group. This means sw0's logical ports can send packets to sw2's
+ *            logical ports if proper static route's are added.
+ * Group 2 -> lr3 and lr4
+ *            lr3 and lr4's ovn_datapath->lr_group will point to this group.
+ * Group 3 -> lr5
+ *
+ * Each logical router can belong to only one group.
+ */
+static void
+build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
+{
+    struct ovn_datapath *od;
+    size_t n_router_dps = ovs_list_size(lr_list);
+
+    LIST_FOR_EACH (od, lr_list, lr_list) {
+        if (!od->lr_group) {
+            od->lr_group = xzalloc(sizeof *od->lr_group);
+            /* Each logical router group can have max
+             * 'n_router_dps'. So allocate enough memory. */
+            od->lr_group->router_dps = xcalloc(sizeof *od, n_router_dps);
+            od->lr_group->router_dps[od->lr_group->n_router_dps] = od;
+            od->lr_group->n_router_dps = 1;
+            sset_init(&od->lr_group->ha_chassis_groups);
+            build_lrouter_groups__(ports, od);
+        }
+    }
+}
+
 static void
 build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
                     struct hmap *port_groups, struct hmap *lflows,
@@ -7221,8 +7438,27 @@  sync_dns_entries(struct northd_context *ctx, struct hmap *datapaths)
 }
 
 static void
-destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports)
+destroy_datapaths_and_ports(struct hmap *datapaths, struct hmap *ports,
+                            struct ovs_list *lr_list)
 {
+    struct ovn_datapath *router_dp;
+    LIST_FOR_EACH_POP (router_dp, lr_list, lr_list) {
+        if (router_dp->lr_group) {
+            struct lrouter_group *lr_group = router_dp->lr_group;
+
+            for (size_t i = 0; i < router_dp->lr_group->n_router_dps; i++) {
+                if (router_dp->lr_group->router_dps[i] != router_dp) {
+                    router_dp->lr_group->router_dps[i]->lr_group = NULL;
+                }
+            }
+
+            free(lr_group->router_dps);
+            sset_destroy(&lr_group->ha_chassis_groups);
+            free(lr_group);
+            router_dp->lr_group = NULL;
+        }
+    }
+
     struct ovn_datapath *dp, *next_dp;
     HMAP_FOR_EACH_SAFE (dp, next_dp, key_node, datapaths) {
         ovn_datapath_destroy(datapaths, dp);
@@ -7240,16 +7476,19 @@  static void
 ovnnb_db_run(struct northd_context *ctx,
              struct ovsdb_idl_index *sbrec_chassis_by_name,
              struct ovsdb_idl_loop *sb_loop,
-             struct hmap *datapaths, struct hmap *ports)
+             struct hmap *datapaths, struct hmap *ports,
+             struct ovs_list *lr_list)
 {
     if (!ctx->ovnsb_txn || !ctx->ovnnb_txn) {
         return;
     }
     struct hmap port_groups;
-    build_datapaths(ctx, datapaths);
+
+    build_datapaths(ctx, datapaths, lr_list);
     build_ports(ctx, sbrec_chassis_by_name, datapaths, ports);
     build_ipam(datapaths, ports);
     build_port_group_lswitches(ctx, &port_groups, ports);
+    build_lrouter_groups(ports, lr_list);
     build_lflows(ctx, datapaths, ports, &port_groups);
 
     sync_address_sets(ctx);
@@ -7309,13 +7548,131 @@  ovnnb_db_run(struct northd_context *ctx,
     cleanup_macam(&macam);
 }
 
+/* Stores the list of chassis which references an ha_chassis_group.
+ */
+struct ha_ref_chassis_info {
+    const struct sbrec_ha_chassis_group *ha_chassis_group;
+    struct sbrec_chassis **ref_chassis;
+    size_t n_ref_chassis;
+    size_t free_slots;
+};
+
+static void
+add_to_ha_ref_chassis_info(struct ha_ref_chassis_info *ref_ch_info,
+                           const struct sbrec_chassis *chassis)
+{
+    for (size_t j = 0; j < ref_ch_info->n_ref_chassis; j++) {
+        if (ref_ch_info->ref_chassis[j] == chassis) {
+           return;
+        }
+    }
+
+    /* Allocate space for 3 chassis at a time. */
+    if (!ref_ch_info->free_slots) {
+        ref_ch_info->ref_chassis =
+            xrealloc(ref_ch_info->ref_chassis,
+                     sizeof *ref_ch_info->ref_chassis *
+                     (ref_ch_info->n_ref_chassis + 3));
+        ref_ch_info->free_slots = 3;
+    }
+
+    ref_ch_info->ref_chassis[ref_ch_info->n_ref_chassis] =
+        CONST_CAST(struct sbrec_chassis *, chassis);
+    ref_ch_info->n_ref_chassis++;
+    ref_ch_info->free_slots--;
+}
+
+static void
+update_sb_ha_group_ref_chassis(struct shash *ha_ref_chassis_map)
+{
+    struct shash_node *node, *next;
+    SHASH_FOR_EACH_SAFE (node, next, ha_ref_chassis_map) {
+        struct ha_ref_chassis_info *ha_ref_info = node->data;
+        sbrec_ha_chassis_group_set_ref_chassis(ha_ref_info->ha_chassis_group,
+                                               ha_ref_info->ref_chassis,
+                                               ha_ref_info->n_ref_chassis);
+        free(ha_ref_info->ref_chassis);
+        free(ha_ref_info);
+        shash_delete(ha_ref_chassis_map, node);
+    }
+}
+
+/* This function checks if the port binding 'sb' references
+ * a HA chassis group.
+ * Eg. Suppose a distributed logical router port - lr0-public
+ * uses an HA chassis group - hagrp1 and if hagrp1 has 3 ha
+ * chassis - gw1, gw2 and gw3.
+ * Or
+ * If the distributed logical router port - lr0-public has
+ * 3 gateway chassis - gw1, gw2 and gw3.
+ * ovn-northd creates ha chassis group - hagrp1 in SB DB
+ * and adds gw1, gw2 and gw3 to its ha_chassis list.
+ *
+ * If port binding 'sb' represents a logical switch port 'p1'
+ * and its logical switch is connected to the logical router
+ * 'lr0' directly or indirectly (i.e p1's logical switch is
+ *  connected to a router 'lr1' and 'lr1' has a path to lr0 via
+ *  transit logical switches) and 'sb' is claimed by chassis - 'c1' then
+ * this function adds c1 to the list of the reference chassis
+ *  - 'ref_chassis' of hagrp1.
+ */
+static void
+build_ha_chassis_group_ref_chassis(struct northd_context *ctx,
+                                   const struct sbrec_port_binding *sb,
+                                   struct ovn_port *op,
+                                   struct shash *ha_ref_chassis_map)
+{
+    struct lrouter_group *lr_group = NULL;
+    for (size_t i = 0; i < op->od->n_router_ports; i++) {
+        if (!op->od->router_ports[i]->peer) {
+            continue;
+        }
+
+        lr_group = op->od->router_ports[i]->peer->od->lr_group;
+        /* If a logical switch has multiple router ports, then
+         * all the logical routers belong to the same logical
+         * router group. */
+        break;
+    }
+
+    if (!lr_group) {
+        return;
+    }
+
+    const char *ha_group_name;
+    SSET_FOR_EACH (ha_group_name, &lr_group->ha_chassis_groups) {
+        const struct sbrec_ha_chassis_group *sb_ha_chassis_grp;
+        sb_ha_chassis_grp = ha_chassis_group_lookup_by_name(
+            ctx->sbrec_ha_chassis_grp_by_name, ha_group_name);
+
+        if (sb_ha_chassis_grp) {
+            struct ha_ref_chassis_info *ref_ch_info =
+            shash_find_data(ha_ref_chassis_map, sb_ha_chassis_grp->name);
+            ovs_assert(ref_ch_info);
+            add_to_ha_ref_chassis_info(ref_ch_info, sb->chassis);
+        }
+    }
+}
+
 /* Handle changes to the 'chassis' column of the 'Port_Binding' table.  When
  * this column is not empty, it means we need to set the corresponding logical
  * port as 'up' in the northbound DB. */
 static void
-update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
+handle_port_binding_changes(struct northd_context *ctx, struct hmap *ports,
+                            struct shash *ha_ref_chassis_map)
 {
     const struct sbrec_port_binding *sb;
+    bool build_ha_chassis_ref = false;
+    if (ctx->ovnsb_txn) {
+        const struct sbrec_ha_chassis_group *ha_ch_grp;
+        SBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->ovnsb_idl) {
+            struct ha_ref_chassis_info *ref_ch_info =
+                xzalloc(sizeof *ref_ch_info);
+            ref_ch_info->ha_chassis_group = ha_ch_grp;
+            build_ha_chassis_ref = true;
+            shash_add(ha_ref_chassis_map, ha_ch_grp->name, ref_ch_info);
+        }
+    }
 
     SBREC_PORT_BINDING_FOR_EACH(sb, ctx->ovnsb_idl) {
         struct ovn_port *op = ovn_port_find(ports, sb->logical_port);
@@ -7331,6 +7688,13 @@  update_logical_port_status(struct northd_context *ctx, struct hmap *ports)
         if (!op->nbsp->up || *op->nbsp->up != up) {
             nbrec_logical_switch_port_set_up(op->nbsp, &up, 1);
         }
+
+        if (build_ha_chassis_ref && ctx->ovnsb_txn && sb->chassis) {
+            /* Check and add the chassis which has claimed this 'sb'
+             * to the ha chassis group's ref_chassis if required. */
+            build_ha_chassis_group_ref_chassis(ctx, sb, op,
+                                               ha_ref_chassis_map);
+        }
     }
 }
 
@@ -7660,8 +8024,13 @@  ovnsb_db_run(struct northd_context *ctx, struct ovsdb_idl_loop *sb_loop,
         return;
     }
 
-    update_logical_port_status(ctx, ports);
+    struct shash ha_ref_chassis_map = SHASH_INITIALIZER(&ha_ref_chassis_map);
+    handle_port_binding_changes(ctx, ports, &ha_ref_chassis_map);
     update_northbound_cfg(ctx, sb_loop);
+    if (ctx->ovnsb_txn) {
+        update_sb_ha_group_ref_chassis(&ha_ref_chassis_map);
+    }
+    shash_destroy(&ha_ref_chassis_map);
 }
 
 static void
@@ -7670,12 +8039,14 @@  ovn_db_run(struct northd_context *ctx,
            struct ovsdb_idl_loop *ovnsb_idl_loop)
 {
     struct hmap datapaths, ports;
+    struct ovs_list lr_list;
+    ovs_list_init(&lr_list);
     hmap_init(&datapaths);
     hmap_init(&ports);
     ovnnb_db_run(ctx, sbrec_chassis_by_name, ovnsb_idl_loop,
-                 &datapaths, &ports);
+                 &datapaths, &ports, &lr_list);
     ovnsb_db_run(ctx, ovnsb_idl_loop, &ports);
-    destroy_datapaths_and_ports(&datapaths, &ports);
+    destroy_datapaths_and_ports(&datapaths, &ports, &lr_list);
 }
 
 static void
@@ -7840,6 +8211,8 @@  main(int argc, char *argv[])
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_port_binding_col_chassis);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl,
                          &sbrec_port_binding_col_gateway_chassis);
+    ovsdb_idl_add_column(ovnsb_idl_loop.idl,
+                         &sbrec_port_binding_col_ha_chassis_group);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl,
                          &sbrec_gateway_chassis_col_chassis);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_gateway_chassis_col_name);
@@ -7904,9 +8277,30 @@  main(int argc, char *argv[])
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_name);
 
+    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_priority);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_col_external_ids);
+
+    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis_group);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_name);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_ha_chassis);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_external_ids);
+    add_column_noalert(ovnsb_idl_loop.idl,
+                       &sbrec_ha_chassis_group_col_ref_chassis);
+
     struct ovsdb_idl_index *sbrec_chassis_by_name
         = chassis_index_create(ovnsb_idl_loop.idl);
 
+    struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name
+        = ha_chassis_group_index_create(ovnsb_idl_loop.idl);
+
     /* Ensure that only a single ovn-northd is active in the deployment by
      * acquiring a lock called "ovn_northd" on the southbound database
      * and then only performing DB transactions if the lock is held. */
@@ -7921,6 +8315,7 @@  main(int argc, char *argv[])
             .ovnnb_txn = ovsdb_idl_loop_run(&ovnnb_idl_loop),
             .ovnsb_idl = ovnsb_idl_loop.idl,
             .ovnsb_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop),
+            .sbrec_ha_chassis_grp_by_name = sbrec_ha_chassis_grp_by_name,
         };
 
         if (!had_lock && ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index 10a59649a..48d27b960 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "5.14.1",
-    "cksum": "3758097843 20509",
+    "version": "5.15.0",
+    "cksum": "1078795414 21917",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -271,6 +271,12 @@ 
                                      "refType": "strong"},
                              "min": 0,
                              "max": "unlimited"}},
+                "ha_chassis_group": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis_Group",
+                                     "refType": "weak"},
+                             "min": 0,
+                             "max": 1}},
                 "options": {
                     "type": {"key": "string",
                              "value": "string",
@@ -392,5 +398,29 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "indexes": [["name"]],
-            "isRoot": false}}
+            "isRoot": false},
+        "HA_Chassis": {
+            "columns": {
+                "chassis_name": {"type": "string"},
+                "priority": {"type": {"key": {"type": "integer",
+                                              "minInteger": 0,
+                                              "maxInteger": 32767}}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "isRoot": false},
+        "HA_Chassis_Group": {
+            "columns": {
+                "name": {"type": "string"},
+                "ha_chassis": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis",
+                                     "refType": "strong"},
+                             "min": 0,
+                             "max": "unlimited"}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "indexes": [["name"]],
+            "isRoot": true}}
     }
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 61a57110a..75d27a0eb 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1522,6 +1522,12 @@ 
     </column>
 
     <column name="gateway_chassis">
+      <p>
+        This column is ignored if the column
+        <ref column="ha_chassis_group" table="Logical_Router_Port"/>.
+        is set.
+      </p>
+
       <p>
         If set, this indicates that this logical router port represents
         a distributed gateway port that connects this router to a logical
@@ -1549,6 +1555,24 @@ 
       </p>
     </column>
 
+    <column name="ha_chassis_group">
+      <p>
+        If set, this indicates that this logical router port represents
+        a distributed gateway port that connects this router to a logical
+        switch with a localnet port.  There may be at most one such
+        logical router port on each logical router. The HA chassis which
+        are part of the HA chassis group will provide the gateway high
+        availability. Please see the <ref table="HA_Chassis_Group"/> for
+        more details.
+      </p>
+
+      <p>
+        When this column is set, the column
+        <ref column="gateway_chassis" table="Logical_Router_Port"/> will
+        be ignored.
+      </p>
+    </column>
+
     <column name="networks">
       <p>
         The IP addresses and netmasks of the router.  For example,
@@ -2622,4 +2646,55 @@ 
     </group>
   </table>
 
+  <table name="HA_Chassis_Group">
+    <p>
+      Table representing a group of chassis which can provide High availability
+      services. Each chassis in the group is represented by the table
+      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
+      be the master of this group. If the master chassis failover is detected,
+      the HA chassis with the next higher priority takes over the
+      responsibility of providing the HA. If a distributed gateway router port
+      references a row in this table, then the master HA chassis in this group
+      provides the gateway functionality.
+    </p>
+
+    <column name="name">
+      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
+    </column>
+
+    <column name="ha_chassis">
+      A list of HA chassis which belongs to this group.
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
+
+  <table name="HA_Chassis">
+    <column name="chassis_name">
+      <p>
+        Name of the chassis which is part of the HA chassis group.
+        The value must match the
+        <ref db="OVN_Southbound" table="Chassis" column="name"/> column
+        of the <ref db="OVN_Southbound" table="Chassis"/> table in the
+        <ref db="OVN_Southbound"/> database.
+      </p>
+    </column>
+
+    <column name="priority">
+      <p>
+        Priority of the chassis. Chassis with highest priority will be
+        the master.
+      </p>
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
 </database>
diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
index cc8c771a7..e05f964b0 100644
--- a/ovn/ovn-sb.ovsschema
+++ b/ovn/ovn-sb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Southbound",
-    "version": "2.1.0",
-    "cksum": "3806083220 15332",
+    "version": "2.2.0",
+    "cksum": "2001312516 17220",
     "tables": {
         "SB_Global": {
             "columns": {
@@ -147,6 +147,12 @@ 
                                      "refType": "strong"},
                              "min": 0,
                              "max": "unlimited"}},
+                "ha_chassis_group": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis_Group",
+                                     "refType": "weak"},
+                             "min": 0,
+                             "max": 1}},
                 "options": {
                      "type": {"key": "string",
                               "value": "string",
@@ -309,4 +315,35 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "indexes": [["name"]],
-            "isRoot": false}}}
+            "isRoot": false},
+        "HA_Chassis": {
+            "columns": {
+                "chassis": {"type": {"key": {"type": "uuid",
+                                             "refTable": "Chassis",
+                                             "refType": "weak"},
+                                     "min": 0, "max": 1}},
+                "priority": {"type": {"key": {"type": "integer",
+                                              "minInteger": 0,
+                                              "maxInteger": 32767}}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "isRoot": false},
+        "HA_Chassis_Group": {
+            "columns": {
+                "name": {"type": "string"},
+                "ha_chassis": {
+                    "type": {"key": {"type": "uuid",
+                                     "refTable": "HA_Chassis",
+                                     "refType": "strong"},
+                             "min": 0,
+                             "max": "unlimited"}},
+                "ref_chassis": {"type": {"key": {"type": "uuid",
+                                                 "refTable": "Chassis",
+                                                 "refType": "weak"},
+                                         "min": 0, "max": "unlimited"}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "indexes": [["name"]],
+            "isRoot": true}}}
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 4e080abff..5c4a852a5 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -2246,6 +2246,16 @@  tcp.flags = RST;
         </p>
       </column>
 
+      <column name="ha_chassis_group">
+        <p>
+          This should only be populated for ports with
+          <ref column="type"/> set to <code>chassisredirect</code>.
+          This column defines the HA chassis group with a list of
+          HA chassis used as gateways where traffic will be redirected
+          through.
+        </p>
+      </column>
+
       <column name="tunnel_key">
         <p>
           A number that represents the logical port in the key (e.g. STT key or
@@ -3339,4 +3349,57 @@  tcp.flags = RST;
       <column name="external_ids"/>
     </group>
   </table>
+
+  <table name="HA_Chassis">
+    <column name="chassis">
+      <p>
+        The <ref table="Chassis"/> which provides the HA functionality.
+        </p>
+    </column>
+
+    <column name="priority">
+      <p>
+        Priority of the HA chassis. Chassis with highest priority will be
+        the master in the HA chassis group.
+      </p>
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
+
+  <table name="HA_Chassis_Group">
+    <p>
+      Table representing a group of chassis which can provide High availability
+      services. Each chassis in the group is represented by the table
+      <ref table="HA_Chassis"/>. The HA chassis with highest priority will
+      be the master of this group. If the master chassis failover is detected,
+      the HA chassis with the next higher priority takes over the
+      responsibility of providing the HA. If <ref db="OVN_Southbound"
+      table="Port_Binding" column="ha_chassis_group"/> column of the table
+      <ref db="OVN_Southbound" table="Port_Binding"/> references this table,
+      then this HA chassis group provides the gateway functionality and
+      redirects the gateway traffic to the master of this group.
+    </p>
+    <column name="name">
+      Name of the <ref table="HA_Chassis_Group"/>. Name should be unique.
+    </column>
+
+    <column name="ha_chassis">
+      A list of <ref table="HA_Chassis"/> which belongs to this group.
+    </column>
+
+    <column name="ref_chassis">
+      A list of <ref table="chassis"/> which references this HA chassis group.
+    </column>
+
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
 </database>
diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index a7a9c2701..483602970 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -908,6 +908,47 @@ 
       </dd>
     </dl>
 
+    <h1> HA Chassis Group commands</h1>
+
+    <dl>
+      <dt><code>ha-chassis-group-add</code> <var>group</var></dt>
+      <dd>
+        Creates a new HA chassis group in the <code>HA_Chassis_Group</code>
+        table named <code>group</code>.
+      </dd>
+
+      <dt><code>ha-chassis-group-del</code> <var>group</var></dt>
+      <dd>
+        Deletes the HA chassis group <code>group</code>. It is an error if
+        <code>group</code> does not exist.
+      </dd>
+
+      <dt><code>ha-chassis-group-list</code></dt>
+      <dd>
+        Lists the HA chassis group <code>group</code> along with the
+        <code>HA chassis</code> if any associated with it.
+      </dd>
+
+      <dt><code>ha-chassis-group-add-chassis</code> <var>group</var>
+      <var>chassis</var> <var>priority</var></dt>
+      <dd>
+        Adds a new HA chassis <code>chassis</code> to the
+        HA Chassis group <code>group</code> with the specified priority.
+        If the <code>chassis</code> already exists, then the
+        <code>priority</code> is updated.
+        The <code>chassis</code> should be the name of the chassis in the
+        <code>OVN_Southbound</code>.
+      </dd>
+
+      <dt><code>ha-chassis-group-remove-chassis</code> <var>group</var>
+      <var>chassis</var></dt>
+      <dd>
+        Removes the HA chassis <code>chassis</code> from the HA chassis
+        group <code>group</code>. It is an error if <code>chassis</code> does
+        not exist.
+      </dd>
+    </dl>
+
     <h1>Database Commands</h1>
     <p>These commands query and modify the contents of <code>ovsdb</code> tables.
     They are a slight abstraction of the <code>ovsdb</code> interface and
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index 2727b410a..a2954abd4 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -694,6 +694,15 @@  Port group commands:\n\
   pg-set-ports PG PORTS       Set PORTS on port group PG\n\
   pg-del PG                   Delete port group PG\n\
 \n\
+HA chassis group commands:\n\
+  ha-chassis-group-add GRP  Create an HA chassis group GRP\n\
+  ha-chassis-group-del GRP  Delete the HA chassis group GRP\n\
+  ha-chassis-group-list     List the HA chassis groups\n\
+  ha-chassis-group-add-chassis GRP CHASSIS [PRIORITY] Adds an HA\
+chassis with optional PRIORITY to the HA chassis group GRP\n\
+  ha-chassis-group-del-chassis GRP CHASSIS Deletes the HA chassis\
+CHASSIS from the HA chassis group GRP\n\
+\n\
 %s\
 %s\
 \n\
@@ -4756,6 +4765,201 @@  cmd_pg_del(struct ctl_context *ctx)
     nbrec_port_group_delete(pg);
 }
 
+static const struct nbrec_ha_chassis_group*
+ha_chassis_group_by_name_or_uuid(struct ctl_context *ctx, const char *id,
+                                 bool must_exist)
+{
+    struct uuid ch_grp_uuid;
+    const struct nbrec_ha_chassis_group *ha_ch_grp = NULL;
+    bool is_uuid = uuid_from_string(&ch_grp_uuid, id);
+    if (is_uuid) {
+        ha_ch_grp = nbrec_ha_chassis_group_get_for_uuid(ctx->idl,
+                                                        &ch_grp_uuid);
+    }
+
+    if (!ha_ch_grp) {
+        const struct nbrec_ha_chassis_group *iter;
+        NBREC_HA_CHASSIS_GROUP_FOR_EACH (iter, ctx->idl) {
+            if (!strcmp(iter->name, id)) {
+                ha_ch_grp = iter;
+                break;
+            }
+        }
+    }
+
+    if (!ha_ch_grp && must_exist) {
+        ctx->error = xasprintf("%s: ha_chassi_group %s not found",
+                               id, is_uuid ? "UUID" : "name");
+    }
+
+    return ha_ch_grp;
+}
+
+static void
+cmd_ha_ch_grp_add(struct ctl_context *ctx)
+{
+    const char *name = ctx->argv[1];
+    struct nbrec_ha_chassis_group *ha_ch_grp =
+        nbrec_ha_chassis_group_insert(ctx->txn);
+    nbrec_ha_chassis_group_set_name(ha_ch_grp, name);
+}
+
+static void
+cmd_ha_ch_grp_del(struct ctl_context *ctx)
+{
+    const char *name_or_id = ctx->argv[1];
+
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, name_or_id, true);
+
+    if (ha_ch_grp) {
+        nbrec_ha_chassis_group_delete(ha_ch_grp);
+    }
+}
+
+static void
+cmd_ha_ch_grp_list(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp;
+
+    NBREC_HA_CHASSIS_GROUP_FOR_EACH (ha_ch_grp, ctx->idl) {
+        ds_put_format(&ctx->output, UUID_FMT " (%s)\n",
+                      UUID_ARGS(&ha_ch_grp->header_.uuid), ha_ch_grp->name);
+        const struct nbrec_ha_chassis *ha_ch;
+        for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+            ha_ch = ha_ch_grp->ha_chassis[i];
+            ds_put_format(&ctx->output,
+                          "    "UUID_FMT " (%s)\n"
+                          "    priority %lu\n\n",
+                          UUID_ARGS(&ha_ch->header_.uuid), ha_ch->chassis_name,
+                          ha_ch->priority);
+        }
+        ds_put_cstr(&ctx->output, "\n");
+    }
+}
+
+static void
+cmd_ha_ch_grp_add_chassis(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    int64_t priority;
+    char *error = parse_priority(ctx->argv[3], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            break;
+        }
+    }
+
+    if (ha_chassis) {
+        nbrec_ha_chassis_set_priority(ha_chassis, priority);
+        return;
+    }
+
+    ha_chassis = nbrec_ha_chassis_insert(ctx->txn);
+    nbrec_ha_chassis_set_chassis_name(ha_chassis, chassis_name);
+    nbrec_ha_chassis_set_priority(ha_chassis, priority);
+
+    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
+
+    struct nbrec_ha_chassis **new_ha_chs =
+        xmalloc(sizeof *new_ha_chs * (ha_ch_grp->n_ha_chassis + 1));
+    nullable_memcpy(new_ha_chs, ha_ch_grp->ha_chassis,
+                    sizeof *new_ha_chs * ha_ch_grp->n_ha_chassis);
+    new_ha_chs[ha_ch_grp->n_ha_chassis] =
+        CONST_CAST(struct nbrec_ha_chassis *, ha_chassis);
+    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_chs,
+                                          ha_ch_grp->n_ha_chassis + 1);
+    free(new_ha_chs);
+}
+
+static void
+cmd_ha_ch_grp_remove_chassis(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+    size_t idx = 0;
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            idx = i;
+            break;
+        }
+    }
+
+    if (!ha_chassis) {
+        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
+                               "chassis group", chassis_name, ctx->argv[1]);
+        return;
+    }
+
+    struct nbrec_ha_chassis **new_ha_ch
+        = xmemdup(ha_ch_grp->ha_chassis,
+                  sizeof *new_ha_ch * ha_ch_grp->n_ha_chassis);
+    new_ha_ch[idx] = new_ha_ch[ha_ch_grp->n_ha_chassis - 1];
+    nbrec_ha_chassis_group_verify_ha_chassis(ha_ch_grp);
+    nbrec_ha_chassis_group_set_ha_chassis(ha_ch_grp, new_ha_ch,
+                                          ha_ch_grp->n_ha_chassis - 1);
+    free(new_ha_ch);
+    nbrec_ha_chassis_delete(ha_chassis);
+}
+
+static void
+cmd_ha_ch_grp_set_chassis_prio(struct ctl_context *ctx)
+{
+    const struct nbrec_ha_chassis_group *ha_ch_grp =
+        ha_chassis_group_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+    if (!ha_ch_grp) {
+        return;
+    }
+
+    int64_t priority;
+    char *error = parse_priority(ctx->argv[3], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    const char *chassis_name = ctx->argv[2];
+    struct nbrec_ha_chassis *ha_chassis = NULL;
+
+    for (size_t i = 0; i < ha_ch_grp->n_ha_chassis; i++) {
+        if (!strcmp(ha_ch_grp->ha_chassis[i]->chassis_name, chassis_name)) {
+            ha_chassis = ha_ch_grp->ha_chassis[i];
+            break;
+        }
+    }
+
+    if (!ha_chassis) {
+        ctx->error = xasprintf("%s: ha chassis not found in %s ha "
+                               "chassis group", chassis_name, ctx->argv[1]);
+        return;
+    }
+
+    nbrec_ha_chassis_set_priority(ha_chassis, priority);
+}
+
 static const struct ctl_table_class tables[NBREC_N_TABLES] = {
     [NBREC_TABLE_DHCP_OPTIONS].row_ids
     = {{&nbrec_logical_switch_port_col_name, NULL,
@@ -4790,6 +4994,9 @@  static const struct ctl_table_class tables[NBREC_N_TABLES] = {
     = {&nbrec_port_group_col_name, NULL, NULL},
 
     [NBREC_TABLE_ACL].row_ids[0] = {&nbrec_acl_col_name, NULL, NULL},
+
+    [NBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
+    = {&nbrec_ha_chassis_group_col_name, NULL, NULL},
 };
 
 static char *
@@ -5230,6 +5437,20 @@  static const struct ctl_command_syntax nbctl_commands[] = {
     {"pg-set-ports", 2, INT_MAX, "", NULL, cmd_pg_set_ports, NULL, "", RW },
     {"pg-del", 1, 1, "", NULL, cmd_pg_del, NULL, "", RW },
 
+    /* HA chassis group commands. */
+    {"ha-chassis-group-add", 1, 1, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_add, NULL, "", RW },
+    {"ha-chassis-group-del", 1, 1, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_del, NULL, "", RW },
+    {"ha-chassis-group-list", 0, 0, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_list, NULL, "", RO },
+    {"ha-chassis-group-add-chassis", 3, 3, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_add_chassis, NULL, "", RW },
+    {"ha-chassis-group-remove-chassis", 2, 2, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_remove_chassis, NULL, "", RW },
+    {"ha-chassis-group-set-chassis-prio", 3, 3, "[CHASSIS GROUP]", NULL,
+      cmd_ha_ch_grp_set_chassis_prio, NULL, "", RW },
+
     {NULL, 0, 0, NULL, NULL, NULL, NULL, "", RO},
 };
 
diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
index ee97a4710..c5ff9318f 100644
--- a/ovn/utilities/ovn-sbctl.c
+++ b/ovn/utilities/ovn-sbctl.c
@@ -1189,6 +1189,12 @@  static const struct ctl_table_class tables[SBREC_N_TABLES] = {
 
     [SBREC_TABLE_ADDRESS_SET].row_ids[0]
     = {&sbrec_address_set_col_name, NULL, NULL},
+
+    [SBREC_TABLE_HA_CHASSIS_GROUP].row_ids[0]
+    = {&sbrec_ha_chassis_group_col_name, NULL, NULL},
+
+    [SBREC_TABLE_HA_CHASSIS].row_ids[0]
+    = {&sbrec_ha_chassis_col_chassis, NULL, NULL},
 };
 
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 1878eb2df..b4e8995be 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -34,6 +34,33 @@  AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
 AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_port="cr-alice" | grep $gwc2_uuid | wc -l], [0], [1
 ])
 
+# There should be one ha_chassis_group with the name "alice"
+ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
+ha_chassis_group name="alice"`
+
+AT_CHECK([test $ha_chassi_grp_name = alice])
+
+ha_chgrp_uuid=`ovn-sbctl --bare --columns _uuid find ha_chassis_group name=alice`
+
+AT_CHECK([ovn-sbctl --bare --columns ha_chassis_group find port_binding \
+logical_port="cr-alice" | grep $ha_chgrp_uuid | wc -l], [0], [1
+])
+
+ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
+# Trim the spaces.
+ha_ch=`echo $ha_ch | sed 's/ //g'`
+
+ha_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
+do
+    ha_ch_list="$ha_ch_list $i"
+done
+
+# Trim the spaces.
+ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
+
+AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
+
 # delete the 2nd Gateway_Chassis on NBDB for alice port
 
 ovn-nbctl --wait=sb set Logical_Router_Port alice gateway_chassis=${nb_gwc1_uuid}
@@ -45,6 +72,10 @@  AT_CHECK([ovn-sbctl --bare --columns gateway_chassis find port_binding logical_p
 
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
 
+# There should be only 1 row in ha_chassis SB DB table.
+AT_CHECK([ovn-sbctl --bare --columns _uuid list ha_chassis | wc -l], [0], [1
+])
+
 # delete all the gateway_chassis on NBDB for alice port
 
 ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
@@ -54,6 +85,12 @@  ovn-nbctl --wait=sb clear Logical_Router_Port alice gateway_chassis
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw1], [0], [])
 AT_CHECK([ovn-sbctl find Gateway_Chassis name=alice_gw2], [0], [])
 
+# expect that the ha_chassis doesn't exist anymore
+AT_CHECK([ovn-sbctl list ha_chassis | wc -l], [0], [0
+])
+AT_CHECK([ovn-sbctl list ha_chassis_group | wc -l], [0], [0
+])
+
 AT_CLEANUP
 
 AT_SETUP([ovn -- check Gateway_Chassis propagation from NBDB to SBDB backwards compatibility])
@@ -301,3 +338,228 @@  as northd
 OVS_APP_EXIT_AND_WAIT([ovn-northd])
 
 AT_CLEANUP
+
+AT_SETUP([ovn -- check HA_Chassis_Group propagation from NBDB to SBDB])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+ovn-nbctl --wait=sb ha-chassis-group-add hagrp1
+
+# ovn-northd should not create HA chassis group and HA chassis rows
+# unless the HA chassis group in OVN NB DB is associated to
+# a logical router port. ovn-northd still doesn't support
+# associating a HA chassis group to a logical router port.
+AT_CHECK([ovn-sbctl --bare --columns name find ha_chassis_group name="hagrp1" \
+| wc -l], [0], [0
+])
+
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch1 30
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch2 20
+ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 ch3 10
+
+# There should be no HA_Chassis rows in SB DB.
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep -v '-' | wc -l ], [0], [0
+])
+
+# Add chassis ch1.
+ovn-sbctl chassis-add ch1 geneve 127.0.0.2
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl list chassis | grep ch1 | wc -l`])
+
+# There should be no HA_Chassis rows
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep -v '-' | wc -l ], [0], [0
+])
+
+ovn-nbctl ha-chassis-group-del hagrp1
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
+
+# Create a logical router port and attach Gateway chassis.
+# ovn-northd should create HA chassis group rows in SB DB.
+ovn-nbctl lr-add lr0
+
+ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
+ovn-nbctl lrp-set-gateway-chassis lr0-public ch1 20
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns name find ha_chassis_group \
+name="lr0-public" | wc -l`])
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns _uuid \
+find ha_chassis | wc -l`])
+
+ovn-nbctl lrp-set-gateway-chassis lr0-public ch2 10
+
+ovn-sbctl --bare --columns _uuid find ha_chassis
+OVS_WAIT_UNTIL([test 2 = `ovn-sbctl list ha_chassis | grep chassis | wc -l`])
+
+# Test if 'ref_chassis' column is properly set or not in
+# SB DB ha_chassis_group.
+ovn-nbctl ls-add sw0
+ovn-nbctl lsp-add sw0 sw0-p1
+
+ovn-sbctl chassis-add ch2 geneve 127.0.0.3
+ovn-sbctl chassis-add ch3 geneve 127.0.0.4
+ovn-sbctl chassis-add comp1 geneve 127.0.0.5
+ovn-sbctl chassis-add comp2 geneve 127.0.0.6
+
+ovn-nbctl lrp-add lr0 lr0-sw0 00:00:20:20:12:14 10.0.0.1/24
+ovn-nbctl lsp-add sw0 sw0-lr0
+ovn-nbctl lsp-set-type sw0-lr0 router
+ovn-nbctl lsp-set-addresses sw0-lr0 router
+ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
+
+ovn-sbctl lsp-bind sw0-p1 comp1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xup])
+
+comp1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
+comp2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp2"`
+ch2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="comp1"`
+
+echo "comp1_ch_uuid = $comp1_ch_uuid"
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp1_ch_uuid" = "$ref_ch_list"])
+
+# unbind sw0-p1
+ovn-sbctl lsp-unbind sw0-p1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p1` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "" = "$ref_ch_list"])
+
+# Bind sw0-p1 in comp2
+ovn-sbctl lsp-bind sw0-p1 comp2
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+ovn-nbctl ls-add sw1
+ovn-nbctl lsp-add sw1 sw1-p1
+ovn-nbctl lr-add lr1
+ovn-nbctl lrp-add lr1 lr1-sw1 00:00:20:20:12:15 20.0.0.1/24
+ovn-nbctl lsp-add sw1 sw1-lr1
+ovn-nbctl lsp-set-type sw1-lr1 router
+ovn-nbctl lsp-set-addresses sw1-lr1 router
+ovn-nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1
+
+# Bind sw1-p1 in comp1.
+ovn-sbctl lsp-bind sw1-p1 comp1
+# Wait until sw1-p1 is up
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xup])
+
+# sw1-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Now attach sw0 to lr1
+ovn-nbctl lrp-add lr1 lr1-sw0 00:00:20:20:12:16 10.0.0.10/24
+ovn-nbctl lsp-add sw0 sw0-lr1
+ovn-nbctl lsp-set-type sw0-lr1 router
+ovn-nbctl lsp-set-addresses sw0-lr1 router
+ovn-nbctl lsp-set-options sw0-lr1 router-port=lr1-sw0
+
+# Both comp1 and comp2 should be in 'ref_chassis' as sw1 is indirectly
+# connected to lr0
+exp_ref_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
+do
+    if test $i = $comp1_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    elif test $i = $comp2_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    fi
+done
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Unind sw1-p1. comp2 should not be in the ref_chassis.
+ovn-sbctl lsp-unbind sw1-p1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw1-p1` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Create sw2 and attach it to lr2
+ovn-nbctl ls-add sw2
+ovn-nbctl lsp-add sw2 sw2-p1
+ovn-nbctl lr-add lr2
+ovn-nbctl lrp-add lr2 lr2-sw2 00:00:20:20:12:17 30.0.0.1/24
+ovn-nbctl lsp-add sw2 sw2-lr2
+ovn-nbctl lsp-set-type sw2-lr2 router
+ovn-nbctl lsp-set-addresses sw2-lr2 router
+ovn-nbctl lsp-set-options sw2-lr2 router-port=lr2-sw2
+
+# Bind sw2-p1 to comp1
+ovn-sbctl lsp-bind sw2-p1 comp1
+# Wait until sw2-p1 is up
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw2-p1` = xup])
+
+# sw2-p1 is not connected to lr0. So comp1 should not be in 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+# Now attach sw1 to lr2. With this sw2-p1 is indirectly connected to lr0.
+ovn-nbctl lrp-add lr2 lr2-sw1 00:00:20:20:12:18 20.0.0.10/24
+ovn-nbctl lsp-add sw1 sw1-lr2
+ovn-nbctl lsp-set-type sw1-lr2 router
+ovn-nbctl lsp-set-addresses sw1-lr2 router
+ovn-nbctl lsp-set-options sw1-lr2 router-port=lr2-sw1
+
+# sw2-p1 is indirectly connected to lr0. So comp1 (and comp2) should be in
+# 'ref_chassis'
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Create sw0-p2 and bind it to comp1
+ovn-nbctl lsp-add sw0 sw0-p2
+ovn-sbctl lsp-bind sw0-p2 comp1
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xup])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# unbind sw0-p2
+ovn-sbctl lsp-unbind sw0-p2
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0-p2` = xdown])
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Delete lr1-sw0. comp1 should be deleted from ref_chassis as there is no link
+# from sw1 and sw2 to lr0.
+ovn-nbctl lrp-del lr1-sw0
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$comp2_ch_uuid" = "$ref_ch_list"])
+
+AT_CLEANUP
diff --git a/tests/ovn.at b/tests/ovn.at
index f2f2bc405..04a618801 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8398,6 +8398,16 @@  as ext1 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
 
 AT_CHECK([ovn-nbctl --timeout=3 --wait=sb sync], [0], [ignore])
 
+# hv1 should be in 'ref_chassis' of the ha_chasssi_group as logical
+# switch 'foo' can reach the router 'R1' (which has gw router port)
+# via foo1 -> foo -> R0 -> join -> R1
+hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$hv1_ch_uuid" = "$ref_ch_list"])
+
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
 sleep 2
@@ -9766,6 +9776,32 @@  echo "------ Port_Binding chassisredirect -------"
 ovn-sbctl find Port_Binding type=chassisredirect
 echo "-------------------------------------------"
 
+# There should be one ha_chassis_group with the name "outside"
+ha_chassi_grp_name=`ovn-sbctl --bare --columns name find \
+ha_chassis_group name="outside"`
+
+AT_CHECK([test $ha_chassi_grp_name = outside])
+
+# There should be 2 ha_chassis rows in SB DB.
+AT_CHECK([ovn-sbctl list ha_chassis | grep chassis | awk '{print $3}' \
+| grep '-' | wc -l ], [0], [2
+])
+
+ha_ch=`ovn-sbctl --bare --columns ha_chassis  find ha_chassis_group`
+# Trim the spaces.
+ha_ch=`echo $ha_ch | sed 's/ //g'`
+
+ha_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list ha_chassis | sort`
+do
+    ha_ch_list="$ha_ch_list $i"
+done
+
+# Trim the spaces.
+ha_ch_list=`echo $ha_ch_list | sed 's/ //g'`
+
+AT_CHECK([test "$ha_ch_list" = "$ha_ch"])
+
 for chassis in gw1 gw2 hv1 hv2; do
     as $chassis
     echo "------ $chassis dump ----------"
@@ -9810,33 +9846,56 @@  as hv2 ovs-ofctl dump-flows br-int table=32
 gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv1_gw1_ofport,$hv1_gw2_ofport \
+| wc -l], [0], [1
 ])
 
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv2_gw1_ofport,$hv2_gw2_ofport \
+| wc -l], [0], [1
 ])
 
-sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
-
 # make sure that flows for handling the outside router port reside on gw1
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # make sure ARP responder flows for outside router port reside on gw1 too
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=9 | \
+grep arp_tpa=192.168.0.101 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=9 | grep arp_tpa=192.168.0.101 | wc -l], [0], [[0
 ]])
 
-
-
 # check that the chassis redirect port has been claimed by the gw1 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
 ]])
 
+hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"`
+hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"`
+
+exp_ref_ch_list=''
+for i in `ovn-sbctl --bare --columns _uuid list chassis | sort`
+do
+    if test $i = $hv1_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    elif test $i = $hv2_ch_uuid; then
+        exp_ref_ch_list="${exp_ref_ch_list}$i"
+    fi
+done
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
 
 # at this point, we invert the priority of the gw chassis between gw1 and gw2
 
@@ -9851,15 +9910,19 @@  ovn-nbctl --id=@gc0 create Gateway_Chassis \
 ovn-nbctl --wait=hv --timeout=3 sync
 
 # we make sure that the hypervisors noticed, and inverted the slave ports
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv1 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv1_gw2_ofport,$hv1_gw1_ofport \
+| wc -l], [0], [1
 ])
 
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport | wc -l], [0], [1
+OVS_WAIT_UNTIL([as hv2 ovs-ofctl dump-flows br-int table=32 | \
+grep active_backup | grep slaves:$hv2_gw2_ofport,$hv2_gw1_ofport \
+| wc -l], [0], [1
 ])
 
 # check that the chassis redirect port has been reclaimed by the gw2 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw2_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw2_chassis | wc -l], [0],[[1
 ]])
 
 # check BFD enablement on tunnel ports from gw1 #########
@@ -9908,31 +9971,32 @@  AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
          [[
 ]])
 
-sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
-
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # disconnect GW2 from the network, GW1 should take over
 as gw2
 port=${sandbox}_br-phys
 as main ovs-vsctl del-port n1 $port
-sleep 4
 
 bfd_dump
 
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+OVS_WAIT_UNTIL([as gw1 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+OVS_WAIT_UNTIL([as gw2 ovs-ofctl dump-flows br-int table=24 | \
+grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # check that the chassis redirect port has been reclaimed by the gw1 chassis
-AT_CHECK([ovn-sbctl --columns chassis --bare find Port_Binding logical_port=cr-outside | grep $gw1_chassis | wc -l],
-         [0],[[1
+OVS_WAIT_UNTIL([ovn-sbctl --columns chassis --bare find Port_Binding \
+logical_port=cr-outside | grep $gw1_chassis | wc -l], [0],[[1
 ]])
 
 ovn-nbctl --wait=hv set NB_Global . options:"bfd-min-rx"=2000
@@ -9962,6 +10026,37 @@  for chassis in gw1 hv1 hv2; do
 ])
 done
 
+# Delete the inside1 vif. The ref_chassis in ha_chassis_group shouldn't have
+# reference to hv1.
+as hv1 ovs-vsctl del-port hv1-vif1
+
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     test "$hv2_ch_uuid" = "$ref_ch_list"])
+
+# Delete the inside2 vif.
+ovn-sbctl show
+
+echo "Deleting hv2-vif1"
+as hv2 ovs-vsctl del-port hv2-vif1
+
+# ref_chassis of ha_chassis_group should be empty
+OVS_WAIT_UNTIL(
+    [ref_ch_list=`ovn-sbctl --bare --columns ref_chassis find ha_chassis_group | sort`
+     # Trim the spaces.
+     ref_ch_list=`echo $ref_ch_list | sed 's/ //g'`
+     exp_ref_ch_list=""
+     test "$exp_ref_ch_list" = "$ref_ch_list"])
+
+# Delete the Gateway_Chassis for lrp - outside
+ovn-nbctl clear Logical_Router_Port outside gateway_chassis
+
+# There shoud be no ha_chassis_group rows in SB DB.
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis_group | wc -l`])
+OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list ha_chassis | wc -l`])
+
 OVN_CLEANUP([gw1],[gw2],[hv1],[hv2])
 
 AT_CLEANUP
@@ -10198,10 +10293,7 @@  ovn-nbctl --wait=hv --timeout=3 sync
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 ovn-sbctl destroy Chassis $gw2_chassis
 
-# Ensure ovn-controller has processed latest sbdb update
-# ovn-nbctl --wait=hv sync
-
-AT_CHECK([grep "Releasing lport" gw1/ovn-controller.log], [1], [])
+OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
 
 OVN_CLEANUP([gw1],[gw2],[hv1])