diff mbox series

[ovs-dev,v7] AB bonding: Add "primary" interface concept

Message ID 20200522211205.18898-1-jsquyres@cisco.com
State Changes Requested
Delegated to: Ilya Maximets
Headers show
Series [ovs-dev,v7] AB bonding: Add "primary" interface concept | expand

Commit Message

Jeff Squyres \(jsquyres\) May 22, 2020, 9:12 p.m. UTC
In AB bonding, if the current active slave becomes disabled, a
replacement slave is arbitrarily picked from the remaining set of
enabled slaves.  This commit adds the concept of a "primary" slave: an
interface that will always be (or become) the current active slave if
it is enabled.

The rationale for this functionality is to allow the designation of a
preferred interface for a given bond.  For example:

1. Bond is created with interfaces p1 (primary) and p2, both enabled.
2. p1 becomes the current active slave (because it was designated as
   the primary).
3. Later, p1 fails/becomes disabled.
4. p2 is chosen to become the current active slave.
5. Later, p1 becomes re-enabled.
6. p1 is chosen to become the current active slave (because it was
   designated as the primary)

Note that p1 becomes the active slave once it becomes re-enabled, even
if nothing has happened to p2.

This "primary" concept exists in Linux kernel network interface
bonding, but did not previously exist in OVS bonding.

Only one primary slave inteface is supported per bond, and is only
supported for active/backup bonding.

The primary slave interface is designated via
"other_config:bond-primary" when creating a bond.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>
---
v7:
- After rebasing patch to head of tree, adjust test output as result
  of changes from 81f71381f.

v6:
- Style: bleep bloop.  Fix use of {}.

v5:
- Handle when bond is reconfigured to remove "bond-primary" config.
- Fix potential flapping between remaining slaves.
- Add a test to re-add a removed primary interface and make sure the
  bond reflects that properly.
- Add a test to remove "bond-primary" config and make sure the bond
  reflects that properly.

v4:
- Style: bleep bloop.  Trim line length below 79 characters.

v3:
- Adjusted print logic for when the primary slave is not attached

v2:
- Added "ovs-vsctl bond/show" label when primary interface is no
  longer enslaved
- Fixed strcmp() usage nits
- Moved "other_config:bond-primary" docs to the "Bonding
  Configuration" group
- Expanded commit message
- Added more test cases (including one for when the primary interface
  is no longer enslaved)

 ofproto/bond.c        |  75 +++++++++++++++-
 ofproto/bond.h        |   1 +
 tests/lacp.at         |   1 +
 tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
 vswitchd/bridge.c     |   5 ++
 vswitchd/vswitch.xml  |   8 ++
 6 files changed, 287 insertions(+), 2 deletions(-)

Comments

Jeff Squyres \(jsquyres\) June 2, 2020, 1:47 p.m. UTC | #1
Greetings.

Is there an estimate on when this patch can be merged?

Thanks!


> On May 22, 2020, at 5:12 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
> 
> In AB bonding, if the current active slave becomes disabled, a
> replacement slave is arbitrarily picked from the remaining set of
> enabled slaves.  This commit adds the concept of a "primary" slave: an
> interface that will always be (or become) the current active slave if
> it is enabled.
> 
> The rationale for this functionality is to allow the designation of a
> preferred interface for a given bond.  For example:
> 
> 1. Bond is created with interfaces p1 (primary) and p2, both enabled.
> 2. p1 becomes the current active slave (because it was designated as
>   the primary).
> 3. Later, p1 fails/becomes disabled.
> 4. p2 is chosen to become the current active slave.
> 5. Later, p1 becomes re-enabled.
> 6. p1 is chosen to become the current active slave (because it was
>   designated as the primary)
> 
> Note that p1 becomes the active slave once it becomes re-enabled, even
> if nothing has happened to p2.
> 
> This "primary" concept exists in Linux kernel network interface
> bonding, but did not previously exist in OVS bonding.
> 
> Only one primary slave inteface is supported per bond, and is only
> supported for active/backup bonding.
> 
> The primary slave interface is designated via
> "other_config:bond-primary" when creating a bond.
> 
> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
> Reviewed-by: Aaron Conole <aconole@redhat.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> Acked-by: Greg Rose <gvrose8192@gmail.com>
> ---
> v7:
> - After rebasing patch to head of tree, adjust test output as result
>  of changes from 81f71381f.
> 
> v6:
> - Style: bleep bloop.  Fix use of {}.
> 
> v5:
> - Handle when bond is reconfigured to remove "bond-primary" config.
> - Fix potential flapping between remaining slaves.
> - Add a test to re-add a removed primary interface and make sure the
>  bond reflects that properly.
> - Add a test to remove "bond-primary" config and make sure the bond
>  reflects that properly.
> 
> v4:
> - Style: bleep bloop.  Trim line length below 79 characters.
> 
> v3:
> - Adjusted print logic for when the primary slave is not attached
> 
> v2:
> - Added "ovs-vsctl bond/show" label when primary interface is no
>  longer enslaved
> - Fixed strcmp() usage nits
> - Moved "other_config:bond-primary" docs to the "Bonding
>  Configuration" group
> - Expanded commit message
> - Added more test cases (including one for when the primary interface
>  is no longer enslaved)
> 
> ofproto/bond.c        |  75 +++++++++++++++-
> ofproto/bond.h        |   1 +
> tests/lacp.at         |   1 +
> tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
> vswitchd/bridge.c     |   5 ++
> vswitchd/vswitch.xml  |   8 ++
> 6 files changed, 287 insertions(+), 2 deletions(-)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index 405202fb6..4ad1a20ae 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -93,6 +93,7 @@ struct bond_slave {
>     /* Link status. */
>     bool enabled;               /* May be chosen for flows? */
>     bool may_enable;            /* Client considers this slave bondable. */
> +    bool is_primary;            /* This slave is preferred over others. */
>     long long delay_expires;    /* Time after which 'enabled' may change. */
> 
>     /* Rebalancing info.  Used only by bond_rebalance(). */
> @@ -126,6 +127,7 @@ struct bond {
>     enum lacp_status lacp_status; /* Status of LACP negotiations. */
>     bool bond_revalidate;       /* True if flows need revalidation. */
>     uint32_t basis;             /* Basis for flow hash function. */
> +    char *primary;              /* Name of the primary slave interface. */
> 
>     /* SLB specific bonding info. */
>     struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
> @@ -241,6 +243,7 @@ bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
> 
>     bond->active_slave_mac = eth_addr_zero;
>     bond->active_slave_changed = false;
> +    bond->primary = NULL;
> 
>     bond_reconfigure(bond, s);
>     return bond;
> @@ -290,6 +293,7 @@ bond_unref(struct bond *bond)
>     update_recirc_rules__(bond);
> 
>     hmap_destroy(&bond->pr_rule_ops);
> +    free(bond->primary);
>     free(bond->name);
>     free(bond);
> }
> @@ -459,6 +463,39 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s)
>         bond->bond_revalidate = false;
>     }
> 
> +    /*
> +     * If a primary interface is set on the new settings:
> +     * 1. If the bond has no primary previously set, save it and
> +     * revalidate.
> +     * 2. If the bond has a different primary previously set, save the
> +     * new one and revalidate.
> +     * 3. If the bond has the same primary previously set, do nothing.
> +     */
> +    if (s->primary) {
> +        bool changed = false;
> +        if (bond->primary) {
> +            if (strcmp(bond->primary, s->primary)) {
> +                free(bond->primary);
> +                changed = true;
> +            }
> +        } else {
> +            changed = true;
> +        }
> +
> +        if (changed) {
> +            bond->primary = xstrdup(s->primary);
> +            revalidate = true;
> +        }
> +    } else if (bond->primary) {
> +        /*
> +         * If the new settings have no primary interface, but the
> +         * bond already has a primary, remove the bond's primary.
> +         */
> +        free(bond->primary);
> +        bond->primary = NULL;
> +        revalidate = true;
> +    }
> +
>     if (bond->balance != BM_AB) {
>         if (!bond->recirc_id) {
>             bond->recirc_id = recirc_alloc_id(bond->ofproto);
> @@ -549,6 +586,12 @@ bond_slave_register(struct bond *bond, void *slave_,
>         slave->name = xstrdup(netdev_get_name(netdev));
>         bond->bond_revalidate = true;
> 
> +        if (bond->primary && !strcmp(bond->primary, slave->name)) {
> +            slave->is_primary = true;
> +        } else {
> +            slave->is_primary = false;
> +        }
> +
>         slave->enabled = false;
>         bond_enable_slave(slave, netdev_get_carrier(netdev));
>     }
> @@ -644,6 +687,7 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
> {
>     struct bond_slave *slave;
>     bool revalidate;
> +    struct bond_slave *primary;
> 
>     ovs_rwlock_wrlock(&rwlock);
>     if (bond->lacp_status != lacp_status) {
> @@ -659,11 +703,19 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>     }
> 
>     /* Enable slaves based on link status and LACP feedback. */
> +    primary = NULL;
>     HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>         bond_link_status_update(slave);
>         slave->change_seq = seq_read(connectivity_seq_get());
> +
> +        /* Discover if there is an active slave marked "primary". */
> +        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
> +            primary = slave;
> +        }
>     }
> -    if (!bond->active_slave || !bond->active_slave->enabled) {
> +
> +    if (!bond->active_slave || !bond->active_slave->enabled ||
> +        (primary && primary->enabled && bond->active_slave)) {
>         bond_choose_active_slave(bond);
>     }
> 
> @@ -1393,6 +1445,20 @@ bond_print_details(struct ds *ds, const struct bond *bond)
>     ds_put_format(ds, "lacp_fallback_ab: %s\n",
>                   bond->lacp_fallback_ab ? "true" : "false");
> 
> +    bool found_primary = false;
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary) {
> +            found_primary = true;
> +        }
> +    }
> +
> +    if (bond->balance == BM_AB) {
> +        ds_put_format(ds, "primary: %s%s\n",
> +                      bond->primary ? bond->primary : "<none>",
> +                      (!found_primary && bond->primary) ?
> +                      " (no such slave)" : "");
> +    }
> +
>     ds_put_cstr(ds, "active slave mac: ");
>     ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
> @@ -1862,6 +1928,13 @@ bond_choose_slave(const struct bond *bond)
> {
>     struct bond_slave *slave, *best;
> 
> +    /* If there's a primary and it's active, return that. */
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary && slave->enabled) {
> +            return slave;
> +        }
> +    }
> +
>     /* Find the last active slave. */
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>     if (slave && slave->enabled) {
> diff --git a/ofproto/bond.h b/ofproto/bond.h
> index e7c3d9bc3..3a923dcfa 100644
> --- a/ofproto/bond.h
> +++ b/ofproto/bond.h
> @@ -45,6 +45,7 @@ struct bond_settings {
> 
>     /* Balancing configuration. */
>     enum bond_mode balance;
> +    const char *primary;        /* For AB balance, primary interface name. */
>     int rebalance_interval;     /* Milliseconds between rebalances.
>                                    Zero to disable rebalancing. */
> 
> diff --git a/tests/lacp.at b/tests/lacp.at
> index 7b460d7be..696ffc6d4 100644
> --- a/tests/lacp.at
> +++ b/tests/lacp.at
> @@ -125,6 +125,7 @@ updelay: 0 ms
> downdelay: 0 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> +primary: <none>
> active slave mac: 00:00:00:00:00:00(none)
> 
> slave p1: disabled
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 41164d735..b2d8c089f 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -29,7 +29,203 @@ AT_CHECK([ovs-appctl revalidator/wait])
> OVS_VSWITCHD_STOP
> AT_CLEANUP
> 
> -AT_SETUP([ofproto-dpif - active-backup bonding])
> +AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
> +
> +# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
> +#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
> +# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> +# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
> +# it should become active.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> +   add-br br1 -- \
> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> +                  fail-mode=secure -- \
> +   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
> +   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> +WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl add-flow br1 action=normal])
> +ovs-appctl netdev-dummy/set-admin-state up
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 down
> +ovs-appctl time/stop
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 up
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 200 100
> +sleep 1
> +AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +])
> +
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +])
> +
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
> +# Make a switch with 3 ports in a bond, so that when we delete one of
> +# the ports from the bond, there are still 2 ports left and the bond
> +# remains functional.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
> +
> +dnl Make sure the initial primary interface is set
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +dnl Down the primary interface and verify that we switched.  Then
> +dnl bring the primary back and verify that we switched back to the
> +dnl primary.
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Now delete the primary and verify that the output shows that the
> +dnl primary is no longer enslaved
> +ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])
> +
> +dnl Now re-add the primary and verify that the output shows that the
> +dnl primary is available again.
> +dnl
> +dnl First, get the UUIDs of the interfaces that exist on bond0.
> +dnl Strip the trailing ] so that we can add a new UUID to the end.
> +uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
> +dnl Create a new port "p1" and add its UUID to the set of interfaces
> +dnl on bond0.
> +ovs-vsctl \
> +   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set Port bond0 interfaces="$uuids, @p1]"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Remove the "bond-primary" config directive from the bond.
> +dnl First, find the other_config values on bond0.
> +other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
> +dnl Now re-set the other_config on bond0.
> +ovs-vsctl set Port bond0 other_config="$other_config"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: <none>
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
> # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
> #    and br1 with interfaces p3, p4 and p8.
> # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> @@ -46,6 +242,7 @@ OVS_VSWITCHD_START(
>    add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>    add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])
> AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> 
> AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index fe73c38d4..5f30b7737 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -4563,6 +4563,11 @@ port_configure_bond(struct port *port, struct bond_settings *s)
>                   port->name);
>     }
> 
> +    s->primary = NULL;
> +    if (s->balance == BM_AB) {
> +        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
> +    }
> +
>     miimon_interval = smap_get_int(&port->cfg->other_config,
>                                    "bond-miimon-interval", 0);
>     if (miimon_interval <= 0) {
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 6d334370d..a2a26e849 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1994,6 +1994,14 @@
>         <code>active-backup</code>.
>       </column>
> 
> +      <column name="other_config" key="bond-primary"
> +              type='{"type": "string"}'>
> +        If a slave interface with this name exists in the bond and
> +        is up, it will be made active.  Relevant only when <ref
> +        column="other_config" key="bond-mode"/> is
> +        <code>active-backup</code>.
> +      </column>
> +
>       <group title="Link Failure Detection">
>         <p>
>           An important part of link bonding is detecting that links are down so
> -- 
> 2.22.0
>
Jeff Squyres \(jsquyres\) June 5, 2020, 1:02 p.m. UTC | #2
Is there anything else that is needed before this patch can be merged?

Thanks!


> On May 22, 2020, at 5:12 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
> 
> In AB bonding, if the current active slave becomes disabled, a
> replacement slave is arbitrarily picked from the remaining set of
> enabled slaves.  This commit adds the concept of a "primary" slave: an
> interface that will always be (or become) the current active slave if
> it is enabled.
> 
> The rationale for this functionality is to allow the designation of a
> preferred interface for a given bond.  For example:
> 
> 1. Bond is created with interfaces p1 (primary) and p2, both enabled.
> 2. p1 becomes the current active slave (because it was designated as
>   the primary).
> 3. Later, p1 fails/becomes disabled.
> 4. p2 is chosen to become the current active slave.
> 5. Later, p1 becomes re-enabled.
> 6. p1 is chosen to become the current active slave (because it was
>   designated as the primary)
> 
> Note that p1 becomes the active slave once it becomes re-enabled, even
> if nothing has happened to p2.
> 
> This "primary" concept exists in Linux kernel network interface
> bonding, but did not previously exist in OVS bonding.
> 
> Only one primary slave inteface is supported per bond, and is only
> supported for active/backup bonding.
> 
> The primary slave interface is designated via
> "other_config:bond-primary" when creating a bond.
> 
> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
> Reviewed-by: Aaron Conole <aconole@redhat.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> Acked-by: Greg Rose <gvrose8192@gmail.com>
> ---
> v7:
> - After rebasing patch to head of tree, adjust test output as result
>  of changes from 81f71381f.
> 
> v6:
> - Style: bleep bloop.  Fix use of {}.
> 
> v5:
> - Handle when bond is reconfigured to remove "bond-primary" config.
> - Fix potential flapping between remaining slaves.
> - Add a test to re-add a removed primary interface and make sure the
>  bond reflects that properly.
> - Add a test to remove "bond-primary" config and make sure the bond
>  reflects that properly.
> 
> v4:
> - Style: bleep bloop.  Trim line length below 79 characters.
> 
> v3:
> - Adjusted print logic for when the primary slave is not attached
> 
> v2:
> - Added "ovs-vsctl bond/show" label when primary interface is no
>  longer enslaved
> - Fixed strcmp() usage nits
> - Moved "other_config:bond-primary" docs to the "Bonding
>  Configuration" group
> - Expanded commit message
> - Added more test cases (including one for when the primary interface
>  is no longer enslaved)
> 
> ofproto/bond.c        |  75 +++++++++++++++-
> ofproto/bond.h        |   1 +
> tests/lacp.at         |   1 +
> tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
> vswitchd/bridge.c     |   5 ++
> vswitchd/vswitch.xml  |   8 ++
> 6 files changed, 287 insertions(+), 2 deletions(-)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index 405202fb6..4ad1a20ae 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -93,6 +93,7 @@ struct bond_slave {
>     /* Link status. */
>     bool enabled;               /* May be chosen for flows? */
>     bool may_enable;            /* Client considers this slave bondable. */
> +    bool is_primary;            /* This slave is preferred over others. */
>     long long delay_expires;    /* Time after which 'enabled' may change. */
> 
>     /* Rebalancing info.  Used only by bond_rebalance(). */
> @@ -126,6 +127,7 @@ struct bond {
>     enum lacp_status lacp_status; /* Status of LACP negotiations. */
>     bool bond_revalidate;       /* True if flows need revalidation. */
>     uint32_t basis;             /* Basis for flow hash function. */
> +    char *primary;              /* Name of the primary slave interface. */
> 
>     /* SLB specific bonding info. */
>     struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
> @@ -241,6 +243,7 @@ bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
> 
>     bond->active_slave_mac = eth_addr_zero;
>     bond->active_slave_changed = false;
> +    bond->primary = NULL;
> 
>     bond_reconfigure(bond, s);
>     return bond;
> @@ -290,6 +293,7 @@ bond_unref(struct bond *bond)
>     update_recirc_rules__(bond);
> 
>     hmap_destroy(&bond->pr_rule_ops);
> +    free(bond->primary);
>     free(bond->name);
>     free(bond);
> }
> @@ -459,6 +463,39 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s)
>         bond->bond_revalidate = false;
>     }
> 
> +    /*
> +     * If a primary interface is set on the new settings:
> +     * 1. If the bond has no primary previously set, save it and
> +     * revalidate.
> +     * 2. If the bond has a different primary previously set, save the
> +     * new one and revalidate.
> +     * 3. If the bond has the same primary previously set, do nothing.
> +     */
> +    if (s->primary) {
> +        bool changed = false;
> +        if (bond->primary) {
> +            if (strcmp(bond->primary, s->primary)) {
> +                free(bond->primary);
> +                changed = true;
> +            }
> +        } else {
> +            changed = true;
> +        }
> +
> +        if (changed) {
> +            bond->primary = xstrdup(s->primary);
> +            revalidate = true;
> +        }
> +    } else if (bond->primary) {
> +        /*
> +         * If the new settings have no primary interface, but the
> +         * bond already has a primary, remove the bond's primary.
> +         */
> +        free(bond->primary);
> +        bond->primary = NULL;
> +        revalidate = true;
> +    }
> +
>     if (bond->balance != BM_AB) {
>         if (!bond->recirc_id) {
>             bond->recirc_id = recirc_alloc_id(bond->ofproto);
> @@ -549,6 +586,12 @@ bond_slave_register(struct bond *bond, void *slave_,
>         slave->name = xstrdup(netdev_get_name(netdev));
>         bond->bond_revalidate = true;
> 
> +        if (bond->primary && !strcmp(bond->primary, slave->name)) {
> +            slave->is_primary = true;
> +        } else {
> +            slave->is_primary = false;
> +        }
> +
>         slave->enabled = false;
>         bond_enable_slave(slave, netdev_get_carrier(netdev));
>     }
> @@ -644,6 +687,7 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
> {
>     struct bond_slave *slave;
>     bool revalidate;
> +    struct bond_slave *primary;
> 
>     ovs_rwlock_wrlock(&rwlock);
>     if (bond->lacp_status != lacp_status) {
> @@ -659,11 +703,19 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>     }
> 
>     /* Enable slaves based on link status and LACP feedback. */
> +    primary = NULL;
>     HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>         bond_link_status_update(slave);
>         slave->change_seq = seq_read(connectivity_seq_get());
> +
> +        /* Discover if there is an active slave marked "primary". */
> +        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
> +            primary = slave;
> +        }
>     }
> -    if (!bond->active_slave || !bond->active_slave->enabled) {
> +
> +    if (!bond->active_slave || !bond->active_slave->enabled ||
> +        (primary && primary->enabled && bond->active_slave)) {
>         bond_choose_active_slave(bond);
>     }
> 
> @@ -1393,6 +1445,20 @@ bond_print_details(struct ds *ds, const struct bond *bond)
>     ds_put_format(ds, "lacp_fallback_ab: %s\n",
>                   bond->lacp_fallback_ab ? "true" : "false");
> 
> +    bool found_primary = false;
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary) {
> +            found_primary = true;
> +        }
> +    }
> +
> +    if (bond->balance == BM_AB) {
> +        ds_put_format(ds, "primary: %s%s\n",
> +                      bond->primary ? bond->primary : "<none>",
> +                      (!found_primary && bond->primary) ?
> +                      " (no such slave)" : "");
> +    }
> +
>     ds_put_cstr(ds, "active slave mac: ");
>     ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
> @@ -1862,6 +1928,13 @@ bond_choose_slave(const struct bond *bond)
> {
>     struct bond_slave *slave, *best;
> 
> +    /* If there's a primary and it's active, return that. */
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary && slave->enabled) {
> +            return slave;
> +        }
> +    }
> +
>     /* Find the last active slave. */
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>     if (slave && slave->enabled) {
> diff --git a/ofproto/bond.h b/ofproto/bond.h
> index e7c3d9bc3..3a923dcfa 100644
> --- a/ofproto/bond.h
> +++ b/ofproto/bond.h
> @@ -45,6 +45,7 @@ struct bond_settings {
> 
>     /* Balancing configuration. */
>     enum bond_mode balance;
> +    const char *primary;        /* For AB balance, primary interface name. */
>     int rebalance_interval;     /* Milliseconds between rebalances.
>                                    Zero to disable rebalancing. */
> 
> diff --git a/tests/lacp.at b/tests/lacp.at
> index 7b460d7be..696ffc6d4 100644
> --- a/tests/lacp.at
> +++ b/tests/lacp.at
> @@ -125,6 +125,7 @@ updelay: 0 ms
> downdelay: 0 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> +primary: <none>
> active slave mac: 00:00:00:00:00:00(none)
> 
> slave p1: disabled
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 41164d735..b2d8c089f 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -29,7 +29,203 @@ AT_CHECK([ovs-appctl revalidator/wait])
> OVS_VSWITCHD_STOP
> AT_CLEANUP
> 
> -AT_SETUP([ofproto-dpif - active-backup bonding])
> +AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
> +
> +# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
> +#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
> +# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> +# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
> +# it should become active.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> +   add-br br1 -- \
> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> +                  fail-mode=secure -- \
> +   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
> +   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> +WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl add-flow br1 action=normal])
> +ovs-appctl netdev-dummy/set-admin-state up
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 down
> +ovs-appctl time/stop
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 up
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 200 100
> +sleep 1
> +AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +])
> +
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +])
> +
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
> +# Make a switch with 3 ports in a bond, so that when we delete one of
> +# the ports from the bond, there are still 2 ports left and the bond
> +# remains functional.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
> +
> +dnl Make sure the initial primary interface is set
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +dnl Down the primary interface and verify that we switched.  Then
> +dnl bring the primary back and verify that we switched back to the
> +dnl primary.
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Now delete the primary and verify that the output shows that the
> +dnl primary is no longer enslaved
> +ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])
> +
> +dnl Now re-add the primary and verify that the output shows that the
> +dnl primary is available again.
> +dnl
> +dnl First, get the UUIDs of the interfaces that exist on bond0.
> +dnl Strip the trailing ] so that we can add a new UUID to the end.
> +uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
> +dnl Create a new port "p1" and add its UUID to the set of interfaces
> +dnl on bond0.
> +ovs-vsctl \
> +   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set Port bond0 interfaces="$uuids, @p1]"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Remove the "bond-primary" config directive from the bond.
> +dnl First, find the other_config values on bond0.
> +other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
> +dnl Now re-set the other_config on bond0.
> +ovs-vsctl set Port bond0 other_config="$other_config"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: <none>
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
> # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
> #    and br1 with interfaces p3, p4 and p8.
> # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> @@ -46,6 +242,7 @@ OVS_VSWITCHD_START(
>    add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>    add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])
> AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> 
> AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index fe73c38d4..5f30b7737 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -4563,6 +4563,11 @@ port_configure_bond(struct port *port, struct bond_settings *s)
>                   port->name);
>     }
> 
> +    s->primary = NULL;
> +    if (s->balance == BM_AB) {
> +        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
> +    }
> +
>     miimon_interval = smap_get_int(&port->cfg->other_config,
>                                    "bond-miimon-interval", 0);
>     if (miimon_interval <= 0) {
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 6d334370d..a2a26e849 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1994,6 +1994,14 @@
>         <code>active-backup</code>.
>       </column>
> 
> +      <column name="other_config" key="bond-primary"
> +              type='{"type": "string"}'>
> +        If a slave interface with this name exists in the bond and
> +        is up, it will be made active.  Relevant only when <ref
> +        column="other_config" key="bond-mode"/> is
> +        <code>active-backup</code>.
> +      </column>
> +
>       <group title="Link Failure Detection">
>         <p>
>           An important part of link bonding is detecting that links are down so
> -- 
> 2.22.0
>
Jeff Squyres \(jsquyres\) June 11, 2020, 10:44 p.m. UTC | #3
Sorry to be a pest, but this patch has been sitting since May 22nd.

Are there any objections, or anything else I should address?  It would be great to get this functionality merged upstream.  :-)


> On May 22, 2020, at 5:12 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
> 
> In AB bonding, if the current active slave becomes disabled, a
> replacement slave is arbitrarily picked from the remaining set of
> enabled slaves.  This commit adds the concept of a "primary" slave: an
> interface that will always be (or become) the current active slave if
> it is enabled.
> 
> The rationale for this functionality is to allow the designation of a
> preferred interface for a given bond.  For example:
> 
> 1. Bond is created with interfaces p1 (primary) and p2, both enabled.
> 2. p1 becomes the current active slave (because it was designated as
>   the primary).
> 3. Later, p1 fails/becomes disabled.
> 4. p2 is chosen to become the current active slave.
> 5. Later, p1 becomes re-enabled.
> 6. p1 is chosen to become the current active slave (because it was
>   designated as the primary)
> 
> Note that p1 becomes the active slave once it becomes re-enabled, even
> if nothing has happened to p2.
> 
> This "primary" concept exists in Linux kernel network interface
> bonding, but did not previously exist in OVS bonding.
> 
> Only one primary slave inteface is supported per bond, and is only
> supported for active/backup bonding.
> 
> The primary slave interface is designated via
> "other_config:bond-primary" when creating a bond.
> 
> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
> Reviewed-by: Aaron Conole <aconole@redhat.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> Acked-by: Greg Rose <gvrose8192@gmail.com>
> ---
> v7:
> - After rebasing patch to head of tree, adjust test output as result
>  of changes from 81f71381f.
> 
> v6:
> - Style: bleep bloop.  Fix use of {}.
> 
> v5:
> - Handle when bond is reconfigured to remove "bond-primary" config.
> - Fix potential flapping between remaining slaves.
> - Add a test to re-add a removed primary interface and make sure the
>  bond reflects that properly.
> - Add a test to remove "bond-primary" config and make sure the bond
>  reflects that properly.
> 
> v4:
> - Style: bleep bloop.  Trim line length below 79 characters.
> 
> v3:
> - Adjusted print logic for when the primary slave is not attached
> 
> v2:
> - Added "ovs-vsctl bond/show" label when primary interface is no
>  longer enslaved
> - Fixed strcmp() usage nits
> - Moved "other_config:bond-primary" docs to the "Bonding
>  Configuration" group
> - Expanded commit message
> - Added more test cases (including one for when the primary interface
>  is no longer enslaved)
> 
> ofproto/bond.c        |  75 +++++++++++++++-
> ofproto/bond.h        |   1 +
> tests/lacp.at         |   1 +
> tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
> vswitchd/bridge.c     |   5 ++
> vswitchd/vswitch.xml  |   8 ++
> 6 files changed, 287 insertions(+), 2 deletions(-)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index 405202fb6..4ad1a20ae 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -93,6 +93,7 @@ struct bond_slave {
>     /* Link status. */
>     bool enabled;               /* May be chosen for flows? */
>     bool may_enable;            /* Client considers this slave bondable. */
> +    bool is_primary;            /* This slave is preferred over others. */
>     long long delay_expires;    /* Time after which 'enabled' may change. */
> 
>     /* Rebalancing info.  Used only by bond_rebalance(). */
> @@ -126,6 +127,7 @@ struct bond {
>     enum lacp_status lacp_status; /* Status of LACP negotiations. */
>     bool bond_revalidate;       /* True if flows need revalidation. */
>     uint32_t basis;             /* Basis for flow hash function. */
> +    char *primary;              /* Name of the primary slave interface. */
> 
>     /* SLB specific bonding info. */
>     struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
> @@ -241,6 +243,7 @@ bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
> 
>     bond->active_slave_mac = eth_addr_zero;
>     bond->active_slave_changed = false;
> +    bond->primary = NULL;
> 
>     bond_reconfigure(bond, s);
>     return bond;
> @@ -290,6 +293,7 @@ bond_unref(struct bond *bond)
>     update_recirc_rules__(bond);
> 
>     hmap_destroy(&bond->pr_rule_ops);
> +    free(bond->primary);
>     free(bond->name);
>     free(bond);
> }
> @@ -459,6 +463,39 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s)
>         bond->bond_revalidate = false;
>     }
> 
> +    /*
> +     * If a primary interface is set on the new settings:
> +     * 1. If the bond has no primary previously set, save it and
> +     * revalidate.
> +     * 2. If the bond has a different primary previously set, save the
> +     * new one and revalidate.
> +     * 3. If the bond has the same primary previously set, do nothing.
> +     */
> +    if (s->primary) {
> +        bool changed = false;
> +        if (bond->primary) {
> +            if (strcmp(bond->primary, s->primary)) {
> +                free(bond->primary);
> +                changed = true;
> +            }
> +        } else {
> +            changed = true;
> +        }
> +
> +        if (changed) {
> +            bond->primary = xstrdup(s->primary);
> +            revalidate = true;
> +        }
> +    } else if (bond->primary) {
> +        /*
> +         * If the new settings have no primary interface, but the
> +         * bond already has a primary, remove the bond's primary.
> +         */
> +        free(bond->primary);
> +        bond->primary = NULL;
> +        revalidate = true;
> +    }
> +
>     if (bond->balance != BM_AB) {
>         if (!bond->recirc_id) {
>             bond->recirc_id = recirc_alloc_id(bond->ofproto);
> @@ -549,6 +586,12 @@ bond_slave_register(struct bond *bond, void *slave_,
>         slave->name = xstrdup(netdev_get_name(netdev));
>         bond->bond_revalidate = true;
> 
> +        if (bond->primary && !strcmp(bond->primary, slave->name)) {
> +            slave->is_primary = true;
> +        } else {
> +            slave->is_primary = false;
> +        }
> +
>         slave->enabled = false;
>         bond_enable_slave(slave, netdev_get_carrier(netdev));
>     }
> @@ -644,6 +687,7 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
> {
>     struct bond_slave *slave;
>     bool revalidate;
> +    struct bond_slave *primary;
> 
>     ovs_rwlock_wrlock(&rwlock);
>     if (bond->lacp_status != lacp_status) {
> @@ -659,11 +703,19 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>     }
> 
>     /* Enable slaves based on link status and LACP feedback. */
> +    primary = NULL;
>     HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>         bond_link_status_update(slave);
>         slave->change_seq = seq_read(connectivity_seq_get());
> +
> +        /* Discover if there is an active slave marked "primary". */
> +        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
> +            primary = slave;
> +        }
>     }
> -    if (!bond->active_slave || !bond->active_slave->enabled) {
> +
> +    if (!bond->active_slave || !bond->active_slave->enabled ||
> +        (primary && primary->enabled && bond->active_slave)) {
>         bond_choose_active_slave(bond);
>     }
> 
> @@ -1393,6 +1445,20 @@ bond_print_details(struct ds *ds, const struct bond *bond)
>     ds_put_format(ds, "lacp_fallback_ab: %s\n",
>                   bond->lacp_fallback_ab ? "true" : "false");
> 
> +    bool found_primary = false;
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary) {
> +            found_primary = true;
> +        }
> +    }
> +
> +    if (bond->balance == BM_AB) {
> +        ds_put_format(ds, "primary: %s%s\n",
> +                      bond->primary ? bond->primary : "<none>",
> +                      (!found_primary && bond->primary) ?
> +                      " (no such slave)" : "");
> +    }
> +
>     ds_put_cstr(ds, "active slave mac: ");
>     ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
> @@ -1862,6 +1928,13 @@ bond_choose_slave(const struct bond *bond)
> {
>     struct bond_slave *slave, *best;
> 
> +    /* If there's a primary and it's active, return that. */
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary && slave->enabled) {
> +            return slave;
> +        }
> +    }
> +
>     /* Find the last active slave. */
>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>     if (slave && slave->enabled) {
> diff --git a/ofproto/bond.h b/ofproto/bond.h
> index e7c3d9bc3..3a923dcfa 100644
> --- a/ofproto/bond.h
> +++ b/ofproto/bond.h
> @@ -45,6 +45,7 @@ struct bond_settings {
> 
>     /* Balancing configuration. */
>     enum bond_mode balance;
> +    const char *primary;        /* For AB balance, primary interface name. */
>     int rebalance_interval;     /* Milliseconds between rebalances.
>                                    Zero to disable rebalancing. */
> 
> diff --git a/tests/lacp.at b/tests/lacp.at
> index 7b460d7be..696ffc6d4 100644
> --- a/tests/lacp.at
> +++ b/tests/lacp.at
> @@ -125,6 +125,7 @@ updelay: 0 ms
> downdelay: 0 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> +primary: <none>
> active slave mac: 00:00:00:00:00:00(none)
> 
> slave p1: disabled
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 41164d735..b2d8c089f 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -29,7 +29,203 @@ AT_CHECK([ovs-appctl revalidator/wait])
> OVS_VSWITCHD_STOP
> AT_CLEANUP
> 
> -AT_SETUP([ofproto-dpif - active-backup bonding])
> +AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
> +
> +# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
> +#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
> +# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> +# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
> +# it should become active.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> +   add-br br1 -- \
> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> +                  fail-mode=secure -- \
> +   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
> +   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> +WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl add-flow br1 action=normal])
> +ovs-appctl netdev-dummy/set-admin-state up
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 down
> +ovs-appctl time/stop
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 up
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 200 100
> +sleep 1
> +AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +])
> +
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +])
> +
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
> +# Make a switch with 3 ports in a bond, so that when we delete one of
> +# the ports from the bond, there are still 2 ports left and the bond
> +# remains functional.
> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
> +
> +dnl Make sure the initial primary interface is set
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> +
> +dnl Down the primary interface and verify that we switched.  Then
> +dnl bring the primary back and verify that we switched back to the
> +dnl primary.
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Now delete the primary and verify that the output shows that the
> +dnl primary is no longer enslaved
> +ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])
> +
> +dnl Now re-add the primary and verify that the output shows that the
> +dnl primary is available again.
> +dnl
> +dnl First, get the UUIDs of the interfaces that exist on bond0.
> +dnl Strip the trailing ] so that we can add a new UUID to the end.
> +uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
> +dnl Create a new port "p1" and add its UUID to the set of interfaces
> +dnl on bond0.
> +ovs-vsctl \
> +   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set Port bond0 interfaces="$uuids, @p1]"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Remove the "bond-primary" config directive from the bond.
> +dnl First, find the other_config values on bond0.
> +other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
> +dnl Now re-set the other_config on bond0.
> +ovs-vsctl set Port bond0 other_config="$other_config"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: <none>
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
> # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
> #    and br1 with interfaces p3, p4 and p8.
> # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> @@ -46,6 +242,7 @@ OVS_VSWITCHD_START(
>    add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>    add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])
> AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> 
> AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index fe73c38d4..5f30b7737 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -4563,6 +4563,11 @@ port_configure_bond(struct port *port, struct bond_settings *s)
>                   port->name);
>     }
> 
> +    s->primary = NULL;
> +    if (s->balance == BM_AB) {
> +        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
> +    }
> +
>     miimon_interval = smap_get_int(&port->cfg->other_config,
>                                    "bond-miimon-interval", 0);
>     if (miimon_interval <= 0) {
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 6d334370d..a2a26e849 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1994,6 +1994,14 @@
>         <code>active-backup</code>.
>       </column>
> 
> +      <column name="other_config" key="bond-primary"
> +              type='{"type": "string"}'>
> +        If a slave interface with this name exists in the bond and
> +        is up, it will be made active.  Relevant only when <ref
> +        column="other_config" key="bond-mode"/> is
> +        <code>active-backup</code>.
> +      </column>
> +
>       <group title="Link Failure Detection">
>         <p>
>           An important part of link bonding is detecting that links are down so
> -- 
> 2.22.0
>
Ilya Maximets June 18, 2020, 1:44 p.m. UTC | #4
On 5/22/20 11:12 PM, Jeff Squyres via dev wrote:
> In AB bonding, if the current active slave becomes disabled, a
> replacement slave is arbitrarily picked from the remaining set of
> enabled slaves.  This commit adds the concept of a "primary" slave: an
> interface that will always be (or become) the current active slave if
> it is enabled.
> 
> The rationale for this functionality is to allow the designation of a
> preferred interface for a given bond.  For example:
> 
> 1. Bond is created with interfaces p1 (primary) and p2, both enabled.
> 2. p1 becomes the current active slave (because it was designated as
>    the primary).
> 3. Later, p1 fails/becomes disabled.
> 4. p2 is chosen to become the current active slave.
> 5. Later, p1 becomes re-enabled.
> 6. p1 is chosen to become the current active slave (because it was
>    designated as the primary)
> 
> Note that p1 becomes the active slave once it becomes re-enabled, even
> if nothing has happened to p2.
> 
> This "primary" concept exists in Linux kernel network interface
> bonding, but did not previously exist in OVS bonding.
> 
> Only one primary slave inteface is supported per bond, and is only
> supported for active/backup bonding.

Does it work for the case where balanced bonding falls back to active-backup?
It seems like it should, but I see that code doesn't handle this in a few
places.  IMHO, it might make sence to add this support.

Comments inline.

Best regards, Ilya Maximets.

> 
> The primary slave interface is designated via
> "other_config:bond-primary" when creating a bond.
> 
> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
> Reviewed-by: Aaron Conole <aconole@redhat.com>
> Tested-by: Greg Rose <gvrose8192@gmail.com>
> Acked-by: Greg Rose <gvrose8192@gmail.com>
> ---
> v7:
> - After rebasing patch to head of tree, adjust test output as result
>   of changes from 81f71381f.
> 
> v6:
> - Style: bleep bloop.  Fix use of {}.
> 
> v5:
> - Handle when bond is reconfigured to remove "bond-primary" config.
> - Fix potential flapping between remaining slaves.
> - Add a test to re-add a removed primary interface and make sure the
>   bond reflects that properly.
> - Add a test to remove "bond-primary" config and make sure the bond
>   reflects that properly.
> 
> v4:
> - Style: bleep bloop.  Trim line length below 79 characters.
> 
> v3:
> - Adjusted print logic for when the primary slave is not attached
> 
> v2:
> - Added "ovs-vsctl bond/show" label when primary interface is no
>   longer enslaved
> - Fixed strcmp() usage nits
> - Moved "other_config:bond-primary" docs to the "Bonding
>   Configuration" group
> - Expanded commit message
> - Added more test cases (including one for when the primary interface
>   is no longer enslaved)
> 
>  ofproto/bond.c        |  75 +++++++++++++++-
>  ofproto/bond.h        |   1 +
>  tests/lacp.at         |   1 +
>  tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
>  vswitchd/bridge.c     |   5 ++
>  vswitchd/vswitch.xml  |   8 ++
>  6 files changed, 287 insertions(+), 2 deletions(-)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index 405202fb6..4ad1a20ae 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -93,6 +93,7 @@ struct bond_slave {
>      /* Link status. */
>      bool enabled;               /* May be chosen for flows? */
>      bool may_enable;            /* Client considers this slave bondable. */
> +    bool is_primary;            /* This slave is preferred over others. */
>      long long delay_expires;    /* Time after which 'enabled' may change. */
>  
>      /* Rebalancing info.  Used only by bond_rebalance(). */
> @@ -126,6 +127,7 @@ struct bond {
>      enum lacp_status lacp_status; /* Status of LACP negotiations. */
>      bool bond_revalidate;       /* True if flows need revalidation. */
>      uint32_t basis;             /* Basis for flow hash function. */
> +    char *primary;              /* Name of the primary slave interface. */
>  
>      /* SLB specific bonding info. */
>      struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
> @@ -241,6 +243,7 @@ bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
>  
>      bond->active_slave_mac = eth_addr_zero;
>      bond->active_slave_changed = false;
> +    bond->primary = NULL;
>  
>      bond_reconfigure(bond, s);
>      return bond;
> @@ -290,6 +293,7 @@ bond_unref(struct bond *bond)
>      update_recirc_rules__(bond);
>  
>      hmap_destroy(&bond->pr_rule_ops);
> +    free(bond->primary);
>      free(bond->name);
>      free(bond);
>  }
> @@ -459,6 +463,39 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s)
>          bond->bond_revalidate = false;
>      }
>  
> +    /*
> +     * If a primary interface is set on the new settings:
> +     * 1. If the bond has no primary previously set, save it and
> +     * revalidate.
> +     * 2. If the bond has a different primary previously set, save the
> +     * new one and revalidate.
> +     * 3. If the bond has the same primary previously set, do nothing.
> +     */
> +    if (s->primary) {
> +        bool changed = false;
> +        if (bond->primary) {
> +            if (strcmp(bond->primary, s->primary)) {
> +                free(bond->primary);
> +                changed = true;
> +            }
> +        } else {
> +            changed = true;
> +        }
> +
> +        if (changed) {
> +            bond->primary = xstrdup(s->primary);
> +            revalidate = true;
> +        }
> +    } else if (bond->primary) {
> +        /*
> +         * If the new settings have no primary interface, but the
> +         * bond already has a primary, remove the bond's primary.
> +         */
> +        free(bond->primary);
> +        bond->primary = NULL;
> +        revalidate = true;
> +    }
> +

Does above code equal to:

    if (!nullable_string_is_equal(bond->primary, s->primary)) {
        free(bond->primary);
        bond->primary = nullable_xstrdup(s->primary);
        revalidate = true;
    }

?

>      if (bond->balance != BM_AB) {
>          if (!bond->recirc_id) {
>              bond->recirc_id = recirc_alloc_id(bond->ofproto);
> @@ -549,6 +586,12 @@ bond_slave_register(struct bond *bond, void *slave_,
>          slave->name = xstrdup(netdev_get_name(netdev));
>          bond->bond_revalidate = true;
>  
> +        if (bond->primary && !strcmp(bond->primary, slave->name)) {
> +            slave->is_primary = true;
> +        } else {
> +            slave->is_primary = false;
> +        }
> +
>          slave->enabled = false;
>          bond_enable_slave(slave, netdev_get_carrier(netdev));
>      }
> @@ -644,6 +687,7 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>  {
>      struct bond_slave *slave;
>      bool revalidate;
> +    struct bond_slave *primary;

Could you keep a reversed x-mass tree here?  It's not a requirement, but at least
we're trying to keep definitions of 'struct' variables close to each other.
You could even combine the definition with '*slave' one.

>  
>      ovs_rwlock_wrlock(&rwlock);
>      if (bond->lacp_status != lacp_status) {
> @@ -659,11 +703,19 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>      }
>  
>      /* Enable slaves based on link status and LACP feedback. */
> +    primary = NULL;
>      HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>          bond_link_status_update(slave);
>          slave->change_seq = seq_read(connectivity_seq_get());
> +
> +        /* Discover if there is an active slave marked "primary". */

Please, use single quotes (') in commnets if possible.

> +        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
> +            primary = slave;
> +        }
>      }
> -    if (!bond->active_slave || !bond->active_slave->enabled) {
> +
> +    if (!bond->active_slave || !bond->active_slave->enabled ||
> +        (primary && primary->enabled && bond->active_slave)) {

We already checked that 'primary' is enabled in a previous loop.
And we don't need to check for 'bond->active_slave' second time.
If it's NULL, we will not reach this condition anyway.

Don't we need to check that bond->active_slave != primary ?

The 'if', I think, should look like this:

    if (!bond->active_slave || !bond->active_slave->enabled ||
        (primary && bond->active_slave != primary)) {

>          bond_choose_active_slave(bond);
>      }
>  
> @@ -1393,6 +1445,20 @@ bond_print_details(struct ds *ds, const struct bond *bond)
>      ds_put_format(ds, "lacp_fallback_ab: %s\n",
>                    bond->lacp_fallback_ab ? "true" : "false");
>  
> +    bool found_primary = false;
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {

There is a similar HAMP iteration loop a bit below, can we combine them
to avoid double iteration?

> +        if (slave->is_primary) {
> +            found_primary = true;
> +        }
> +    }
> +
> +    if (bond->balance == BM_AB) {

This doesn't handle case where balance-tcp falls back to AB.
We might want to print this value regardles of mode, but add
extra words to clarify, i.e. "primary: " vs "active-backup primary: ".

> +        ds_put_format(ds, "primary: %s%s\n",
> +                      bond->primary ? bond->primary : "<none>",
> +                      (!found_primary && bond->primary) ?

Please, move the question mark to the next line.

> +                      " (no such slave)" : "");
> +    }
> +
>      ds_put_cstr(ds, "active slave mac: ");
>      ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
>      slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
> @@ -1862,6 +1928,13 @@ bond_choose_slave(const struct bond *bond)
>  {
>      struct bond_slave *slave, *best;
>  
> +    /* If there's a primary and it's active, return that. */
> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> +        if (slave->is_primary && slave->enabled) {
> +            return slave;
> +        }
> +    }
> +
>      /* Find the last active slave. */
>      slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>      if (slave && slave->enabled) {
> diff --git a/ofproto/bond.h b/ofproto/bond.h
> index e7c3d9bc3..3a923dcfa 100644
> --- a/ofproto/bond.h
> +++ b/ofproto/bond.h
> @@ -45,6 +45,7 @@ struct bond_settings {
>  
>      /* Balancing configuration. */
>      enum bond_mode balance;
> +    const char *primary;        /* For AB balance, primary interface name. */
>      int rebalance_interval;     /* Milliseconds between rebalances.
>                                     Zero to disable rebalancing. */
>  
> diff --git a/tests/lacp.at b/tests/lacp.at
> index 7b460d7be..696ffc6d4 100644
> --- a/tests/lacp.at
> +++ b/tests/lacp.at
> @@ -125,6 +125,7 @@ updelay: 0 ms
>  downdelay: 0 ms
>  lacp_status: negotiated
>  lacp_fallback_ab: false
> +primary: <none>
>  active slave mac: 00:00:00:00:00:00(none)
>  
>  slave p1: disabled
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 41164d735..b2d8c089f 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -29,7 +29,203 @@ AT_CHECK([ovs-appctl revalidator/wait])
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> -AT_SETUP([ofproto-dpif - active-backup bonding])
> +AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
> +
> +# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
> +#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
> +# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> +# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
> +# it should become active.

Please, use 'dnl' for commnets as possible.  I understand that the file is not
that consistent already, but it's better to not make it worse.

> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
> +   add-br br1 -- \
> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> +                  fail-mode=secure -- \
> +   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
> +   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
> +WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])

Configuration updates are not immediate.  It's better to use OVS_WAIT_UNTIL.

> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])

It's better to enable debug before the previous operation.

> +
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl add-flow br1 action=normal])
> +ovs-appctl netdev-dummy/set-admin-state up
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 down
> +ovs-appctl time/stop
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 100
> +ovs-appctl netdev-dummy/set-admin-state p2 up
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +ovs-appctl time/warp 100
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
> +ovs-appctl time/warp 200 100
> +sleep 1

Why sleep here?  You already using time/wrap.  Just modify the wrapping values.

> +AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl

Instead of parsing the log it might be better to dump flows via 'ovs-appctl dpctl/dump-flows'.

> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
> +])
> +
> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl

Port state updates might not be propagated in time.  We need to wait somehow.

> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +])
> +
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])

We just checked the full output of this command, no need to do this again.

> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
> +# Make a switch with 3 ports in a bond, so that when we delete one of
> +# the ports from the bond, there are still 2 ports left and the bond
> +# remains functional.

Same here.

> +OVS_VSWITCHD_START(
> +  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
> +                other_config:bond-primary=p1 -- \
> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
> +   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
> +
> +dnl Make sure the initial primary interface is set
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])

Ditto.

> +
> +dnl Down the primary interface and verify that we switched.  Then
> +dnl bring the primary back and verify that we switched back to the
> +dnl primary.
> +ovs-appctl netdev-dummy/set-admin-state p1 down
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])

Ditto.

> +ovs-appctl netdev-dummy/set-admin-state p1 up
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl

Ditto.

> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Now delete the primary and verify that the output shows that the
> +dnl primary is no longer enslaved
> +ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])

Ditto.

> +
> +dnl Now re-add the primary and verify that the output shows that the
> +dnl primary is available again.
> +dnl
> +dnl First, get the UUIDs of the interfaces that exist on bond0.
> +dnl Strip the trailing ] so that we can add a new UUID to the end.
> +uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
> +dnl Create a new port "p1" and add its UUID to the set of interfaces
> +dnl on bond0.
> +ovs-vsctl \
> +   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
> +   set Port bond0 interfaces="$uuids, @p1]"
> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl

Ditto.

> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: p1
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +dnl Remove the "bond-primary" config directive from the bond.
> +dnl First, find the other_config values on bond0.
> +other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
> +dnl Now re-set the other_config on bond0.
> +ovs-vsctl set Port bond0 other_config="$other_config"

You can remove one entry like this:

  ovs-vsctl remove Port bond0 other_config bond-primary

> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl

Some waiting needed.

> +---- bond0 ----
> +bond_mode: active-backup
> +bond may use recirculation: no, <del>
> +bond-hash-basis: 0
> +updelay: 0 ms
> +downdelay: 0 ms
> +lacp_status: off
> +lacp_fallback_ab: false
> +primary: <none>
> +<active slave mac del>
> +
> +slave p1: enabled
> +  active slave
> +  may_enable: true
> +
> +slave p2: enabled
> +  may_enable: true
> +
> +slave p3: enabled
> +  may_enable: true
> +
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
>  # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
>  #    and br1 with interfaces p3, p4 and p8.
>  # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
> @@ -46,6 +242,7 @@ OVS_VSWITCHD_START(
>     add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>     add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
>  WAIT_FOR_DUMMY_PORTS([p3], [p4])
> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])

Waiting is needed and ned to move this line below the logging configuration.

>  AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
>  
>  AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index fe73c38d4..5f30b7737 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -4563,6 +4563,11 @@ port_configure_bond(struct port *port, struct bond_settings *s)
>                    port->name);
>      }
>  
> +    s->primary = NULL;
> +    if (s->balance == BM_AB) {

Need to check for lacp_fallback_ab_cfg too?

> +        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
> +    }
> +
>      miimon_interval = smap_get_int(&port->cfg->other_config,
>                                     "bond-miimon-interval", 0);
>      if (miimon_interval <= 0) {
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 6d334370d..a2a26e849 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1994,6 +1994,14 @@
>          <code>active-backup</code>.
>        </column>
>  
> +      <column name="other_config" key="bond-primary"
> +              type='{"type": "string"}'>
> +        If a slave interface with this name exists in the bond and
> +        is up, it will be made active.  Relevant only when <ref
> +        column="other_config" key="bond-mode"/> is
> +        <code>active-backup</code>.
> +      </column>
> +
>        <group title="Link Failure Detection">
>          <p>
>            An important part of link bonding is detecting that links are down so
>
Jeff Squyres \(jsquyres\) June 22, 2020, 10:41 p.m. UTC | #5
On Jun 18, 2020, at 9:44 AM, Ilya Maximets <i.maximets@ovn.org> wrote:
> 
> On 5/22/20 11:12 PM, Jeff Squyres via dev wrote:
>> In AB bonding, if the current active slave becomes disabled, a
>> replacement slave is arbitrarily picked from the remaining set of
>> enabled slaves.  This commit adds the concept of a "primary" slave: an
>> interface that will always be (or become) the current active slave if
>> it is enabled.
>> 
>> The rationale for this functionality is to allow the designation of a
>> preferred interface for a given bond.  For example:
>> 
>> 1. Bond is created with interfaces p1 (primary) and p2, both enabled.
>> 2. p1 becomes the current active slave (because it was designated as
>>   the primary).
>> 3. Later, p1 fails/becomes disabled.
>> 4. p2 is chosen to become the current active slave.
>> 5. Later, p1 becomes re-enabled.
>> 6. p1 is chosen to become the current active slave (because it was
>>   designated as the primary)
>> 
>> Note that p1 becomes the active slave once it becomes re-enabled, even
>> if nothing has happened to p2.
>> 
>> This "primary" concept exists in Linux kernel network interface
>> bonding, but did not previously exist in OVS bonding.
>> 
>> Only one primary slave inteface is supported per bond, and is only
>> supported for active/backup bonding.
> 
> Does it work for the case where balanced bonding falls back to active-backup?
> It seems like it should, but I see that code doesn't handle this in a few
> places.  IMHO, it might make sence to add this support.
> 
> Comments inline.
> 
> Best regards, Ilya Maximets.
> 
>> 
>> The primary slave interface is designated via
>> "other_config:bond-primary" when creating a bond.
>> 
>> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
>> Reviewed-by: Aaron Conole <aconole@redhat.com>
>> Tested-by: Greg Rose <gvrose8192@gmail.com>
>> Acked-by: Greg Rose <gvrose8192@gmail.com>
>> ---
>> v7:
>> - After rebasing patch to head of tree, adjust test output as result
>>  of changes from 81f71381f.
>> 
>> v6:
>> - Style: bleep bloop.  Fix use of {}.
>> 
>> v5:
>> - Handle when bond is reconfigured to remove "bond-primary" config.
>> - Fix potential flapping between remaining slaves.
>> - Add a test to re-add a removed primary interface and make sure the
>>  bond reflects that properly.
>> - Add a test to remove "bond-primary" config and make sure the bond
>>  reflects that properly.
>> 
>> v4:
>> - Style: bleep bloop.  Trim line length below 79 characters.
>> 
>> v3:
>> - Adjusted print logic for when the primary slave is not attached
>> 
>> v2:
>> - Added "ovs-vsctl bond/show" label when primary interface is no
>>  longer enslaved
>> - Fixed strcmp() usage nits
>> - Moved "other_config:bond-primary" docs to the "Bonding
>>  Configuration" group
>> - Expanded commit message
>> - Added more test cases (including one for when the primary interface
>>  is no longer enslaved)
>> 
>> ofproto/bond.c        |  75 +++++++++++++++-
>> ofproto/bond.h        |   1 +
>> tests/lacp.at         |   1 +
>> tests/ofproto-dpif.at | 199 +++++++++++++++++++++++++++++++++++++++++-
>> vswitchd/bridge.c     |   5 ++
>> vswitchd/vswitch.xml  |   8 ++
>> 6 files changed, 287 insertions(+), 2 deletions(-)
>> 
>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>> index 405202fb6..4ad1a20ae 100644
>> --- a/ofproto/bond.c
>> +++ b/ofproto/bond.c
>> @@ -93,6 +93,7 @@ struct bond_slave {
>>     /* Link status. */
>>     bool enabled;               /* May be chosen for flows? */
>>     bool may_enable;            /* Client considers this slave bondable. */
>> +    bool is_primary;            /* This slave is preferred over others. */
>>     long long delay_expires;    /* Time after which 'enabled' may change. */
>> 
>>     /* Rebalancing info.  Used only by bond_rebalance(). */
>> @@ -126,6 +127,7 @@ struct bond {
>>     enum lacp_status lacp_status; /* Status of LACP negotiations. */
>>     bool bond_revalidate;       /* True if flows need revalidation. */
>>     uint32_t basis;             /* Basis for flow hash function. */
>> +    char *primary;              /* Name of the primary slave interface. */
>> 
>>     /* SLB specific bonding info. */
>>     struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
>> @@ -241,6 +243,7 @@ bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
>> 
>>     bond->active_slave_mac = eth_addr_zero;
>>     bond->active_slave_changed = false;
>> +    bond->primary = NULL;
>> 
>>     bond_reconfigure(bond, s);
>>     return bond;
>> @@ -290,6 +293,7 @@ bond_unref(struct bond *bond)
>>     update_recirc_rules__(bond);
>> 
>>     hmap_destroy(&bond->pr_rule_ops);
>> +    free(bond->primary);
>>     free(bond->name);
>>     free(bond);
>> }
>> @@ -459,6 +463,39 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s)
>>         bond->bond_revalidate = false;
>>     }
>> 
>> +    /*
>> +     * If a primary interface is set on the new settings:
>> +     * 1. If the bond has no primary previously set, save it and
>> +     * revalidate.
>> +     * 2. If the bond has a different primary previously set, save the
>> +     * new one and revalidate.
>> +     * 3. If the bond has the same primary previously set, do nothing.
>> +     */
>> +    if (s->primary) {
>> +        bool changed = false;
>> +        if (bond->primary) {
>> +            if (strcmp(bond->primary, s->primary)) {
>> +                free(bond->primary);
>> +                changed = true;
>> +            }
>> +        } else {
>> +            changed = true;
>> +        }
>> +
>> +        if (changed) {
>> +            bond->primary = xstrdup(s->primary);
>> +            revalidate = true;
>> +        }
>> +    } else if (bond->primary) {
>> +        /*
>> +         * If the new settings have no primary interface, but the
>> +         * bond already has a primary, remove the bond's primary.
>> +         */
>> +        free(bond->primary);
>> +        bond->primary = NULL;
>> +        revalidate = true;
>> +    }
>> +
> 
> Does above code equal to:
> 
>    if (!nullable_string_is_equal(bond->primary, s->primary)) {
>        free(bond->primary);
>        bond->primary = nullable_xstrdup(s->primary);
>        revalidate = true;
>    }
> 
> ?

Yes, that's much more compact / better.  I didn't realize that these utility functions were available.

>>     if (bond->balance != BM_AB) {
>>         if (!bond->recirc_id) {
>>             bond->recirc_id = recirc_alloc_id(bond->ofproto);
>> @@ -549,6 +586,12 @@ bond_slave_register(struct bond *bond, void *slave_,
>>         slave->name = xstrdup(netdev_get_name(netdev));
>>         bond->bond_revalidate = true;
>> 
>> +        if (bond->primary && !strcmp(bond->primary, slave->name)) {
>> +            slave->is_primary = true;
>> +        } else {
>> +            slave->is_primary = false;
>> +        }
>> +
>>         slave->enabled = false;
>>         bond_enable_slave(slave, netdev_get_carrier(netdev));
>>     }
>> @@ -644,6 +687,7 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>> {
>>     struct bond_slave *slave;
>>     bool revalidate;
>> +    struct bond_slave *primary;
> 
> Could you keep a reversed x-mass tree here?  It's not a requirement, but at least
> we're trying to keep definitions of 'struct' variables close to each other.
> You could even combine the definition with '*slave' one.

Done.

>> 
>>     ovs_rwlock_wrlock(&rwlock);
>>     if (bond->lacp_status != lacp_status) {
>> @@ -659,11 +703,19 @@ bond_run(struct bond *bond, enum lacp_status lacp_status)
>>     }
>> 
>>     /* Enable slaves based on link status and LACP feedback. */
>> +    primary = NULL;
>>     HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>>         bond_link_status_update(slave);
>>         slave->change_seq = seq_read(connectivity_seq_get());
>> +
>> +        /* Discover if there is an active slave marked "primary". */
> 
> Please, use single quotes (') in commnets if possible.

Done.

>> +        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
>> +            primary = slave;
>> +        }
>>     }
>> -    if (!bond->active_slave || !bond->active_slave->enabled) {
>> +
>> +    if (!bond->active_slave || !bond->active_slave->enabled ||
>> +        (primary && primary->enabled && bond->active_slave)) {
> 
> We already checked that 'primary' is enabled in a previous loop.
> And we don't need to check for 'bond->active_slave' second time.
> If it's NULL, we will not reach this condition anyway.
> 
> Don't we need to check that bond->active_slave != primary ?
> 
> The 'if', I think, should look like this:
> 
>    if (!bond->active_slave || !bond->active_slave->enabled ||
>        (primary && bond->active_slave != primary)) {

Done.

>>         bond_choose_active_slave(bond);
>>     }
>> 
>> @@ -1393,6 +1445,20 @@ bond_print_details(struct ds *ds, const struct bond *bond)
>>     ds_put_format(ds, "lacp_fallback_ab: %s\n",
>>                   bond->lacp_fallback_ab ? "true" : "false");
>> 
>> +    bool found_primary = false;
>> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
> 
> There is a similar HAMP iteration loop a bit below, can we combine them
> to avoid double iteration?

Done (I moved the "slave = ..." assignment up, and put the shash_add() in this HMAP_FOR_EACH loop).

>> +        if (slave->is_primary) {
>> +            found_primary = true;
>> +        }
>> +    }
>> +
>> +    if (bond->balance == BM_AB) {
> 
> This doesn't handle case where balance-tcp falls back to AB.
> We might want to print this value regardles of mode, but add
> extra words to clarify, i.e. "primary: " vs "active-backup primary: ".

Done.

>> +        ds_put_format(ds, "primary: %s%s\n",
>> +                      bond->primary ? bond->primary : "<none>",
>> +                      (!found_primary && bond->primary) ?
> 
> Please, move the question mark to the next line.

Done.

>> +                      " (no such slave)" : "");
>> +    }
>> +
>>     ds_put_cstr(ds, "active slave mac: ");
>>     ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
>>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>> @@ -1862,6 +1928,13 @@ bond_choose_slave(const struct bond *bond)
>> {
>>     struct bond_slave *slave, *best;
>> 
>> +    /* If there's a primary and it's active, return that. */
>> +    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
>> +        if (slave->is_primary && slave->enabled) {
>> +            return slave;
>> +        }
>> +    }
>> +
>>     /* Find the last active slave. */
>>     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
>>     if (slave && slave->enabled) {
>> diff --git a/ofproto/bond.h b/ofproto/bond.h
>> index e7c3d9bc3..3a923dcfa 100644
>> --- a/ofproto/bond.h
>> +++ b/ofproto/bond.h
>> @@ -45,6 +45,7 @@ struct bond_settings {
>> 
>>     /* Balancing configuration. */
>>     enum bond_mode balance;
>> +    const char *primary;        /* For AB balance, primary interface name. */
>>     int rebalance_interval;     /* Milliseconds between rebalances.
>>                                    Zero to disable rebalancing. */
>> 
>> diff --git a/tests/lacp.at b/tests/lacp.at
>> index 7b460d7be..696ffc6d4 100644
>> --- a/tests/lacp.at
>> +++ b/tests/lacp.at
>> @@ -125,6 +125,7 @@ updelay: 0 ms
>> downdelay: 0 ms
>> lacp_status: negotiated
>> lacp_fallback_ab: false
>> +primary: <none>
>> active slave mac: 00:00:00:00:00:00(none)
>> 
>> slave p1: disabled
>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
>> index 41164d735..b2d8c089f 100644
>> --- a/tests/ofproto-dpif.at
>> +++ b/tests/ofproto-dpif.at
>> @@ -29,7 +29,203 @@ AT_CHECK([ovs-appctl revalidator/wait])
>> OVS_VSWITCHD_STOP
>> AT_CLEANUP
>> 
>> -AT_SETUP([ofproto-dpif - active-backup bonding])
>> +AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
>> +
>> +# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
>> +#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
>> +# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
>> +# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
>> +# it should become active.
> 
> Please, use 'dnl' for commnets as possible.  I understand that the file is not
> that consistent already, but it's better to not make it worse.

Done.

>> +OVS_VSWITCHD_START(
>> +  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
>> +                other_config:bond-primary=p1 -- \
>> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
>> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
>> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
>> +   add-br br1 -- \
>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
>> +                  fail-mode=secure -- \
>> +   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
>> +   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>> +   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
>> +WAIT_FOR_DUMMY_PORTS([p3], [p4])
>> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> 
> Configuration updates are not immediate.  It's better to use OVS_WAIT_UNTIL.

Done.

>> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
> 
> It's better to enable debug before the previous operation.

Done.

>> +
>> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
>> +AT_CHECK([ovs-ofctl add-flow br1 action=normal])
>> +ovs-appctl netdev-dummy/set-admin-state up
>> +ovs-appctl time/warp 100
>> +ovs-appctl netdev-dummy/set-admin-state p2 down
>> +ovs-appctl time/stop
>> +ovs-appctl time/warp 100
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
>> +ovs-appctl time/warp 100
>> +ovs-appctl netdev-dummy/set-admin-state p2 up
>> +ovs-appctl netdev-dummy/set-admin-state p1 down
>> +ovs-appctl time/warp 100
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
>> +AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
>> +ovs-appctl time/warp 200 100
>> +sleep 1
> 
> Why sleep here?  You already using time/wrap.  Just modify the wrapping values.

Sorry for the delay -- I'm still trying to figure out what several of the rest of your comments mean; it's taking me a little time.

Is there any documentation on the OVS testing framework?  I figured out these tests by copying/tweaking the existing [non-primary] test (thankfully, I have a decent understanding of Autoconf/m4).  E.g., this "sleep" and the subsequent grep -- they're inherited from the (non-primary) test.  I can't find any docs for time/warp, which leads to digging through source, etc.

>> +AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
> 
> Instead of parsing the log it might be better to dump flows via 'ovs-appctl dpctl/dump-flows'.

I'll investigate.

(For all the other comments below that I didn't respond to: I'm investigating)

>> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
>> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
>> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
>> +recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
>> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
>> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
>> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
>> +recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
>> +])
>> +
>> +ovs-appctl netdev-dummy/set-admin-state p1 up
>> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> 
> Port state updates might not be propagated in time.  We need to wait somehow.

>> +---- bond0 ----
>> +bond_mode: active-backup
>> +bond may use recirculation: no, <del>
>> +bond-hash-basis: 0
>> +updelay: 0 ms
>> +downdelay: 0 ms
>> +lacp_status: off
>> +lacp_fallback_ab: false
>> +primary: p1
>> +<active slave mac del>
>> +
>> +slave p1: enabled
>> +  active slave
>> +  may_enable: true
>> +
>> +slave p2: enabled
>> +  may_enable: true
>> +
>> +])
>> +
>> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> 
> We just checked the full output of this command, no need to do this again.

Will do.

>> +
>> +OVS_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>> +AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
>> +# Make a switch with 3 ports in a bond, so that when we delete one of
>> +# the ports from the bond, there are still 2 ports left and the bond
>> +# remains functional.
> 
> Same here.
> 
>> +OVS_VSWITCHD_START(
>> +  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
>> +                other_config:bond-primary=p1 -- \
>> +   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
>> +   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
>> +   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
>> +   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
>> +
>> +dnl Make sure the initial primary interface is set
>> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
> 
> Ditto.
> 
>> +
>> +dnl Down the primary interface and verify that we switched.  Then
>> +dnl bring the primary back and verify that we switched back to the
>> +dnl primary.
>> +ovs-appctl netdev-dummy/set-admin-state p1 down
>> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])
> 
> Ditto.
> 
>> +ovs-appctl netdev-dummy/set-admin-state p1 up
>> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> 
> Ditto.
> 
>> +---- bond0 ----
>> +bond_mode: active-backup
>> +bond may use recirculation: no, <del>
>> +bond-hash-basis: 0
>> +updelay: 0 ms
>> +downdelay: 0 ms
>> +lacp_status: off
>> +lacp_fallback_ab: false
>> +primary: p1
>> +<active slave mac del>
>> +
>> +slave p1: enabled
>> +  active slave
>> +  may_enable: true
>> +
>> +slave p2: enabled
>> +  may_enable: true
>> +
>> +slave p3: enabled
>> +  may_enable: true
>> +
>> +])
>> +
>> +dnl Now delete the primary and verify that the output shows that the
>> +dnl primary is no longer enslaved
>> +ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
>> +AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])
> 
> Ditto.
> 
>> +
>> +dnl Now re-add the primary and verify that the output shows that the
>> +dnl primary is available again.
>> +dnl
>> +dnl First, get the UUIDs of the interfaces that exist on bond0.
>> +dnl Strip the trailing ] so that we can add a new UUID to the end.
>> +uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
>> +dnl Create a new port "p1" and add its UUID to the set of interfaces
>> +dnl on bond0.
>> +ovs-vsctl \
>> +   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
>> +   set Port bond0 interfaces="$uuids, @p1]"
>> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> 
> Ditto.
> 
>> +---- bond0 ----
>> +bond_mode: active-backup
>> +bond may use recirculation: no, <del>
>> +bond-hash-basis: 0
>> +updelay: 0 ms
>> +downdelay: 0 ms
>> +lacp_status: off
>> +lacp_fallback_ab: false
>> +primary: p1
>> +<active slave mac del>
>> +
>> +slave p1: enabled
>> +  active slave
>> +  may_enable: true
>> +
>> +slave p2: enabled
>> +  may_enable: true
>> +
>> +slave p3: enabled
>> +  may_enable: true
>> +
>> +])
>> +
>> +dnl Remove the "bond-primary" config directive from the bond.
>> +dnl First, find the other_config values on bond0.
>> +other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
>> +dnl Now re-set the other_config on bond0.
>> +ovs-vsctl set Port bond0 other_config="$other_config"
> 
> You can remove one entry like this:
> 
>  ovs-vsctl remove Port bond0 other_config bond-primary
> 
>> +AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
> 
> Some waiting needed.
> 
>> +---- bond0 ----
>> +bond_mode: active-backup
>> +bond may use recirculation: no, <del>
>> +bond-hash-basis: 0
>> +updelay: 0 ms
>> +downdelay: 0 ms
>> +lacp_status: off
>> +lacp_fallback_ab: false
>> +primary: <none>
>> +<active slave mac del>
>> +
>> +slave p1: enabled
>> +  active slave
>> +  may_enable: true
>> +
>> +slave p2: enabled
>> +  may_enable: true
>> +
>> +slave p3: enabled
>> +  may_enable: true
>> +
>> +])
>> +
>> +OVS_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>> +AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
>> # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
>> #    and br1 with interfaces p3, p4 and p8.
>> # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
>> @@ -46,6 +242,7 @@ OVS_VSWITCHD_START(
>>    add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
>>    add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
>> WAIT_FOR_DUMMY_PORTS([p3], [p4])
>> +AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])
> 
> Waiting is needed and ned to move this line below the logging configuration.
> 
>> AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
>> 
>> AT_CHECK([ovs-ofctl add-flow br0 action=normal])
>> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
>> index fe73c38d4..5f30b7737 100644
>> --- a/vswitchd/bridge.c
>> +++ b/vswitchd/bridge.c
>> @@ -4563,6 +4563,11 @@ port_configure_bond(struct port *port, struct bond_settings *s)
>>                   port->name);
>>     }
>> 
>> +    s->primary = NULL;
>> +    if (s->balance == BM_AB) {
> 
> Need to check for lacp_fallback_ab_cfg too?

Done.

>> +        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
>> +    }
>> +
>>     miimon_interval = smap_get_int(&port->cfg->other_config,
>>                                    "bond-miimon-interval", 0);
>>     if (miimon_interval <= 0) {
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index 6d334370d..a2a26e849 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -1994,6 +1994,14 @@
>>         <code>active-backup</code>.
>>       </column>
>> 
>> +      <column name="other_config" key="bond-primary"
>> +              type='{"type": "string"}'>
>> +        If a slave interface with this name exists in the bond and
>> +        is up, it will be made active.  Relevant only when <ref
>> +        column="other_config" key="bond-mode"/> is
>> +        <code>active-backup</code>.
>> +      </column>
>> +
>>       <group title="Link Failure Detection">
>>         <p>
>>           An important part of link bonding is detecting that links are down so
>> 
>
Jeff Squyres \(jsquyres\) June 23, 2020, 1:46 p.m. UTC | #6
On Jun 22, 2020, at 6:41 PM, Jeff Squyres (jsquyres) via dev <ovs-dev@openvswitch.org> wrote:
> 
> On Jun 18, 2020, at 9:44 AM, Ilya Maximets <i.maximets@ovn.org> wrote:
>> 
>> Comments inline.

Thanks for all the feedback and suggestions!

I think I figured everything out and will sending v8 of the patch shortly.
diff mbox series

Patch

diff --git a/ofproto/bond.c b/ofproto/bond.c
index 405202fb6..4ad1a20ae 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -93,6 +93,7 @@  struct bond_slave {
     /* Link status. */
     bool enabled;               /* May be chosen for flows? */
     bool may_enable;            /* Client considers this slave bondable. */
+    bool is_primary;            /* This slave is preferred over others. */
     long long delay_expires;    /* Time after which 'enabled' may change. */
 
     /* Rebalancing info.  Used only by bond_rebalance(). */
@@ -126,6 +127,7 @@  struct bond {
     enum lacp_status lacp_status; /* Status of LACP negotiations. */
     bool bond_revalidate;       /* True if flows need revalidation. */
     uint32_t basis;             /* Basis for flow hash function. */
+    char *primary;              /* Name of the primary slave interface. */
 
     /* SLB specific bonding info. */
     struct bond_entry *hash;     /* An array of BOND_BUCKETS elements. */
@@ -241,6 +243,7 @@  bond_create(const struct bond_settings *s, struct ofproto_dpif *ofproto)
 
     bond->active_slave_mac = eth_addr_zero;
     bond->active_slave_changed = false;
+    bond->primary = NULL;
 
     bond_reconfigure(bond, s);
     return bond;
@@ -290,6 +293,7 @@  bond_unref(struct bond *bond)
     update_recirc_rules__(bond);
 
     hmap_destroy(&bond->pr_rule_ops);
+    free(bond->primary);
     free(bond->name);
     free(bond);
 }
@@ -459,6 +463,39 @@  bond_reconfigure(struct bond *bond, const struct bond_settings *s)
         bond->bond_revalidate = false;
     }
 
+    /*
+     * If a primary interface is set on the new settings:
+     * 1. If the bond has no primary previously set, save it and
+     * revalidate.
+     * 2. If the bond has a different primary previously set, save the
+     * new one and revalidate.
+     * 3. If the bond has the same primary previously set, do nothing.
+     */
+    if (s->primary) {
+        bool changed = false;
+        if (bond->primary) {
+            if (strcmp(bond->primary, s->primary)) {
+                free(bond->primary);
+                changed = true;
+            }
+        } else {
+            changed = true;
+        }
+
+        if (changed) {
+            bond->primary = xstrdup(s->primary);
+            revalidate = true;
+        }
+    } else if (bond->primary) {
+        /*
+         * If the new settings have no primary interface, but the
+         * bond already has a primary, remove the bond's primary.
+         */
+        free(bond->primary);
+        bond->primary = NULL;
+        revalidate = true;
+    }
+
     if (bond->balance != BM_AB) {
         if (!bond->recirc_id) {
             bond->recirc_id = recirc_alloc_id(bond->ofproto);
@@ -549,6 +586,12 @@  bond_slave_register(struct bond *bond, void *slave_,
         slave->name = xstrdup(netdev_get_name(netdev));
         bond->bond_revalidate = true;
 
+        if (bond->primary && !strcmp(bond->primary, slave->name)) {
+            slave->is_primary = true;
+        } else {
+            slave->is_primary = false;
+        }
+
         slave->enabled = false;
         bond_enable_slave(slave, netdev_get_carrier(netdev));
     }
@@ -644,6 +687,7 @@  bond_run(struct bond *bond, enum lacp_status lacp_status)
 {
     struct bond_slave *slave;
     bool revalidate;
+    struct bond_slave *primary;
 
     ovs_rwlock_wrlock(&rwlock);
     if (bond->lacp_status != lacp_status) {
@@ -659,11 +703,19 @@  bond_run(struct bond *bond, enum lacp_status lacp_status)
     }
 
     /* Enable slaves based on link status and LACP feedback. */
+    primary = NULL;
     HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
         bond_link_status_update(slave);
         slave->change_seq = seq_read(connectivity_seq_get());
+
+        /* Discover if there is an active slave marked "primary". */
+        if (bond->balance == BM_AB && slave->is_primary && slave->enabled) {
+            primary = slave;
+        }
     }
-    if (!bond->active_slave || !bond->active_slave->enabled) {
+
+    if (!bond->active_slave || !bond->active_slave->enabled ||
+        (primary && primary->enabled && bond->active_slave)) {
         bond_choose_active_slave(bond);
     }
 
@@ -1393,6 +1445,20 @@  bond_print_details(struct ds *ds, const struct bond *bond)
     ds_put_format(ds, "lacp_fallback_ab: %s\n",
                   bond->lacp_fallback_ab ? "true" : "false");
 
+    bool found_primary = false;
+    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
+        if (slave->is_primary) {
+            found_primary = true;
+        }
+    }
+
+    if (bond->balance == BM_AB) {
+        ds_put_format(ds, "primary: %s%s\n",
+                      bond->primary ? bond->primary : "<none>",
+                      (!found_primary && bond->primary) ?
+                      " (no such slave)" : "");
+    }
+
     ds_put_cstr(ds, "active slave mac: ");
     ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac));
     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
@@ -1862,6 +1928,13 @@  bond_choose_slave(const struct bond *bond)
 {
     struct bond_slave *slave, *best;
 
+    /* If there's a primary and it's active, return that. */
+    HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) {
+        if (slave->is_primary && slave->enabled) {
+            return slave;
+        }
+    }
+
     /* Find the last active slave. */
     slave = bond_find_slave_by_mac(bond, bond->active_slave_mac);
     if (slave && slave->enabled) {
diff --git a/ofproto/bond.h b/ofproto/bond.h
index e7c3d9bc3..3a923dcfa 100644
--- a/ofproto/bond.h
+++ b/ofproto/bond.h
@@ -45,6 +45,7 @@  struct bond_settings {
 
     /* Balancing configuration. */
     enum bond_mode balance;
+    const char *primary;        /* For AB balance, primary interface name. */
     int rebalance_interval;     /* Milliseconds between rebalances.
                                    Zero to disable rebalancing. */
 
diff --git a/tests/lacp.at b/tests/lacp.at
index 7b460d7be..696ffc6d4 100644
--- a/tests/lacp.at
+++ b/tests/lacp.at
@@ -125,6 +125,7 @@  updelay: 0 ms
 downdelay: 0 ms
 lacp_status: negotiated
 lacp_fallback_ab: false
+primary: <none>
 active slave mac: 00:00:00:00:00:00(none)
 
 slave p1: disabled
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 41164d735..b2d8c089f 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -29,7 +29,203 @@  AT_CHECK([ovs-appctl revalidator/wait])
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-AT_SETUP([ofproto-dpif - active-backup bonding])
+AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
+
+# Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and
+#    p2 (p1 as primary) and br1 with interfaces p3, p4 and p8.
+# toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
+# With p1 down and p2 up/active, bring p1 back up.  Since p1 is the primary,
+# it should become active.
+OVS_VSWITCHD_START(
+  [add-bond br0 bond0 p1 p2 bond_mode=active-backup \
+                other_config:bond-primary=p1 -- \
+   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
+   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
+   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy -- \
+   add-br br1 -- \
+   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
+   set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
+                  fail-mode=secure -- \
+   add-port br1 p3 -- set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock ofport_request=3 -- \
+   add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
+   add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
+WAIT_FOR_DUMMY_PORTS([p3], [p4])
+AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
+AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
+
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl add-flow br1 action=normal])
+ovs-appctl netdev-dummy/set-admin-state up
+ovs-appctl time/warp 100
+ovs-appctl netdev-dummy/set-admin-state p2 down
+ovs-appctl time/stop
+ovs-appctl time/warp 100
+AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
+AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
+ovs-appctl time/warp 100
+ovs-appctl netdev-dummy/set-admin-state p2 up
+ovs-appctl netdev-dummy/set-admin-state p1 down
+ovs-appctl time/warp 100
+AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(src=10.0.0.5,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
+AT_CHECK([ovs-appctl netdev-dummy/receive p7 'in_port(7),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(src=10.0.0.6,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'])
+ovs-appctl time/warp 200 100
+sleep 1
+AT_CHECK([grep 'in_port([[348]])' ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
+recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(frag=no), actions: <del>
+recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
+recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(frag=no), actions: <del>
+recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
+recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0d),eth_type(0x0800),ipv4(frag=no), actions: <del>
+recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0e),eth_type(0x0800),ipv4(frag=no), actions: <del>
+recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:09,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
+recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:0b,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8035), actions: <del>
+])
+
+ovs-appctl netdev-dummy/set-admin-state p1 up
+AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
+---- bond0 ----
+bond_mode: active-backup
+bond may use recirculation: no, <del>
+bond-hash-basis: 0
+updelay: 0 ms
+downdelay: 0 ms
+lacp_status: off
+lacp_fallback_ab: false
+primary: p1
+<active slave mac del>
+
+slave p1: enabled
+  active slave
+  may_enable: true
+
+slave p2: enabled
+  may_enable: true
+
+])
+
+AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - active-backup bonding (primary validation)])
+# Make a switch with 3 ports in a bond, so that when we delete one of
+# the ports from the bond, there are still 2 ports left and the bond
+# remains functional.
+OVS_VSWITCHD_START(
+  [add-bond br0 bond0 p1 p2 p3 bond_mode=active-backup \
+                other_config:bond-primary=p1 -- \
+   set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
+   set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \
+   set interface p3 type=dummy options:pstream=punix:$OVS_RUNDIR/p3.sock ofport_request=3 -- \
+   add-port br0 p7 -- set interface p7 ofport_request=7 type=dummy --])
+
+dnl Make sure the initial primary interface is set
+AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: p1'`"])
+
+dnl Down the primary interface and verify that we switched.  Then
+dnl bring the primary back and verify that we switched back to the
+dnl primary.
+ovs-appctl netdev-dummy/set-admin-state p1 down
+AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'slave p1: disabled'`"])
+ovs-appctl netdev-dummy/set-admin-state p1 up
+AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
+---- bond0 ----
+bond_mode: active-backup
+bond may use recirculation: no, <del>
+bond-hash-basis: 0
+updelay: 0 ms
+downdelay: 0 ms
+lacp_status: off
+lacp_fallback_ab: false
+primary: p1
+<active slave mac del>
+
+slave p1: enabled
+  active slave
+  may_enable: true
+
+slave p2: enabled
+  may_enable: true
+
+slave p3: enabled
+  may_enable: true
+
+])
+
+dnl Now delete the primary and verify that the output shows that the
+dnl primary is no longer enslaved
+ovs-vsctl --id=@p1 get Interface p1 -- remove Port bond0 interfaces @p1
+AT_CHECK([test -n "`ovs-appctl bond/show | fgrep 'primary: p1 (no such slave)'`"])
+
+dnl Now re-add the primary and verify that the output shows that the
+dnl primary is available again.
+dnl
+dnl First, get the UUIDs of the interfaces that exist on bond0.
+dnl Strip the trailing ] so that we can add a new UUID to the end.
+uuids=`ovs-vsctl get Port bond0 interfaces | sed -e 's/]//'`
+dnl Create a new port "p1" and add its UUID to the set of interfaces
+dnl on bond0.
+ovs-vsctl \
+   --id=@p1 create Interface name=p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \
+   set Port bond0 interfaces="$uuids, @p1]"
+AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
+---- bond0 ----
+bond_mode: active-backup
+bond may use recirculation: no, <del>
+bond-hash-basis: 0
+updelay: 0 ms
+downdelay: 0 ms
+lacp_status: off
+lacp_fallback_ab: false
+primary: p1
+<active slave mac del>
+
+slave p1: enabled
+  active slave
+  may_enable: true
+
+slave p2: enabled
+  may_enable: true
+
+slave p3: enabled
+  may_enable: true
+
+])
+
+dnl Remove the "bond-primary" config directive from the bond.
+dnl First, find the other_config values on bond0.
+other_config=`ovs-vsctl get Port bond0 other_config | sed -e 's/\(, \)*bond-primary=p1//'`
+dnl Now re-set the other_config on bond0.
+ovs-vsctl set Port bond0 other_config="$other_config"
+AT_CHECK([ovs-appctl bond/show | STRIP_RECIRC_ID | STRIP_ACTIVE_SLAVE_MAC], [0], [dnl
+---- bond0 ----
+bond_mode: active-backup
+bond may use recirculation: no, <del>
+bond-hash-basis: 0
+updelay: 0 ms
+downdelay: 0 ms
+lacp_status: off
+lacp_fallback_ab: false
+primary: <none>
+<active slave mac del>
+
+slave p1: enabled
+  active slave
+  may_enable: true
+
+slave p2: enabled
+  may_enable: true
+
+slave p3: enabled
+  may_enable: true
+
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - active-backup bonding (without primary)])
 # Create br0 with interfaces p1, p2 and p7, creating bond0 with p1 and p2
 #    and br1 with interfaces p3, p4 and p8.
 # toggle p1,p2 of bond0 up and down to test bonding in active-backup mode.
@@ -46,6 +242,7 @@  OVS_VSWITCHD_START(
    add-port br1 p4 -- set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock ofport_request=4 -- \
    add-port br1 p8 -- set interface p8 ofport_request=8 type=dummy --])
 WAIT_FOR_DUMMY_PORTS([p3], [p4])
+AT_CHECK([test -n "`ovs-appctl bond/show | grep 'primary: <none>'`"])
 AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
 
 AT_CHECK([ovs-ofctl add-flow br0 action=normal])
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index fe73c38d4..5f30b7737 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -4563,6 +4563,11 @@  port_configure_bond(struct port *port, struct bond_settings *s)
                   port->name);
     }
 
+    s->primary = NULL;
+    if (s->balance == BM_AB) {
+        s->primary = smap_get(&port->cfg->other_config, "bond-primary");
+    }
+
     miimon_interval = smap_get_int(&port->cfg->other_config,
                                    "bond-miimon-interval", 0);
     if (miimon_interval <= 0) {
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 6d334370d..a2a26e849 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1994,6 +1994,14 @@ 
         <code>active-backup</code>.
       </column>
 
+      <column name="other_config" key="bond-primary"
+              type='{"type": "string"}'>
+        If a slave interface with this name exists in the bond and
+        is up, it will be made active.  Relevant only when <ref
+        column="other_config" key="bond-mode"/> is
+        <code>active-backup</code>.
+      </column>
+
       <group title="Link Failure Detection">
         <p>
           An important part of link bonding is detecting that links are down so