mbox series

[ovs-dev,v3,00/16] Support additional-chassis for ports

Message ID 20220217151712.2292329-1-ihrachys@redhat.com
Headers show
Series Support additional-chassis for ports | expand

Message

Ihar Hrachyshka Feb. 17, 2022, 3:16 p.m. UTC
This version of the series is complete. It contains the previously
missing ddlog implementation; also RARP that is used for additional
chassis activation is now reinjected into the pipeline after blocking
flows are deleted by vswitchd.

Ihar Hrachyshka (16):
  tests: log more info on OVN_CHECK_PACKETS* failure
  tests: don't bail from OVN_CHECK_PACKETS_CONTAIN prematurily
  Introduce chassis_is_vtep
  northd: introduce separate function to look up chassis
  northd: separate code for nb->sb port binding chassis update
  Pass chassis and encap into get_port_binding_tun
  Introduce match_outport_dp_and_port_keys in physical.c
  Split code to set zone info into put_zones_ofpacts
  Use get_port_binding_tun instead of chassis_tunnel_find
  Tag all packets that arrived from a tunnel as LOCAL_ONLY
  Update port-up on main chassis only
  Introduce LSP:options:requested-additional-chassis
  Clone packets to both port chassis
  Enforce tunneling when additional-chassis is set
  Implement RARP activation strategy for ports
  Reinject RARP packet when activation-strategy=rarp

---

v0: initial draft (single patch)
v1: split into pieces
v1: renamed options: migration-destination ->
                     requested-additional-chassis,
                     migration-unblocked ->
                     additional-chassis-activated
v1: introduced options:activation-strategy=rarp to allow for other
    strategies / having default no-op strategy
v1: implement in-memory port-activated tracking to avoid races
v1: numerous code cleanup / bug fixes
v1: special handling for localnet attached switches
v2: added ddlog implementation
v2: re-inject RARP packet after vswitch updates flows
v3: re-sent as a single series

---

 controller/binding.c        | 188 +++++--
 controller/binding.h        |   2 +
 controller/if-status.c      |  15 +-
 controller/if-status.h      |   1 +
 controller/lport.c          |  19 +-
 controller/ovn-controller.c |   4 +-
 controller/physical.c       | 390 ++++++++++----
 controller/pinctrl.c        | 254 ++++++++-
 controller/pinctrl.h        |   2 +
 include/ovn/actions.h       |   9 +
 lib/actions.c               |  40 +-
 northd/northd.c             | 130 +++--
 northd/ovn-northd.c         |   7 +-
 northd/ovn-sb.dlopts        |   2 +
 northd/ovn_northd.dl        | 169 +++++-
 ovn-nb.xml                  |  18 +
 ovn-sb.ovsschema            |  17 +-
 ovn-sb.xml                  |  73 ++-
 tests/ovn.at                | 991 +++++++++++++++++++++++++++++++++++-
 utilities/ovn-trace.c       |   3 +
 20 files changed, 2110 insertions(+), 224 deletions(-)

Comments

Mark Michelson Feb. 17, 2022, 9:55 p.m. UTC | #1
Hi Ihar,

I ran out of time to be reviewing this today, but what I can say is that

patches 1-6: Acked-by: Mark Michelson <mmichels@redhat.com>

patch 7: I have a small quibble with. See my response on it.

patches 8-9: Acked-by: Mark Michelson <mmichels@redhat.com>

I had a look at patch 10 but don't want to ack it before looking at the 
rest of the series. I'll have a look sometime early next week.

On 2/17/22 10:16, Ihar Hrachyshka wrote:
> This version of the series is complete. It contains the previously
> missing ddlog implementation; also RARP that is used for additional
> chassis activation is now reinjected into the pipeline after blocking
> flows are deleted by vswitchd.
> 
> Ihar Hrachyshka (16):
>    tests: log more info on OVN_CHECK_PACKETS* failure
>    tests: don't bail from OVN_CHECK_PACKETS_CONTAIN prematurily
>    Introduce chassis_is_vtep
>    northd: introduce separate function to look up chassis
>    northd: separate code for nb->sb port binding chassis update
>    Pass chassis and encap into get_port_binding_tun
>    Introduce match_outport_dp_and_port_keys in physical.c
>    Split code to set zone info into put_zones_ofpacts
>    Use get_port_binding_tun instead of chassis_tunnel_find
>    Tag all packets that arrived from a tunnel as LOCAL_ONLY
>    Update port-up on main chassis only
>    Introduce LSP:options:requested-additional-chassis
>    Clone packets to both port chassis
>    Enforce tunneling when additional-chassis is set
>    Implement RARP activation strategy for ports
>    Reinject RARP packet when activation-strategy=rarp
> 
> ---
> 
> v0: initial draft (single patch)
> v1: split into pieces
> v1: renamed options: migration-destination ->
>                       requested-additional-chassis,
>                       migration-unblocked ->
>                       additional-chassis-activated
> v1: introduced options:activation-strategy=rarp to allow for other
>      strategies / having default no-op strategy
> v1: implement in-memory port-activated tracking to avoid races
> v1: numerous code cleanup / bug fixes
> v1: special handling for localnet attached switches
> v2: added ddlog implementation
> v2: re-inject RARP packet after vswitch updates flows
> v3: re-sent as a single series
> 
> ---
> 
>   controller/binding.c        | 188 +++++--
>   controller/binding.h        |   2 +
>   controller/if-status.c      |  15 +-
>   controller/if-status.h      |   1 +
>   controller/lport.c          |  19 +-
>   controller/ovn-controller.c |   4 +-
>   controller/physical.c       | 390 ++++++++++----
>   controller/pinctrl.c        | 254 ++++++++-
>   controller/pinctrl.h        |   2 +
>   include/ovn/actions.h       |   9 +
>   lib/actions.c               |  40 +-
>   northd/northd.c             | 130 +++--
>   northd/ovn-northd.c         |   7 +-
>   northd/ovn-sb.dlopts        |   2 +
>   northd/ovn_northd.dl        | 169 +++++-
>   ovn-nb.xml                  |  18 +
>   ovn-sb.ovsschema            |  17 +-
>   ovn-sb.xml                  |  73 ++-
>   tests/ovn.at                | 991 +++++++++++++++++++++++++++++++++++-
>   utilities/ovn-trace.c       |   3 +
>   20 files changed, 2110 insertions(+), 224 deletions(-)
>
Mark Michelson Feb. 21, 2022, 11:47 p.m. UTC | #2
Hi Ihar,

I now have read the entire patch series, and I wonder how you would feel 
about an alternate approach. Currently, what you've implemented is a 
supplement to options:requested-chassis, but what if you went for a 
replacement instead?

What if you added a new column (or new "options" key) for logical switch 
ports that took a list of chassis. The list would be ordered by 
preference, meaning that the first chassis would act as the requested 
chassis, and any after would act as a standby [1].

If this method is used for indicating the preferred chassis, we honor 
it. And if this column is empty, then we can fall back to the "classic" 
behavior of using options:requested-chassis.

The advantages of this method are:

1) Keeping the logic separate from options:requested-chassis means that 
it will (hopefully) be simpler to implement, and less likely to cause 
issues since most of the existing requested-chassis logic will remain 
untouched.

2) Using a list allows for an arbitrary number of chassis to be 
specified. For a simple migration, this might not be necessary. But 
allowing for a list allows for other uses of this particular column.

3) Since activation strategy is implemented as a separate option, it 
means that migration doesn't have to be the only use for this column. 
For instance, we could implement an activation strategy such that if BFD 
probes fail on the main chassis, then we could activate the workload on 
the next chassis in the list. That way it could be used for failover in 
addition to migrations.

What do you think?
Mark Michelson

[1] Alternately, you could assign a specific priority value to each 
chassis instead of using the order of the items in the list. This is 
similar to what is done for other options, like gateway_chassis.

On 2/17/22 10:16, Ihar Hrachyshka wrote:
> This version of the series is complete. It contains the previously
> missing ddlog implementation; also RARP that is used for additional
> chassis activation is now reinjected into the pipeline after blocking
> flows are deleted by vswitchd.
> 
> Ihar Hrachyshka (16):
>    tests: log more info on OVN_CHECK_PACKETS* failure
>    tests: don't bail from OVN_CHECK_PACKETS_CONTAIN prematurily
>    Introduce chassis_is_vtep
>    northd: introduce separate function to look up chassis
>    northd: separate code for nb->sb port binding chassis update
>    Pass chassis and encap into get_port_binding_tun
>    Introduce match_outport_dp_and_port_keys in physical.c
>    Split code to set zone info into put_zones_ofpacts
>    Use get_port_binding_tun instead of chassis_tunnel_find
>    Tag all packets that arrived from a tunnel as LOCAL_ONLY
>    Update port-up on main chassis only
>    Introduce LSP:options:requested-additional-chassis
>    Clone packets to both port chassis
>    Enforce tunneling when additional-chassis is set
>    Implement RARP activation strategy for ports
>    Reinject RARP packet when activation-strategy=rarp
> 
> ---
> 
> v0: initial draft (single patch)
> v1: split into pieces
> v1: renamed options: migration-destination ->
>                       requested-additional-chassis,
>                       migration-unblocked ->
>                       additional-chassis-activated
> v1: introduced options:activation-strategy=rarp to allow for other
>      strategies / having default no-op strategy
> v1: implement in-memory port-activated tracking to avoid races
> v1: numerous code cleanup / bug fixes
> v1: special handling for localnet attached switches
> v2: added ddlog implementation
> v2: re-inject RARP packet after vswitch updates flows
> v3: re-sent as a single series
> 
> ---
> 
>   controller/binding.c        | 188 +++++--
>   controller/binding.h        |   2 +
>   controller/if-status.c      |  15 +-
>   controller/if-status.h      |   1 +
>   controller/lport.c          |  19 +-
>   controller/ovn-controller.c |   4 +-
>   controller/physical.c       | 390 ++++++++++----
>   controller/pinctrl.c        | 254 ++++++++-
>   controller/pinctrl.h        |   2 +
>   include/ovn/actions.h       |   9 +
>   lib/actions.c               |  40 +-
>   northd/northd.c             | 130 +++--
>   northd/ovn-northd.c         |   7 +-
>   northd/ovn-sb.dlopts        |   2 +
>   northd/ovn_northd.dl        | 169 +++++-
>   ovn-nb.xml                  |  18 +
>   ovn-sb.ovsschema            |  17 +-
>   ovn-sb.xml                  |  73 ++-
>   tests/ovn.at                | 991 +++++++++++++++++++++++++++++++++++-
>   utilities/ovn-trace.c       |   3 +
>   20 files changed, 2110 insertions(+), 224 deletions(-)
>
Ihar Hrachyshka March 30, 2022, 12:50 a.m. UTC | #3
Hi Mark,

I've redesigned the whole thing to reuse options:requested-chassis 
(which is now comma separated list of chassis); db wise it's also 
backwards compatible (additional chassis are still tracked in a separate 
additional_chassis field, which is a list of chassis ids). The patch 
series is here:

https://patchwork.ozlabs.org/project/ovn/list/?series=292585

Let me know if this covers your concerns.

Ihar

On 2/21/22 6:47 PM, Mark Michelson wrote:
> Hi Ihar,
>
> I now have read the entire patch series, and I wonder how you would 
> feel about an alternate approach. Currently, what you've implemented 
> is a supplement to options:requested-chassis, but what if you went for 
> a replacement instead?
>
> What if you added a new column (or new "options" key) for logical 
> switch ports that took a list of chassis. The list would be ordered by 
> preference, meaning that the first chassis would act as the requested 
> chassis, and any after would act as a standby [1].
>
> If this method is used for indicating the preferred chassis, we honor 
> it. And if this column is empty, then we can fall back to the 
> "classic" behavior of using options:requested-chassis.
>
> The advantages of this method are:
>
> 1) Keeping the logic separate from options:requested-chassis means 
> that it will (hopefully) be simpler to implement, and less likely to 
> cause issues since most of the existing requested-chassis logic will 
> remain untouched.
>
> 2) Using a list allows for an arbitrary number of chassis to be 
> specified. For a simple migration, this might not be necessary. But 
> allowing for a list allows for other uses of this particular column.
>
> 3) Since activation strategy is implemented as a separate option, it 
> means that migration doesn't have to be the only use for this column. 
> For instance, we could implement an activation strategy such that if 
> BFD probes fail on the main chassis, then we could activate the 
> workload on the next chassis in the list. That way it could be used 
> for failover in addition to migrations.
>
> What do you think?
> Mark Michelson
>
> [1] Alternately, you could assign a specific priority value to each 
> chassis instead of using the order of the items in the list. This is 
> similar to what is done for other options, like gateway_chassis.
>
> On 2/17/22 10:16, Ihar Hrachyshka wrote:
>> This version of the series is complete. It contains the previously
>> missing ddlog implementation; also RARP that is used for additional
>> chassis activation is now reinjected into the pipeline after blocking
>> flows are deleted by vswitchd.
>>
>> Ihar Hrachyshka (16):
>>    tests: log more info on OVN_CHECK_PACKETS* failure
>>    tests: don't bail from OVN_CHECK_PACKETS_CONTAIN prematurily
>>    Introduce chassis_is_vtep
>>    northd: introduce separate function to look up chassis
>>    northd: separate code for nb->sb port binding chassis update
>>    Pass chassis and encap into get_port_binding_tun
>>    Introduce match_outport_dp_and_port_keys in physical.c
>>    Split code to set zone info into put_zones_ofpacts
>>    Use get_port_binding_tun instead of chassis_tunnel_find
>>    Tag all packets that arrived from a tunnel as LOCAL_ONLY
>>    Update port-up on main chassis only
>>    Introduce LSP:options:requested-additional-chassis
>>    Clone packets to both port chassis
>>    Enforce tunneling when additional-chassis is set
>>    Implement RARP activation strategy for ports
>>    Reinject RARP packet when activation-strategy=rarp
>>
>> ---
>>
>> v0: initial draft (single patch)
>> v1: split into pieces
>> v1: renamed options: migration-destination ->
>>                       requested-additional-chassis,
>>                       migration-unblocked ->
>>                       additional-chassis-activated
>> v1: introduced options:activation-strategy=rarp to allow for other
>>      strategies / having default no-op strategy
>> v1: implement in-memory port-activated tracking to avoid races
>> v1: numerous code cleanup / bug fixes
>> v1: special handling for localnet attached switches
>> v2: added ddlog implementation
>> v2: re-inject RARP packet after vswitch updates flows
>> v3: re-sent as a single series
>>
>> ---
>>
>>   controller/binding.c        | 188 +++++--
>>   controller/binding.h        |   2 +
>>   controller/if-status.c      |  15 +-
>>   controller/if-status.h      |   1 +
>>   controller/lport.c          |  19 +-
>>   controller/ovn-controller.c |   4 +-
>>   controller/physical.c       | 390 ++++++++++----
>>   controller/pinctrl.c        | 254 ++++++++-
>>   controller/pinctrl.h        |   2 +
>>   include/ovn/actions.h       |   9 +
>>   lib/actions.c               |  40 +-
>>   northd/northd.c             | 130 +++--
>>   northd/ovn-northd.c         |   7 +-
>>   northd/ovn-sb.dlopts        |   2 +
>>   northd/ovn_northd.dl        | 169 +++++-
>>   ovn-nb.xml                  |  18 +
>>   ovn-sb.ovsschema            |  17 +-
>>   ovn-sb.xml                  |  73 ++-
>>   tests/ovn.at                | 991 +++++++++++++++++++++++++++++++++++-
>>   utilities/ovn-trace.c       |   3 +
>>   20 files changed, 2110 insertions(+), 224 deletions(-)
>>
>