
[ovs-dev,ovn,5/6,v2] Support logical switches with multiple localnet ports

Message ID 20200428203635.21822-5-ihrachys@redhat.com
State Superseded, archived
Series [ovs-dev,ovn,1/6] Spin out flow generation into build_dhcpv4_options_flows

Commit Message

Ihar Hrachyshka April 28, 2020, 8:36 p.m. UTC
Assuming that only a single localnet port is actually plugged (mapped to
a local bridge) on each chassis, this allows disjoint networks to be
plugged into the same switch.  This is useful to simplify resource
management for the OpenStack "routed provider networks" feature [1],
where a single "network" (which traditionally maps to a logical switch
in OVN) is composed of multiple L2 segments, with external L3 routing
implemented between the segments.

TODO: consider E-W routing between localnet vlan tagged LSs
      (ovn-chassis-mac-mappings).

[1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>

---

v2: rebase on top of series that refactors code dealing with localnet ports.
v2: tests: send packets both ways, more test scenarios covered.
v2: use x2nrealloc to allocate ->localnet_ports.
v2: use n_localnet_ports counter instead of localnet_ports pointer to detect
    switches with localnet ports.
---
 controller/binding.c   |  16 +++
 controller/patch.c     |  24 +++--
 northd/ovn-northd.c    |  63 ++++++-----
 ovn-architecture.7.xml |  25 +++--
 ovn-nb.xml             |  21 ++--
 ovn-sb.xml             |  21 ++--
 tests/ovn.at           | 237 +++++++++++++++++++++++++++++++++++++++++
 7 files changed, 347 insertions(+), 60 deletions(-)

Comments

Han Zhou May 1, 2020, 7:28 a.m. UTC | #1
On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <ihrachys@redhat.com> wrote:
>
> Assuming that only a single localnet port is actually plugged (mapped to
> a local bridge) on each chassis, this allows disjoint networks to be
> plugged into the same switch.  This is useful to simplify resource
> management for the OpenStack "routed provider networks" feature [1],
> where a single "network" (which traditionally maps to a logical switch
> in OVN) is composed of multiple L2 segments, with external L3 routing
> implemented between the segments.
>
> TODO: consider E-W routing between localnet vlan tagged LSs
>       (ovn-chassis-mac-mappings).
>
> [1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html

After reading the context and related discussions, I am a little concerned
about this approach.  It changes the concept of a logical switch in OVN,
which was an L2 concept.  It may be OK to change the concept if all the
aspects are understood and properly documented and implemented.  However,
at least it is not clear to me how it would work in the physical pipeline
when VIF1 on chassis A sends a packet to VIF2 on chassis B through the
physical network while the two VIFs are on the same LS that has multiple
localnet ports.  Today only one localnet port is allowed, so it is handled
by sending the packet through *the* localnet port (without a localnet port
it goes through a tunnel).  With this patch it seems it would choose the
last localnet port, regardless of the physical location (L2 segment) of
each VIF.  I would expect the data structure change in "local_datapath" to
maintain a list of localnet ports instead of only one, and the logic in
physical.c to pick the desired localnet port according to the L2 segment
the VIF physically belongs to.  (It shouldn't be a mapping between chassis
and physical network/VLAN, because it is natural to have multiple VLANs
connected to a single chassis and VIFs on different VLANs.)

Furthermore, if we change the concept of an LS being a single L2 domain, we
may need to consider more changes in the L2 logical pipelines, such as:
- The L2 lookup stage should only look up destinations for ports that are
in the same segment.
- BUM traffic should only go to the localnet port that belongs to the same
L2 segment of the LS.

All of this suggests that we need to introduce a *segment* abstraction
in OVN, with multiple segments belonging to the same LS.  I wonder whether
it is even more complex to implement this in OVN than in the Neutron
plugin.  Maybe I missed something here and made it more complex than
necessary.  Please correct me if I am wrong :)

Thanks,
Han

>
> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
>
> ---
>
> v2: rebase on top of series that refactors code dealing with localnet ports.
> v2: tests: send packets both ways, more test scenarios covered.
> v2: use x2nrealloc to allocate ->localnet_ports.
> v2: use n_localnet_ports counter instead of localnet_ports pointer to detect
>     switches with localnet ports.
> ---
>  controller/binding.c   |  16 +++
>  controller/patch.c     |  24 +++--
>  northd/ovn-northd.c    |  63 ++++++-----
>  ovn-architecture.7.xml |  25 +++--
>  ovn-nb.xml             |  21 ++--
>  ovn-sb.xml             |  21 ++--
>  tests/ovn.at           | 237 +++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 347 insertions(+), 60 deletions(-)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index 20a89d07d..c88c4ece8 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -692,12 +692,28 @@ add_localnet_egress_interface_mappings(
>      }
>  }
>
> +static bool
> +is_network_plugged(const struct sbrec_port_binding *binding_rec,
> +                   struct shash *bridge_mappings)
> +{
> +    const char *network = smap_get(&binding_rec->options,
"network_name");
> +    if (!network) {
> +        return false;
> +    }
> +    return shash_find_data(bridge_mappings, network);
> +}
> +
>  static void
>  consider_localnet_port(const struct sbrec_port_binding *binding_rec,
>                         struct shash *bridge_mappings,
>                         struct sset *egress_ifaces,
>                         struct hmap *local_datapaths)
>  {
> +    /* Ignore localnet ports for unplugged networks. */
> +    if (!is_network_plugged(binding_rec, bridge_mappings)) {
> +        return;
> +    }
> +
>      add_localnet_egress_interface_mappings(binding_rec,
>              bridge_mappings, egress_ifaces);
>
> diff --git a/controller/patch.c b/controller/patch.c
> index 349faae17..52255cc3a 100644
> --- a/controller/patch.c
> +++ b/controller/patch.c
> @@ -198,9 +198,9 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
>              continue;
>          }
>
> -        const char *patch_port_id;
> +        bool is_localnet = false;
>          if (!strcmp(binding->type, "localnet")) {
> -            patch_port_id = "ovn-localnet-port";
> +            is_localnet = true;
>          } else if (!strcmp(binding->type, "l2gateway")) {
>              if (!binding->chassis
>                  || strcmp(chassis->name, binding->chassis->name)) {
> @@ -208,7 +208,6 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
>                   * so we should not create any patch ports for it. */
>                  continue;
>              }
> -            patch_port_id = "ovn-l2gateway-port";
>          } else {
>              /* not a localnet or L2 gateway port. */
>              continue;
> @@ -224,12 +223,25 @@ add_bridge_mappings(struct ovsdb_idl_txn
*ovs_idl_txn,
>          struct ovsrec_bridge *br_ln = shash_find_data(&bridge_mappings,
network);
>          if (!br_ln) {
>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
1);
> -            VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> -                    "with network name '%s'",
> -                    binding->type, binding->logical_port, network);
> +            if (!is_localnet) {
> +                VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> +                        "with network name '%s'",
> +                        binding->type, binding->logical_port, network);
> +            } else {
> +                VLOG_INFO_RL(&rl, "bridge not found for localnet port
'%s' "
> +                        "with network name '%s'; skipping",
> +                        binding->logical_port, network);
> +            }
>              continue;
>          }
>
> +        const char *patch_port_id;
> +        if (is_localnet) {
> +            patch_port_id = "ovn-localnet-port";
> +        } else {
> +            patch_port_id = "ovn-l2gateway-port";
> +        }
> +
>          char *name1 = patch_port_name(br_int->name,
binding->logical_port);
>          char *name2 = patch_port_name(binding->logical_port,
br_int->name);
>          create_patch_port(ovs_idl_txn, patch_port_id,
binding->logical_port,
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 048c28ca6..91ab4449e 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -543,7 +543,9 @@ struct ovn_datapath {
>      /* The "derived" OVN port representing the instance of l3dgw_port on
>       * the "redirect-chassis". */
>      struct ovn_port *l3redirect_port;
> -    struct ovn_port *localnet_port;
> +
> +    struct ovn_port **localnet_ports;
> +    size_t n_localnet_ports;
>
>      struct ovs_list lr_list; /* In list of logical router datapaths. */
>      /* The logical router group to which this datapath belongs.
> @@ -611,6 +613,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct
ovn_datapath *od)
>          ovn_destroy_tnlids(&od->port_tnlids);
>          bitmap_free(od->ipam_info.allocated_ipv4s);
>          free(od->router_ports);
> +        free(od->localnet_ports);
>          ovn_ls_port_group_destroy(&od->nb_pgs);
>          destroy_mcast_info_for_datapath(od);
>
> @@ -2019,6 +2022,7 @@ join_logical_ports(struct northd_context *ctx,
>      struct ovn_datapath *od;
>      HMAP_FOR_EACH (od, key_node, datapaths) {
>          if (od->nbs) {
> +            size_t allocated_localnet_ports = 0;
>              for (size_t i = 0; i < od->nbs->n_ports; i++) {
>                  const struct nbrec_logical_switch_port *nbsp
>                      = od->nbs->ports[i];
> @@ -2053,7 +2057,12 @@ join_logical_ports(struct northd_context *ctx,
>                  }
>
>                  if (!strcmp(nbsp->type, "localnet")) {
> -                   od->localnet_port = op;
> +                   if (od->n_localnet_ports >= allocated_localnet_ports)
{
> +                       od->localnet_ports = x2nrealloc(
> +                           od->localnet_ports, &allocated_localnet_ports,
> +                           sizeof *od->localnet_ports);
> +                   }
> +                   od->localnet_ports[od->n_localnet_ports++] = op;
>                  }
>
>                  op->lsp_addrs
> @@ -3012,7 +3021,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
>                                "reside-on-redirect-chassis", false) ||
>                  op->peer == op->peer->od->l3dgw_port)) {
>                  add_router_port_garp = true;
> -            } else if (chassis && op->od->localnet_port) {
> +            } else if (chassis && op->od->n_localnet_ports) {
>                  add_router_port_garp = true;
>              }
>
> @@ -4705,9 +4714,10 @@ build_pre_acls(struct ovn_datapath *od, struct
hmap *lflows)
>              struct ovn_port *op = od->router_ports[i];
>              build_pre_acl_flows_for_nbsp(od, lflows, op->nbsp,
op->json_key);
>          }
> -        if (od->localnet_port) {
> -            build_pre_acl_flows_for_nbsp(od, lflows,
od->localnet_port->nbsp,
> -                                         od->localnet_port->json_key);
> +        for (size_t i = 0; i < od->n_localnet_ports; i++) {
> +            build_pre_acl_flows_for_nbsp(od, lflows,
> +                                         od->localnet_ports[i]->nbsp,
> +
od->localnet_ports[i]->json_key);
>          }
>
>          /* Ingress and Egress Pre-ACL Table (Priority 110).
> @@ -5975,9 +5985,9 @@ build_lswitch_rport_arp_req_flow_for_ip(struct sset
*ips,
>      /* Send a the packet only to the router pipeline and skip flooding it
>       * in the broadcast domain (except for the localnet port).
>       */
> -    if (od->localnet_port) {
> +    for (size_t i = 0; i < od->n_localnet_ports; i++) {
>          ds_put_format(&actions, "clone { outport = %s; output; }; ",
> -                      od->localnet_port->json_key);
> +                      od->localnet_ports[i]->json_key);
>      }
>      ds_put_format(&actions, "outport = %s; output;", patch_op->json_key);
>      ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP, priority,
> @@ -6570,25 +6580,29 @@ build_lswitch_flows(struct hmap *datapaths,
struct hmap *ports,
>          }
>
>          bool is_external = lsp_is_external(op->nbsp);
> -        if (is_external && (!op->od->localnet_port ||
> +        if (is_external && (!op->od->n_localnet_ports ||
>                              !op->nbsp->ha_chassis_group)) {
> -            /* If it's an external port and there is no localnet port
> +            /* If it's an external port and there are no localnet ports
>               * and if it doesn't belong to an HA chassis group ignore
it. */
>              continue;
>          }
>
>          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> -            const char *json_key;
>              if (is_external) {
> -                json_key = op->od->localnet_port->json_key;
> +                for (size_t j = 0; j < op->od->n_localnet_ports; j++) {
> +                    build_dhcpv4_options_flows(
> +                        op, lflows, &op->lsp_addrs[i],
> +                        op->od->localnet_ports[j]->json_key,
is_external);
> +                    build_dhcpv6_options_flows(
> +                        op, lflows, &op->lsp_addrs[i],
> +                        op->od->localnet_ports[j]->json_key,
is_external);
> +                }
>              } else {
> -                json_key = op->json_key;
> +                build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i],
> +                                           op->json_key, is_external);
> +                build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i],
> +                                           op->json_key, is_external);
>              }
> -            build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i],
json_key,
> -                                       is_external);
> -
> -            build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i],
json_key,
> -                                       is_external);
>          }
>      }
>
> @@ -6644,8 +6658,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
hmap *ports,
>      }
>
>      HMAP_FOR_EACH (op, key_node, ports) {
> -        if (!op->nbsp || !lsp_is_external(op->nbsp) ||
> -            !op->od->localnet_port) {
> +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>             continue;
>          }
>
> @@ -6653,8 +6666,10 @@ build_lswitch_flows(struct hmap *datapaths, struct
hmap *ports,
>           * external ports  on chassis not binding those ports.
>           * This makes the router pipeline to be run only on the chassis
>           * binding the external ports. */
> -        build_drop_arp_nd_flows_for_unbound_router_ports(
> -            lflows, op, op->od->localnet_port);
> +        for (size_t i = 0; i < op->od->n_localnet_ports; i++) {
> +            build_drop_arp_nd_flows_for_unbound_router_ports(
> +                lflows, op, op->od->localnet_ports[i]);
> +        }
>      }
>
>      char *svc_check_match = xasprintf("eth.dst == %s", svc_monitor_mac);
> @@ -6872,7 +6887,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
hmap *ports,
>                                ETH_ADDR_ARGS(mac));
>                  if (op->peer->od->l3dgw_port
>                      && op->peer->od->l3redirect_port
> -                    && op->od->localnet_port) {
> +                    && op->od->n_localnet_ports) {
>                      bool add_chassis_resident_check = false;
>                      if (op->peer == op->peer->od->l3dgw_port) {
>                          /* The peer of this port represents a distributed
> @@ -8178,7 +8193,7 @@ build_lrouter_flows(struct hmap *datapaths, struct
hmap *ports,
>                            op->lrp_networks.ipv4_addrs[i].addr_s);
>
>              if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
> -                && op->peer->od->localnet_port) {
> +                && op->peer->od->n_localnet_ports) {
>                  bool add_chassis_resident_check = false;
>                  if (op == op->od->l3dgw_port) {
>                      /* Traffic with eth.src =
l3dgw_port->lrp_networks.ea_s
> diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
> index 533ae716d..88edb6f32 100644
> --- a/ovn-architecture.7.xml
> +++ b/ovn-architecture.7.xml
> @@ -441,9 +441,8 @@
>
>    <p>
>      A <code>localnet</code> logical switch port bridges a logical switch to a
> -    physical VLAN.  Any given logical switch should have no more than one
> -    <code>localnet</code> port.  Such a logical switch is used in two
> -    scenarios:
> +    physical VLAN.  A logical switch may have one or more <code>localnet</code>
> +    ports.  Such a logical switch is used in two scenarios:

It is worth describing the new scenario, where multiple localnet ports are
needed, in more detail here.

>    </p>
>
>    <ul>
> @@ -1895,13 +1894,13 @@
>    <ol>
>      <li>
>        The packet first enters the ingress pipeline, and then egress
pipeline of
> -      the source localnet logical switch datapath and is sent out via the
> +      the source localnet logical switch datapath and is sent out via a
>        localnet port of the source localnet logical switch (instead of
sending
>        it to router pipeline).
>      </li>
>
>      <li>
> -      The gateway chassis receives the packet via the localnet port of
the
> +      The gateway chassis receives the packet via a localnet port of the
>        source localnet logical switch and sends it to the integration
bridge.
>        The packet then enters the ingress pipeline, and then egress
pipeline of
>        the source localnet logical switch datapath and enters the ingress
> @@ -1916,11 +1915,11 @@
>        From the router datapath, packet enters the ingress pipeline and
then
>        egress pipeline of the destination localnet logical switch
datapath.
>        It then goes out of the integration bridge to the provider bridge (
> -      belonging to the destination logical switch) via the localnet port.
> +      belonging to the destination logical switch) via a localnet port.
>      </li>
>
>      <li>
> -      The destination chassis receives the packet via the localnet port
and
> +      The destination chassis receives the packet via a localnet port and
>        sends it to the integration bridge. The packet enters the
>        ingress pipeline and then egress pipeline of the destination
localnet
>        logical switch and finally delivered to the destination VM port.
> @@ -1935,13 +1934,13 @@
>    <ol>
>      <li>
>        The packet first enters the ingress pipeline, and then egress
pipeline of
> -      the source localnet logical switch datapath and is sent out via the
> +      the source localnet logical switch datapath and is sent out via a
>        localnet port of the source localnet logical switch (instead of
sending
>        it to router pipeline).
>      </li>
>
>      <li>
> -      The gateway chassis receives the packet via the localnet port of
the
> +      The gateway chassis receives the packet via a localnet port of the
>        source localnet logical switch and sends it to the integration
bridge.
>        The packet then enters the ingress pipeline, and then egress
pipeline of
>        the source localnet logical switch datapath and enters the ingress
> @@ -1957,7 +1956,7 @@
>        egress pipeline of the localnet logical switch datapath which
provides
>        external connectivity. It then goes out of the integration bridge
to the
>        provider bridge (belonging to the logical switch which provides
external
> -      connectivity) via the localnet port.
> +      connectivity) via a localnet port.
>      </li>
>    </ol>
>
> @@ -1967,7 +1966,7 @@
>
>    <ol>
>      <li>
> -      The gateway chassis receives the packet from the localnet port of
> +      The gateway chassis receives the packet from a localnet port of
>        the logical switch which provides external connectivity. The
packet then
>        enters the ingress pipeline and then egress pipeline of the
localnet
>        logical switch (which provides external connectivity). The packet
then
> @@ -1978,12 +1977,12 @@
>        The ingress pipeline of the logical router datapath applies the
unNATting
>        rules. The packet then enters the ingress pipeline and then egress
>        pipeline of the source localnet logical switch. Since the source VM
> -      doesn't reside in the gateway chassis, the packet is sent out via
the
> +      doesn't reside in the gateway chassis, the packet is sent out via a
>        localnet port of the source logical switch.
>      </li>
>
>      <li>
> -      The source chassis receives the packet via the localnet port and
> +      The source chassis receives the packet via a localnet port and
>        sends it to the integration bridge. The packet enters the
>        ingress pipeline and then egress pipeline of the source localnet
>        logical switch and finally gets delivered to the source VM port.
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index af15c550a..7e9cd5cc6 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -244,14 +244,14 @@
>      <p>
>        There are two kinds of logical switches, that is, ones that fully
>        virtualize the network (overlay logical switches) and ones that
provide
> -      simple connectivity to a physical network (bridged logical
switches).
> +      simple connectivity to physical networks (bridged logical
switches).
>        They work in the same way when providing connectivity between
logical
> -      ports on same chasis, but differently when connecting remote
logical
> +      ports on same chassis, but differently when connecting remote
logical
>        ports.  Overlay logical switches connect remote logical ports by
tunnels,
>        while bridged logical switches provide connectivity to remote
ports by
> -      bridging the packets to directly connected physical L2 segment
with the
> +      bridging the packets to directly connected physical L2 segments
with the
>        help of <code>localnet</code> ports.  Each bridged logical switch
has
> -      one and only one <code>localnet</code> port, which has only one
special
> +      one or more <code>localnet</code> ports, which have only one
special
>        address <code>unknown</code>.
>      </p>
>
> @@ -527,10 +527,13 @@
>
>            <dt><code>localnet</code></dt>
>            <dd>
> -            A connection to a locally accessible network from each
> -            <code>ovn-controller</code> instance.  A logical switch can only
> -            have a single <code>localnet</code> port attached.  This is used
> -            to model direct connectivity to an existing network.
> +            A connection to a locally accessible network from
> +            <code>ovn-controller</code> instances that have corresponding
> +            bridge mapping.  A logical switch can have multiple
> +            <code>localnet</code> ports attached, as long as each
> +            <code>ovn-controller</code> is plugged to a single local network
> +            only.  In this case, each hypervisor implements part of switch
> +            external network connectivity.

Today, there is no restriction preventing one ovn-controller/chassis from
being plugged into multiple local networks, as long as each local network
maps to a separate logical switch.  The typical use case is VIFs on
different VLAN networks residing on the same chassis.  The constraint added
here, "each ovn-controller is plugged to a single local network only", seems
confusing.
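
For example, something along these lines (purely illustrative names) already
works today, with two VLAN networks plugged on the same chassis and one
logical switch per network:

  # on the chassis: one bridge-mapping entry per local network
  ovs-vsctl set open . \
      external-ids:ovn-bridge-mappings=physnet-10:br-phys,physnet-20:br-phys

  # in the northbound DB: one logical switch with one localnet port per VLAN
  for tag in 10 20; do
      ovn-nbctl ls-add vlan-$tag
      ovn-nbctl lsp-add vlan-$tag ln-$tag "" $tag
      ovn-nbctl lsp-set-type ln-$tag localnet
      ovn-nbctl lsp-set-addresses ln-$tag unknown
      ovn-nbctl lsp-set-options ln-$tag network_name=physnet-$tag
  done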

>            </dd>
>
>            <dt><code>localport</code></dt>
> @@ -721,7 +724,7 @@
>            Required.  The name of the network to which the
<code>localnet</code>
>            port is connected.  Each hypervisor, via
<code>ovn-controller</code>,
>            uses its local configuration to determine exactly how to
connect to
> -          this locally accessible network.
> +          this locally accessible network, if at all.
>          </column>
>        </group>
>
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index 3aa7cd4da..7e05aa0e1 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -2626,10 +2626,13 @@ tcp.flags = RST;
>
>            <dt><code>localnet</code></dt>
>            <dd>
> -            A connection to a locally accessible network from each
> -            <code>ovn-controller</code> instance.  A logical switch can
only
> -            have a single <code>localnet</code> port attached.  This is
used
> -            to model direct connectivity to an existing network.
> +            A connection to a locally accessible network from some or all
> +            <code>ovn-controller</code> instances.  This is used
> +            to model direct connectivity to existing networks.  A logical
> +            switch can have multiple <code>localnet</code> ports
attached, as
> +            long as each <code>ovn-controller</code> is plugged to a
single
> +            local network only.  In this case, each hypervisor
implements part
> +            of switch external network connectivity.
>            </dd>
>
>            <dt><code>localport</code></dt>
> @@ -2774,10 +2777,12 @@ tcp.flags = RST;
>          <p>
>            When a logical switch has a <code>localnet</code> port
attached,
>            every chassis that may have a local vif attached to that
logical
> -          switch must have a bridge mapping configured to reach that
> -          <code>localnet</code>.  Traffic that arrives on a
> -          <code>localnet</code> port is never forwarded over a tunnel to
> -          another chassis.
> +          switch that needs this external connectivity must have a bridge
> +          mapping configured to reach that <code>localnet</code>.  If the
> +          mapping is missing, the vif won't be plugged to this network.
It may
> +          still reach the other network if routing is implemented by
fabric.
> +          Traffic that arrives on a <code>localnet</code> port is never
> +          forwarded over a tunnel to another chassis.
>          </p>
>        </column>
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index e6febd4c2..173b8d2eb 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -2475,6 +2475,243 @@ OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
>
> +AT_SETUP([ovn -- 2 HVs, 2 LS, switching between multiple localnet ports
with same tags])
> +ovn_start
> +
> +# In this test case we create two switches with multiple localnet ports.
Only a
> +# single localnet of the same tag is connected to fabric for each
switch. Two
> +# hypervisors have VIFs that belong to these switches. The test
validates that
> +# routing between these switches and hypervisors still works regardless
of the
> +# number of (unplugged) localnet ports.
> +
> +# two switches, each connected to lots of networks
> +for i in 1 2; do
> +    ovn-nbctl ls-add ls-$i
> +    for tag in `seq 10 20`; do
> +        ln_port_name=ln-$i-$tag
> +        ovn-nbctl lsp-add ls-$i $ln_port_name "" $tag
> +        ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +        ovn-nbctl lsp-set-type $ln_port_name localnet
> +        ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> +    done
> +done
> +
> +# multiple networks
> +for tag in `seq 10 20`; do
> +    net_add n-$tag
> +done
> +
> +# two hypervisors, each connected to the same network
> +for i in 1 2; do
> +    sim_add hv-$i
> +    as hv-$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open .
external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> +    ovn_attach n-10 br-phys 192.168.0.$i
> +done
> +
> +# two vif ports, one per switch
> +for i in 1 2; do
> +    as hv-$i
> +    ovs-vsctl add-port br-int vif-$i -- \
> +        set Interface vif-$i external-ids:iface-id=lp-$i \
> +                              options:tx_pcap=hv-$i/vif-$i-tx.pcap \
> +                              options:rxq_pcap=hv-$i/vif-$i-rx.pcap \
> +                              ofport-request=$i
> +
> +    lsp_name=lp-$i
> +    ovn-nbctl lsp-add ls-$i $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:00:0$i
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:0$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +done
> +
> +ovn-nbctl --wait=sb sync
> +ovn-nbctl show
> +ovn-sbctl dump-flows
> +
> +# vif ports
> +for i in 1 2; do
> +    : > vif-$i.expected
> +done
> +
> +# localnet ports
> +for i in 1 2; do
> +    for tag in `seq 10 20`; do
> +        : > $i-$tag.expected
> +    done
> +done
> +
> +test_packet() {
> +    local inport=$1 outport=$2 dst=$3 src=$4 eth=$5 eout=$6 lout=$7
> +
> +    # Expect the packet cloned to all localnet ports
> +    : > expout
> +    for tag in `seq 10 20`; do
> +        echo "output(\"ln-$inport-$tag\");" >> expout
> +    done
> +
> +    # First try tracing the packet.
> +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src &&
eth.type==0x$eth"
> +    AT_CAPTURE_FILE([trace])
> +    AT_CHECK([ovn-trace --all ls-$inport "$uflow" | tee trace | sed
'1,/Minimal trace/d'], [0], [expout])
> +
> +    # Then actually send a packet, for an end-to-end test.
> +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> +    as hv-$1 ovs-appctl netdev-dummy/receive vif-$inport $packet
> +
> +    # Expect the packet received by the peer VIF port
> +    echo $packet >> vif-$outport.expected
> +
> +    # Expect the packet to transfer through the common fabric network
> +    local packet=$(echo $dst$src | sed 's/://g')81000014${eth}
> +    echo $packet >> $1-10.expected
> +}
> +
> +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1001 ln-1-10 ln-1-10
> +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1002 ln-1-10 ln-1-10
> +
> +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1003 ln-2-10 ln-2-10
> +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1004 ln-2-10 ln-2-10
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +for i in 1 2; do
> +    hv=hv-$i
> +    echo "------ $hv dump ------"
> +    as $hv ovs-vsctl show
> +    as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> +done
> +
> +# Now check the packets actually received against the ones expected.
> +for i in 1 2; do
> +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/vif-$i-tx.pcap],
[vif-$i.expected])
> +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/br-phys_n-10-tx.pcap],
[$i-10.expected])
> +done
> +
> +OVN_CLEANUP([hv-1],[hv-2])
> +
> +AT_CLEANUP
> +
> +AT_SETUP([ovn -- 2 HVs, 1 LS, no switching between multiple localnet
ports with different tags])
> +ovn_start
> +
> +# In this test case we create a single switch connected to two physical
> +# networks via two localnet ports. Then we create two hypervisors, with 2
> +# ports on each. The test validates no interconnectivity between VIF
ports
> +# located on chassis plugged to different physical networks.
> +
> +# create the single switch with two locanet ports
> +ovn-nbctl ls-add ls1
> +for tag in 10 20; do
> +    ln_port_name=ln-$tag
> +    ovn-nbctl lsp-add ls1 $ln_port_name "" $tag
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> +done
> +
> +# create fabric networks
> +for tag in 10 20; do
> +    net_add n-$tag
> +done
> +
> +# create four chassis, each connected to one network, each with a single
VIF port
> +for tag in 10 20; do
> +    for i in 1 2; do
> +        sim_add hv-$tag-$i
> +        as hv-$tag-$i
> +        ovs-vsctl add-br br-phys
> +        ovs-vsctl set open .
external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> +        ovn_attach n-$tag br-phys 192.168.$i.$tag
> +
> +        ovs-vsctl add-port br-int vif-$tag-$i -- \
> +            set Interface vif-$tag-$i external-ids:iface-id=lp-$tag-$i \
> +
 options:tx_pcap=hv-$tag-$i/vif-$tag-$i-tx.pcap \
> +
 options:rxq_pcap=hv-$tag-$i/vif-$tag-$i-rx.pcap \
> +                                  ofport-request=$tag$i
> +
> +        lsp_name=lp-$tag-$i
> +        ovn-nbctl lsp-add ls1 $lsp_name
> +        ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:0$i:$tag
> +        ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:0$i:$tag
> +
> +        OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +    done
> +done
> +ovn-nbctl --wait=sb sync
> +ovn-sbctl dump-flows
> +
> +for tag in 10 20; do
> +    for i in 1 2; do
> +        : > $tag-$i.expected
> +    done
> +done
> +
> +vif_to_hv() {
> +    echo hv-$1
> +}
> +
> +test_packet() {
> +    local inport=$1 dst=$2 src=$3 eth=$4 eout=$5 lout=$6
> +
> +    # First try tracing the packet.
> +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src &&
eth.type==0x$eth"
> +    echo "output(\"$lout\");" > expout
> +    AT_CAPTURE_FILE([trace])
> +    AT_CHECK([ovn-trace --all ls1 "$uflow" | tee trace | sed '1,/Minimal
trace/d'], [0], [expout])
> +
> +    # Then actually send a packet, for an end-to-end test.
> +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> +    hv=`vif_to_hv $inport`
> +    vif=vif-$inport
> +    as $hv ovs-appctl netdev-dummy/receive $vif $packet
> +    if test $eth = 1002 -o $eth = 2002; then
> +        echo $packet >> ${eout#lp-}.expected
> +    fi
> +}
> +
> +# different fabric networks -> should fail
> +test_packet 10-1 f0:00:00:00:01:20 f0:00:00:00:01:10 1001 lp-20-1 lp-20-1
> +test_packet 20-1 f0:00:00:00:01:10 f0:00:00:00:01:20 2001 lp-10-1 lp-10-1
> +
> +# same fabric networks -> should pass
> +test_packet 10-1 f0:00:00:00:02:10 f0:00:00:00:01:10 1002 lp-10-2 lp-10-2
> +test_packet 20-1 f0:00:00:00:02:20 f0:00:00:00:01:20 2002 lp-20-2 lp-20-2
> +test_packet 10-2 f0:00:00:00:01:10 f0:00:00:00:02:10 1002 lp-10-1 lp-10-1
> +test_packet 20-2 f0:00:00:00:01:20 f0:00:00:00:02:20 2002 lp-20-1 lp-20-1
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +for tag in 10 20; do
> +    for i in 1 2; do
> +        hv=hv-$tag-$i
> +        echo "------ $hv dump ------"
> +        as $hv ovs-vsctl show
> +        as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> +    done
> +done
> +
> +# Now check the packets actually received against the ones expected.
> +for tag in 10 20; do
> +    for i in 1 2; do
> +        echo "hv = $tag-$i"
> +
 OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$tag-$i/vif-$tag-$i-tx.pcap],
[$tag-$i.expected])
> +    done
> +done
> +
> +OVN_CLEANUP([hv-10-1],[hv-10-2],[hv-20-1],[hv-20-2])
> +
> +AT_CLEANUP
> +
>  AT_SETUP([ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS])
>  AT_KEYWORDS([vtep])
>  ovn_start
> --
> 2.26.2
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ihar Hrachyshka May 4, 2020, 4:52 p.m. UTC | #2
On Fri, May 1, 2020 at 3:28 AM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <ihrachys@redhat.com> wrote:
> >
> > Assuming that only a single localnet port is actually plugged (mapped to
> > a local bridge) on each chassis, this allows disjoint networks to be
> > plugged into the same switch.  This is useful to simplify resource
> > management for the OpenStack "routed provider networks" feature [1],
> > where a single "network" (which traditionally maps to a logical switch
> > in OVN) is composed of multiple L2 segments, with external L3 routing
> > implemented between the segments.
> >
> > TODO: consider E-W routing between localnet vlan tagged LSs
> >       (ovn-chassis-mac-mappings).
> >
> > [1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
>
> After reading the context and related discussions, I am a little concerned about this approach. It changes the concept of a logical switch in OVN, which was an L2 concept. It may be OK to change the concept if all the aspects are understood and properly documented and implemented. However, at least it is not clear to me how it would work in the physical pipeline when VIF1 on chassis A sends a packet to VIF2 on chassis B through the physical network while the two VIFs are on the same LS that has multiple localnet ports. Today only one localnet port is allowed, so it is handled by sending the packet through *the* localnet port (without a localnet port it goes through a tunnel). With this patch it seems it would choose the last localnet port, regardless of the physical location (L2 segment) of each VIF. I would expect the data structure change in "local_datapath" to maintain a list of localnet ports instead of only one, and the logic in physical.c to pick the desired localnet port according to the L2 segment the VIF physically belongs to. (It shouldn't be a mapping between chassis and physical network/VLAN, because it is natural to have multiple VLANs connected to a single chassis and VIFs on different VLANs.)

The assumption in this patch is that, while the north database may have
multiple localnet ports per switch, only one segment can actually be
plugged per chassis.  That reduces the problem of multiple localnet
ports and selecting the right one to the current situation, where
there's a single localnet port per switch on each chassis.  (The
assumption is implicit right now, but we can also enforce it if
needed.)  Perhaps in the future we should consider expanding the
feature to allow multiple localnet ports plugged to the fabric per
chassis, but that's not the goal here.
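
To illustrate (names here are just examples), the intended layout looks
roughly like the second test case added in this patch: a single logical
switch with one localnet port per segment, while each chassis maps only the
segment it is physically wired to:

  # one switch spanning two L2 segments (VLAN 10 and VLAN 20)
  ovn-nbctl ls-add net1
  for tag in 10 20; do
      ovn-nbctl lsp-add net1 ln-$tag "" $tag
      ovn-nbctl lsp-set-type ln-$tag localnet
      ovn-nbctl lsp-set-addresses ln-$tag unknown
      ovn-nbctl lsp-set-options ln-$tag network_name=phys-$tag
  done

  # a chassis attached to segment 10 maps only phys-10 (similarly for 20)
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-10:br-phys

With this series, ovn-controller simply ignores localnet ports whose
network_name has no local bridge mapping (the binding.c and patch.c hunks),
so each chassis still ends up with a single plugged localnet port per
switch, as before.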

>
> Furthermore, if we change the concept of an LS being a single L2 domain, we may need to consider more changes in the L2 logical pipelines, such as:
> - The L2 lookup stage should only look up destinations for ports that are in the same segment.
> - BUM traffic should only go to the localnet port that belongs to the same L2 segment of the LS.
>
> All of this suggests that we need to introduce a *segment* abstraction in OVN, with multiple segments belonging to the same LS. I wonder whether it is even more complex to implement this in OVN than in the Neutron plugin. Maybe I missed something here and made it more complex than necessary. Please correct me if I am wrong :)

Indeed, if OVN modeled the OpenStack multi-segment network concept
natively, 1) it would be easier for OpenStack and 2) it would just
shift complexity from OpenStack to OVN.  (I don't think it would be
MORE complex than implementing it in the Neutron plugin, but about the
same.)  This patch series is an attempt to facilitate the Neutron
implementation of routed provider networks (multi-segment networks
with L3-adjacent L2 domains) with as little blood spilled on both
sides as possible.

> Thanks,
> Han
>
> >
> > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> >
> > ---
> >
> > v2: rebase on top of series that refactors code dealing with localnet ports.
> > v2: tests: send packets both ways, more test scenarios covered.
> > v2: use x2nrealloc to allocate ->localnet_ports.
> > v2: use n_localnet_ports counter instead of localnet_ports pointer to detect
> >     switches with localnet ports.
> > ---
> >  controller/binding.c   |  16 +++
> >  controller/patch.c     |  24 +++--
> >  northd/ovn-northd.c    |  63 ++++++-----
> >  ovn-architecture.7.xml |  25 +++--
> >  ovn-nb.xml             |  21 ++--
> >  ovn-sb.xml             |  21 ++--
> >  tests/ovn.at           | 237 +++++++++++++++++++++++++++++++++++++++++
> >  7 files changed, 347 insertions(+), 60 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 20a89d07d..c88c4ece8 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -692,12 +692,28 @@ add_localnet_egress_interface_mappings(
> >      }
> >  }
> >
> > +static bool
> > +is_network_plugged(const struct sbrec_port_binding *binding_rec,
> > +                   struct shash *bridge_mappings)
> > +{
> > +    const char *network = smap_get(&binding_rec->options, "network_name");
> > +    if (!network) {
> > +        return false;
> > +    }
> > +    return shash_find_data(bridge_mappings, network);
> > +}
> > +
> >  static void
> >  consider_localnet_port(const struct sbrec_port_binding *binding_rec,
> >                         struct shash *bridge_mappings,
> >                         struct sset *egress_ifaces,
> >                         struct hmap *local_datapaths)
> >  {
> > +    /* Ignore localnet ports for unplugged networks. */
> > +    if (!is_network_plugged(binding_rec, bridge_mappings)) {
> > +        return;
> > +    }
> > +
> >      add_localnet_egress_interface_mappings(binding_rec,
> >              bridge_mappings, egress_ifaces);
> >
> > diff --git a/controller/patch.c b/controller/patch.c
> > index 349faae17..52255cc3a 100644
> > --- a/controller/patch.c
> > +++ b/controller/patch.c
> > @@ -198,9 +198,9 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >              continue;
> >          }
> >
> > -        const char *patch_port_id;
> > +        bool is_localnet = false;
> >          if (!strcmp(binding->type, "localnet")) {
> > -            patch_port_id = "ovn-localnet-port";
> > +            is_localnet = true;
> >          } else if (!strcmp(binding->type, "l2gateway")) {
> >              if (!binding->chassis
> >                  || strcmp(chassis->name, binding->chassis->name)) {
> > @@ -208,7 +208,6 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >                   * so we should not create any patch ports for it. */
> >                  continue;
> >              }
> > -            patch_port_id = "ovn-l2gateway-port";
> >          } else {
> >              /* not a localnet or L2 gateway port. */
> >              continue;
> > @@ -224,12 +223,25 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >          struct ovsrec_bridge *br_ln = shash_find_data(&bridge_mappings, network);
> >          if (!br_ln) {
> >              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > -            VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> > -                    "with network name '%s'",
> > -                    binding->type, binding->logical_port, network);
> > +            if (!is_localnet) {
> > +                VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> > +                        "with network name '%s'",
> > +                        binding->type, binding->logical_port, network);
> > +            } else {
> > +                VLOG_INFO_RL(&rl, "bridge not found for localnet port '%s' "
> > +                        "with network name '%s'; skipping",
> > +                        binding->logical_port, network);
> > +            }
> >              continue;
> >          }
> >
> > +        const char *patch_port_id;
> > +        if (is_localnet) {
> > +            patch_port_id = "ovn-localnet-port";
> > +        } else {
> > +            patch_port_id = "ovn-l2gateway-port";
> > +        }
> > +
> >          char *name1 = patch_port_name(br_int->name, binding->logical_port);
> >          char *name2 = patch_port_name(binding->logical_port, br_int->name);
> >          create_patch_port(ovs_idl_txn, patch_port_id, binding->logical_port,
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 048c28ca6..91ab4449e 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -543,7 +543,9 @@ struct ovn_datapath {
> >      /* The "derived" OVN port representing the instance of l3dgw_port on
> >       * the "redirect-chassis". */
> >      struct ovn_port *l3redirect_port;
> > -    struct ovn_port *localnet_port;
> > +
> > +    struct ovn_port **localnet_ports;
> > +    size_t n_localnet_ports;
> >
> >      struct ovs_list lr_list; /* In list of logical router datapaths. */
> >      /* The logical router group to which this datapath belongs.
> > @@ -611,6 +613,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
> >          ovn_destroy_tnlids(&od->port_tnlids);
> >          bitmap_free(od->ipam_info.allocated_ipv4s);
> >          free(od->router_ports);
> > +        free(od->localnet_ports);
> >          ovn_ls_port_group_destroy(&od->nb_pgs);
> >          destroy_mcast_info_for_datapath(od);
> >
> > @@ -2019,6 +2022,7 @@ join_logical_ports(struct northd_context *ctx,
> >      struct ovn_datapath *od;
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (od->nbs) {
> > +            size_t allocated_localnet_ports = 0;
> >              for (size_t i = 0; i < od->nbs->n_ports; i++) {
> >                  const struct nbrec_logical_switch_port *nbsp
> >                      = od->nbs->ports[i];
> > @@ -2053,7 +2057,12 @@ join_logical_ports(struct northd_context *ctx,
> >                  }
> >
> >                  if (!strcmp(nbsp->type, "localnet")) {
> > -                   od->localnet_port = op;
> > +                   if (od->n_localnet_ports >= allocated_localnet_ports) {
> > +                       od->localnet_ports = x2nrealloc(
> > +                           od->localnet_ports, &allocated_localnet_ports,
> > +                           sizeof *od->localnet_ports);
> > +                   }
> > +                   od->localnet_ports[od->n_localnet_ports++] = op;
> >                  }
> >
> >                  op->lsp_addrs
> > @@ -3012,7 +3021,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
> >                                "reside-on-redirect-chassis", false) ||
> >                  op->peer == op->peer->od->l3dgw_port)) {
> >                  add_router_port_garp = true;
> > -            } else if (chassis && op->od->localnet_port) {
> > +            } else if (chassis && op->od->n_localnet_ports) {
> >                  add_router_port_garp = true;
> >              }
> >
> > @@ -4705,9 +4714,10 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
> >              struct ovn_port *op = od->router_ports[i];
> >              build_pre_acl_flows_for_nbsp(od, lflows, op->nbsp, op->json_key);
> >          }
> > -        if (od->localnet_port) {
> > -            build_pre_acl_flows_for_nbsp(od, lflows, od->localnet_port->nbsp,
> > -                                         od->localnet_port->json_key);
> > +        for (size_t i = 0; i < od->n_localnet_ports; i++) {
> > +            build_pre_acl_flows_for_nbsp(od, lflows,
> > +                                         od->localnet_ports[i]->nbsp,
> > +                                         od->localnet_ports[i]->json_key);
> >          }
> >
> >          /* Ingress and Egress Pre-ACL Table (Priority 110).
> > @@ -5975,9 +5985,9 @@ build_lswitch_rport_arp_req_flow_for_ip(struct sset *ips,
> >      /* Send a the packet only to the router pipeline and skip flooding it
> >       * in the broadcast domain (except for the localnet port).
> >       */
> > -    if (od->localnet_port) {
> > +    for (size_t i = 0; i < od->n_localnet_ports; i++) {
> >          ds_put_format(&actions, "clone { outport = %s; output; }; ",
> > -                      od->localnet_port->json_key);
> > +                      od->localnet_ports[i]->json_key);
> >      }
> >      ds_put_format(&actions, "outport = %s; output;", patch_op->json_key);
> >      ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP, priority,
> > @@ -6570,25 +6580,29 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >          }
> >
> >          bool is_external = lsp_is_external(op->nbsp);
> > -        if (is_external && (!op->od->localnet_port ||
> > +        if (is_external && (!op->od->n_localnet_ports ||
> >                              !op->nbsp->ha_chassis_group)) {
> > -            /* If it's an external port and there is no localnet port
> > +            /* If it's an external port and there are no localnet ports
> >               * and if it doesn't belong to an HA chassis group ignore it. */
> >              continue;
> >          }
> >
> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > -            const char *json_key;
> >              if (is_external) {
> > -                json_key = op->od->localnet_port->json_key;
> > +                for (size_t j = 0; j < op->od->n_localnet_ports; j++) {
> > +                    build_dhcpv4_options_flows(
> > +                        op, lflows, &op->lsp_addrs[i],
> > +                        op->od->localnet_ports[j]->json_key, is_external);
> > +                    build_dhcpv6_options_flows(
> > +                        op, lflows, &op->lsp_addrs[i],
> > +                        op->od->localnet_ports[j]->json_key, is_external);
> > +                }
> >              } else {
> > -                json_key = op->json_key;
> > +                build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i],
> > +                                           op->json_key, is_external);
> > +                build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i],
> > +                                           op->json_key, is_external);
> >              }
> > -            build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
> > -                                       is_external);
> > -
> > -            build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
> > -                                       is_external);
> >          }
> >      }
> >
> > @@ -6644,8 +6658,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >      }
> >
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp || !lsp_is_external(op->nbsp) ||
> > -            !op->od->localnet_port) {
> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> >             continue;
> >          }
> >
> > @@ -6653,8 +6666,10 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >           * external ports  on chassis not binding those ports.
> >           * This makes the router pipeline to be run only on the chassis
> >           * binding the external ports. */
> > -        build_drop_arp_nd_flows_for_unbound_router_ports(
> > -            lflows, op, op->od->localnet_port);
> > +        for (size_t i = 0; i < op->od->n_localnet_ports; i++) {
> > +            build_drop_arp_nd_flows_for_unbound_router_ports(
> > +                lflows, op, op->od->localnet_ports[i]);
> > +        }
> >      }
> >
> >      char *svc_check_match = xasprintf("eth.dst == %s", svc_monitor_mac);
> > @@ -6872,7 +6887,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >                                ETH_ADDR_ARGS(mac));
> >                  if (op->peer->od->l3dgw_port
> >                      && op->peer->od->l3redirect_port
> > -                    && op->od->localnet_port) {
> > +                    && op->od->n_localnet_ports) {
> >                      bool add_chassis_resident_check = false;
> >                      if (op->peer == op->peer->od->l3dgw_port) {
> >                          /* The peer of this port represents a distributed
> > @@ -8178,7 +8193,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >                            op->lrp_networks.ipv4_addrs[i].addr_s);
> >
> >              if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
> > -                && op->peer->od->localnet_port) {
> > +                && op->peer->od->n_localnet_ports) {
> >                  bool add_chassis_resident_check = false;
> >                  if (op == op->od->l3dgw_port) {
> >                      /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
> > diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
> > index 533ae716d..88edb6f32 100644
> > --- a/ovn-architecture.7.xml
> > +++ b/ovn-architecture.7.xml
> > @@ -441,9 +441,8 @@
> >
> >    <p>
> >      A <code>localnet</code> logical switch port bridges a logical switch to a
> > -    physical VLAN.  Any given logical switch should have no more than one
> > -    <code>localnet</code> port.  Such a logical switch is used in two
> > -    scenarios:
> > +    physical VLAN.  A logical switch may have one or more <code>localnet</code>
> > +    ports.  Such a logical switch is used in two scenarios:
>
> It is worth describing the new scenario, where multiple localnet ports are needed, in more detail here.
>
> >    </p>
> >
> >    <ul>
> > @@ -1895,13 +1894,13 @@
> >    <ol>
> >      <li>
> >        The packet first enters the ingress pipeline, and then egress pipeline of
> > -      the source localnet logical switch datapath and is sent out via the
> > +      the source localnet logical switch datapath and is sent out via a
> >        localnet port of the source localnet logical switch (instead of sending
> >        it to router pipeline).
> >      </li>
> >
> >      <li>
> > -      The gateway chassis receives the packet via the localnet port of the
> > +      The gateway chassis receives the packet via a localnet port of the
> >        source localnet logical switch and sends it to the integration bridge.
> >        The packet then enters the ingress pipeline, and then egress pipeline of
> >        the source localnet logical switch datapath and enters the ingress
> > @@ -1916,11 +1915,11 @@
> >        From the router datapath, packet enters the ingress pipeline and then
> >        egress pipeline of the destination localnet logical switch datapath.
> >        It then goes out of the integration bridge to the provider bridge (
> > -      belonging to the destination logical switch) via the localnet port.
> > +      belonging to the destination logical switch) via a localnet port.
> >      </li>
> >
> >      <li>
> > -      The destination chassis receives the packet via the localnet port and
> > +      The destination chassis receives the packet via a localnet port and
> >        sends it to the integration bridge. The packet enters the
> >        ingress pipeline and then egress pipeline of the destination localnet
> >        logical switch and finally delivered to the destination VM port.
> > @@ -1935,13 +1934,13 @@
> >    <ol>
> >      <li>
> >        The packet first enters the ingress pipeline, and then egress pipeline of
> > -      the source localnet logical switch datapath and is sent out via the
> > +      the source localnet logical switch datapath and is sent out via a
> >        localnet port of the source localnet logical switch (instead of sending
> >        it to router pipeline).
> >      </li>
> >
> >      <li>
> > -      The gateway chassis receives the packet via the localnet port of the
> > +      The gateway chassis receives the packet via a localnet port of the
> >        source localnet logical switch and sends it to the integration bridge.
> >        The packet then enters the ingress pipeline, and then egress pipeline of
> >        the source localnet logical switch datapath and enters the ingress
> > @@ -1957,7 +1956,7 @@
> >        egress pipeline of the localnet logical switch datapath which provides
> >        external connectivity. It then goes out of the integration bridge to the
> >        provider bridge (belonging to the logical switch which provides external
> > -      connectivity) via the localnet port.
> > +      connectivity) via a localnet port.
> >      </li>
> >    </ol>
> >
> > @@ -1967,7 +1966,7 @@
> >
> >    <ol>
> >      <li>
> > -      The gateway chassis receives the packet from the localnet port of
> > +      The gateway chassis receives the packet from a localnet port of
> >        the logical switch which provides external connectivity. The packet then
> >        enters the ingress pipeline and then egress pipeline of the localnet
> >        logical switch (which provides external connectivity). The packet then
> > @@ -1978,12 +1977,12 @@
> >        The ingress pipeline of the logical router datapath applies the unNATting
> >        rules. The packet then enters the ingress pipeline and then egress
> >        pipeline of the source localnet logical switch. Since the source VM
> > -      doesn't reside in the gateway chassis, the packet is sent out via the
> > +      doesn't reside in the gateway chassis, the packet is sent out via a
> >        localnet port of the source logical switch.
> >      </li>
> >
> >      <li>
> > -      The source chassis receives the packet via the localnet port and
> > +      The source chassis receives the packet via a localnet port and
> >        sends it to the integration bridge. The packet enters the
> >        ingress pipeline and then egress pipeline of the source localnet
> >        logical switch and finally gets delivered to the source VM port.
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index af15c550a..7e9cd5cc6 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -244,14 +244,14 @@
> >      <p>
> >        There are two kinds of logical switches, that is, ones that fully
> >        virtualize the network (overlay logical switches) and ones that provide
> > -      simple connectivity to a physical network (bridged logical switches).
> > +      simple connectivity to physical networks (bridged logical switches).
> >        They work in the same way when providing connectivity between logical
> > -      ports on same chasis, but differently when connecting remote logical
> > +      ports on same chassis, but differently when connecting remote logical
> >        ports.  Overlay logical switches connect remote logical ports by tunnels,
> >        while bridged logical switches provide connectivity to remote ports by
> > -      bridging the packets to directly connected physical L2 segment with the
> > +      bridging the packets to directly connected physical L2 segments with the
> >        help of <code>localnet</code> ports.  Each bridged logical switch has
> > -      one and only one <code>localnet</code> port, which has only one special
> > +      one or more <code>localnet</code> ports, which have only one special
> >        address <code>unknown</code>.
> >      </p>
> >
> > @@ -527,10 +527,13 @@
> >
> >            <dt><code>localnet</code></dt>
> >            <dd>
> > -            A connection to a locally accessible network from each
> > -            <code>ovn-controller</code> instance.  A logical switch can only
> > -            have a single <code>localnet</code> port attached.  This is used
> > -            to model direct connectivity to an existing network.
> > +            A connection to a locally accessible network from
> > +            <code>ovn-controller</code> instances that have corresponding
> > +            bridge mapping.  A logical switch can have multiple
> > +            <code>localnet</code> ports attached, as long as each
> > +            <code>ovn-controller</code> is plugged to a single local network
> > +            only.  In this case, each hypervisor implements part of switch
> > +            external network connectivity.
>
> Today, there is no restriction for one ovn-controller/chassis to be plugged to multiple local networks, if each local network maps to a separate logical switch. The typical use case is VIFs on different VLAN networks residing on the same chassis. The constraint added here "each ovn-controller is plugged to a single local network only" seems confusing.
>
> >            </dd>
> >
> >            <dt><code>localport</code></dt>
> > @@ -721,7 +724,7 @@
> >            Required.  The name of the network to which the <code>localnet</code>
> >            port is connected.  Each hypervisor, via <code>ovn-controller</code>,
> >            uses its local configuration to determine exactly how to connect to
> > -          this locally accessible network.
> > +          this locally accessible network, if at all.
> >          </column>
> >        </group>
> >
> > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > index 3aa7cd4da..7e05aa0e1 100644
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -2626,10 +2626,13 @@ tcp.flags = RST;
> >
> >            <dt><code>localnet</code></dt>
> >            <dd>
> > -            A connection to a locally accessible network from each
> > -            <code>ovn-controller</code> instance.  A logical switch can only
> > -            have a single <code>localnet</code> port attached.  This is used
> > -            to model direct connectivity to an existing network.
> > +            A connection to a locally accessible network from some or all
> > +            <code>ovn-controller</code> instances.  This is used
> > +            to model direct connectivity to existing networks.  A logical
> > +            switch can have multiple <code>localnet</code> ports attached, as
> > +            long as each <code>ovn-controller</code> is plugged to a single
> > +            local network only.  In this case, each hypervisor implements part
> > +            of switch external network connectivity.
> >            </dd>
> >
> >            <dt><code>localport</code></dt>
> > @@ -2774,10 +2777,12 @@ tcp.flags = RST;
> >          <p>
> >            When a logical switch has a <code>localnet</code> port attached,
> >            every chassis that may have a local vif attached to that logical
> > -          switch must have a bridge mapping configured to reach that
> > -          <code>localnet</code>.  Traffic that arrives on a
> > -          <code>localnet</code> port is never forwarded over a tunnel to
> > -          another chassis.
> > +          switch that needs this external connectivity must have a bridge
> > +          mapping configured to reach that <code>localnet</code>.  If the
> > +          mapping is missing, the vif won't be plugged to this network.  It may
> > +          still reach the other network if routing is implemented by fabric.
> > +          Traffic that arrives on a <code>localnet</code> port is never
> > +          forwarded over a tunnel to another chassis.
> >          </p>
> >        </column>
> >
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index e6febd4c2..173b8d2eb 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -2475,6 +2475,243 @@ OVN_CLEANUP([hv1],[hv2])
> >
> >  AT_CLEANUP
> >
> > +AT_SETUP([ovn -- 2 HVs, 2 LS, switching between multiple localnet ports with same tags])
> > +ovn_start
> > +
> > +# In this test case we create two switches with multiple localnet ports. Only a
> > +# single localnet of the same tag is connected to fabric for each switch. Two
> > +# hypervisors have VIFs that belong to these switches. The test validates that
> > +# routing between these switches and hypervisors still works regardless of the
> > +# number of (unplugged) localnet ports.
> > +
> > +# two switches, each connected to lots of networks
> > +for i in 1 2; do
> > +    ovn-nbctl ls-add ls-$i
> > +    for tag in `seq 10 20`; do
> > +        ln_port_name=ln-$i-$tag
> > +        ovn-nbctl lsp-add ls-$i $ln_port_name "" $tag
> > +        ovn-nbctl lsp-set-addresses $ln_port_name unknown
> > +        ovn-nbctl lsp-set-type $ln_port_name localnet
> > +        ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> > +    done
> > +done
> > +
> > +# multiple networks
> > +for tag in `seq 10 20`; do
> > +    net_add n-$tag
> > +done
> > +
> > +# two hypervisors, each connected to the same network
> > +for i in 1 2; do
> > +    sim_add hv-$i
> > +    as hv-$i
> > +    ovs-vsctl add-br br-phys
> > +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> > +    ovn_attach n-10 br-phys 192.168.0.$i
> > +done
> > +
> > +# two vif ports, one per switch
> > +for i in 1 2; do
> > +    as hv-$i
> > +    ovs-vsctl add-port br-int vif-$i -- \
> > +        set Interface vif-$i external-ids:iface-id=lp-$i \
> > +                              options:tx_pcap=hv-$i/vif-$i-tx.pcap \
> > +                              options:rxq_pcap=hv-$i/vif-$i-rx.pcap \
> > +                              ofport-request=$i
> > +
> > +    lsp_name=lp-$i
> > +    ovn-nbctl lsp-add ls-$i $lsp_name
> > +    ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:00:0$i
> > +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:0$i
> > +
> > +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> > +done
> > +
> > +ovn-nbctl --wait=sb sync
> > +ovn-nbctl show
> > +ovn-sbctl dump-flows
> > +
> > +# vif ports
> > +for i in 1 2; do
> > +    : > vif-$i.expected
> > +done
> > +
> > +# localnet ports
> > +for i in 1 2; do
> > +    for tag in `seq 10 20`; do
> > +        : > $i-$tag.expected
> > +    done
> > +done
> > +
> > +test_packet() {
> > +    local inport=$1 outport=$2 dst=$3 src=$4 eth=$5 eout=$6 lout=$7
> > +
> > +    # Expect the packet cloned to all localnet ports
> > +    : > expout
> > +    for tag in `seq 10 20`; do
> > +        echo "output(\"ln-$inport-$tag\");" >> expout
> > +    done
> > +
> > +    # First try tracing the packet.
> > +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
> > +    AT_CAPTURE_FILE([trace])
> > +    AT_CHECK([ovn-trace --all ls-$inport "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
> > +
> > +    # Then actually send a packet, for an end-to-end test.
> > +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> > +    as hv-$1 ovs-appctl netdev-dummy/receive vif-$inport $packet
> > +
> > +    # Expect the packet received by the peer VIF port
> > +    echo $packet >> vif-$outport.expected
> > +
> > +    # Expect the packet to transfer through the common fabric network
> > +    local packet=$(echo $dst$src | sed 's/://g')81000014${eth}
> > +    echo $packet >> $1-10.expected
> > +}
> > +
> > +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1001 ln-1-10 ln-1-10
> > +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1002 ln-1-10 ln-1-10
> > +
> > +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1003 ln-2-10 ln-2-10
> > +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1004 ln-2-10 ln-2-10
> > +
> > +# Dump a bunch of info helpful for debugging if there's a failure.
> > +
> > +echo "------ OVN dump ------"
> > +ovn-nbctl show
> > +ovn-sbctl show
> > +
> > +for i in 1 2; do
> > +    hv=hv-$i
> > +    echo "------ $hv dump ------"
> > +    as $hv ovs-vsctl show
> > +    as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> > +done
> > +
> > +# Now check the packets actually received against the ones expected.
> > +for i in 1 2; do
> > +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/vif-$i-tx.pcap], [vif-$i.expected])
> > +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/br-phys_n-10-tx.pcap], [$i-10.expected])
> > +done
> > +
> > +OVN_CLEANUP([hv-1],[hv-2])
> > +
> > +AT_CLEANUP
> > +
> > +AT_SETUP([ovn -- 2 HVs, 1 LS, no switching between multiple localnet ports with different tags])
> > +ovn_start
> > +
> > +# In this test case we create a single switch connected to two physical
> > +# networks via two localnet ports. Then we create two hypervisors, with 2
> > +# ports on each. The test validates no interconnectivity between VIF ports
> > +# located on chassis plugged to different physical networks.
> > +
> > +# create the single switch with two localnet ports
> > +ovn-nbctl ls-add ls1
> > +for tag in 10 20; do
> > +    ln_port_name=ln-$tag
> > +    ovn-nbctl lsp-add ls1 $ln_port_name "" $tag
> > +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> > +    ovn-nbctl lsp-set-type $ln_port_name localnet
> > +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> > +done
> > +
> > +# create fabric networks
> > +for tag in 10 20; do
> > +    net_add n-$tag
> > +done
> > +
> > +# create four chassis, each connected to one network, each with a single VIF port
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        sim_add hv-$tag-$i
> > +        as hv-$tag-$i
> > +        ovs-vsctl add-br br-phys
> > +        ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> > +        ovn_attach n-$tag br-phys 192.168.$i.$tag
> > +
> > +        ovs-vsctl add-port br-int vif-$tag-$i -- \
> > +            set Interface vif-$tag-$i external-ids:iface-id=lp-$tag-$i \
> > +                                  options:tx_pcap=hv-$tag-$i/vif-$tag-$i-tx.pcap \
> > +                                  options:rxq_pcap=hv-$tag-$i/vif-$tag-$i-rx.pcap \
> > +                                  ofport-request=$tag$i
> > +
> > +        lsp_name=lp-$tag-$i
> > +        ovn-nbctl lsp-add ls1 $lsp_name
> > +        ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:0$i:$tag
> > +        ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:0$i:$tag
> > +
> > +        OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> > +    done
> > +done
> > +ovn-nbctl --wait=sb sync
> > +ovn-sbctl dump-flows
> > +
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        : > $tag-$i.expected
> > +    done
> > +done
> > +
> > +vif_to_hv() {
> > +    echo hv-$1
> > +}
> > +
> > +test_packet() {
> > +    local inport=$1 dst=$2 src=$3 eth=$4 eout=$5 lout=$6
> > +
> > +    # First try tracing the packet.
> > +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
> > +    echo "output(\"$lout\");" > expout
> > +    AT_CAPTURE_FILE([trace])
> > +    AT_CHECK([ovn-trace --all ls1 "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
> > +
> > +    # Then actually send a packet, for an end-to-end test.
> > +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> > +    hv=`vif_to_hv $inport`
> > +    vif=vif-$inport
> > +    as $hv ovs-appctl netdev-dummy/receive $vif $packet
> > +    if test $eth = 1002 -o $eth = 2002; then
> > +        echo $packet >> ${eout#lp-}.expected
> > +    fi
> > +}
> > +
> > +# different fabric networks -> should fail
> > +test_packet 10-1 f0:00:00:00:01:20 f0:00:00:00:01:10 1001 lp-20-1 lp-20-1
> > +test_packet 20-1 f0:00:00:00:01:10 f0:00:00:00:01:20 2001 lp-10-1 lp-10-1
> > +
> > +# same fabric networks -> should pass
> > +test_packet 10-1 f0:00:00:00:02:10 f0:00:00:00:01:10 1002 lp-10-2 lp-10-2
> > +test_packet 20-1 f0:00:00:00:02:20 f0:00:00:00:01:20 2002 lp-20-2 lp-20-2
> > +test_packet 10-2 f0:00:00:00:01:10 f0:00:00:00:02:10 1002 lp-10-1 lp-10-1
> > +test_packet 20-2 f0:00:00:00:01:20 f0:00:00:00:02:20 2002 lp-20-1 lp-20-1
> > +
> > +# Dump a bunch of info helpful for debugging if there's a failure.
> > +echo "------ OVN dump ------"
> > +ovn-nbctl show
> > +ovn-sbctl show
> > +
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        hv=hv-$tag-$i
> > +        echo "------ $hv dump ------"
> > +        as $hv ovs-vsctl show
> > +        as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> > +    done
> > +done
> > +
> > +# Now check the packets actually received against the ones expected.
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        echo "hv = $tag-$i"
> > +        OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$tag-$i/vif-$tag-$i-tx.pcap], [$tag-$i.expected])
> > +    done
> > +done
> > +
> > +OVN_CLEANUP([hv-10-1],[hv-10-2],[hv-20-1],[hv-20-2])
> > +
> > +AT_CLEANUP
> > +
> >  AT_SETUP([ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS])
> >  AT_KEYWORDS([vtep])
> >  ovn_start
> > --
> > 2.26.2
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ihar Hrachyshka May 4, 2020, 5:32 p.m. UTC | #3
On Fri, May 1, 2020 at 3:28 AM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <ihrachys@redhat.com> wrote:
> >
> > Assuming only a single localnet port is actually plugged mapped on
> > each chassis, this allows to maintain disjoint networks plugged to the
> > same switch.  This is useful to simplify resource management for
> > OpenStack "routed provider networks" feature [1] where a single
> > "network" (which traditionally maps to logical switches in OVN) is
> > comprised of multiple L2 segments and assumes external L3 routing
> > implemented between the segments.
> >
> > TODO: consider E-W routing between localnet vlan tagged LSs
> >       (ovn-chassis-mac-mappings).
> >
> > [1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
>
> After reading the context and related discussions, I am a little concerned about this approach. It changes the concept of a logical switch in OVN, which was a L2 concept. It may be ok to change the concept if all the aspects are understood and properly documented and implemented. However, at least it is not clear to me how would it work on the physical pipeline when VIF1 on chassisA sends packet to VIF2 on chassis B through physical network when the two VIFs are on same LS that has multiple localnet ports. Today only one localnet port is allowed, so it is handled by sending the packet through *the* localnet port (without localnet port it goes through tunnel). With this patch it seems would choose the last localnet port, regardless of the physical location (L2 segment) of each VIF. I would expect the data structure change in "local_datapath" to maintain a list of localnet ports instead of only one, and the logic in physical.c to pick up the desired localnet port, according to the L2 segment it physically belongs to. (it shouldn't be a mapping between chassis and physical network/vlan, because it is natural to have multiple VLANs connected to a single chassis and have VIFs on different VLANs).
>
> Furthermore, if we change the concept of LS being a single L2 domain, we may need to consider more changes in the L2 logical pipelines, such as:
> - L2 lookup stage should only look up destination for the ports that are in the same segment.
> - BUM traffic should only go to the localnet port that belong to the same L2 segment of the LS.
>
> All this are suggesting that we need to introduce a *segment* abstraction in OVN, and multiple segments belong to same LS. I wonder if it is even more complex to implement this in OVN than in Neutron plugin. Maybe I missed something here and made it more complex than necessary. Please correct me if I am wrong :)
>
> Thanks,
> Han
>
> >
> > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> >
> > ---
> >
> > v2: rebase on top of series that refactors code dealing with localnet ports.
> > v2: tests: send packets both ways, more test scenarios covered.
> > v2: use x2nrealloc to allocate ->localnet_ports.
> > v2: use n_localnet_ports counter instead of localnet_ports pointer to detect
> >     switches with localnet ports.
> > ---
> >  controller/binding.c   |  16 +++
> >  controller/patch.c     |  24 +++--
> >  northd/ovn-northd.c    |  63 ++++++-----
> >  ovn-architecture.7.xml |  25 +++--
> >  ovn-nb.xml             |  21 ++--
> >  ovn-sb.xml             |  21 ++--
> >  tests/ovn.at           | 237 +++++++++++++++++++++++++++++++++++++++++
> >  7 files changed, 347 insertions(+), 60 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 20a89d07d..c88c4ece8 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -692,12 +692,28 @@ add_localnet_egress_interface_mappings(
> >      }
> >  }
> >
> > +static bool
> > +is_network_plugged(const struct sbrec_port_binding *binding_rec,
> > +                   struct shash *bridge_mappings)
> > +{
> > +    const char *network = smap_get(&binding_rec->options, "network_name");
> > +    if (!network) {
> > +        return false;
> > +    }
> > +    return shash_find_data(bridge_mappings, network);
> > +}
> > +
> >  static void
> >  consider_localnet_port(const struct sbrec_port_binding *binding_rec,
> >                         struct shash *bridge_mappings,
> >                         struct sset *egress_ifaces,
> >                         struct hmap *local_datapaths)
> >  {
> > +    /* Ignore localnet ports for unplugged networks. */
> > +    if (!is_network_plugged(binding_rec, bridge_mappings)) {
> > +        return;
> > +    }
> > +
> >      add_localnet_egress_interface_mappings(binding_rec,
> >              bridge_mappings, egress_ifaces);
> >
> > diff --git a/controller/patch.c b/controller/patch.c
> > index 349faae17..52255cc3a 100644
> > --- a/controller/patch.c
> > +++ b/controller/patch.c
> > @@ -198,9 +198,9 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >              continue;
> >          }
> >
> > -        const char *patch_port_id;
> > +        bool is_localnet = false;
> >          if (!strcmp(binding->type, "localnet")) {
> > -            patch_port_id = "ovn-localnet-port";
> > +            is_localnet = true;
> >          } else if (!strcmp(binding->type, "l2gateway")) {
> >              if (!binding->chassis
> >                  || strcmp(chassis->name, binding->chassis->name)) {
> > @@ -208,7 +208,6 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >                   * so we should not create any patch ports for it. */
> >                  continue;
> >              }
> > -            patch_port_id = "ovn-l2gateway-port";
> >          } else {
> >              /* not a localnet or L2 gateway port. */
> >              continue;
> > @@ -224,12 +223,25 @@ add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
> >          struct ovsrec_bridge *br_ln = shash_find_data(&bridge_mappings, network);
> >          if (!br_ln) {
> >              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > -            VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> > -                    "with network name '%s'",
> > -                    binding->type, binding->logical_port, network);
> > +            if (!is_localnet) {
> > +                VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
> > +                        "with network name '%s'",
> > +                        binding->type, binding->logical_port, network);
> > +            } else {
> > +                VLOG_INFO_RL(&rl, "bridge not found for localnet port '%s' "
> > +                        "with network name '%s'; skipping",
> > +                        binding->logical_port, network);
> > +            }
> >              continue;
> >          }
> >
> > +        const char *patch_port_id;
> > +        if (is_localnet) {
> > +            patch_port_id = "ovn-localnet-port";
> > +        } else {
> > +            patch_port_id = "ovn-l2gateway-port";
> > +        }
> > +
> >          char *name1 = patch_port_name(br_int->name, binding->logical_port);
> >          char *name2 = patch_port_name(binding->logical_port, br_int->name);
> >          create_patch_port(ovs_idl_txn, patch_port_id, binding->logical_port,
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 048c28ca6..91ab4449e 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -543,7 +543,9 @@ struct ovn_datapath {
> >      /* The "derived" OVN port representing the instance of l3dgw_port on
> >       * the "redirect-chassis". */
> >      struct ovn_port *l3redirect_port;
> > -    struct ovn_port *localnet_port;
> > +
> > +    struct ovn_port **localnet_ports;
> > +    size_t n_localnet_ports;
> >
> >      struct ovs_list lr_list; /* In list of logical router datapaths. */
> >      /* The logical router group to which this datapath belongs.
> > @@ -611,6 +613,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
> >          ovn_destroy_tnlids(&od->port_tnlids);
> >          bitmap_free(od->ipam_info.allocated_ipv4s);
> >          free(od->router_ports);
> > +        free(od->localnet_ports);
> >          ovn_ls_port_group_destroy(&od->nb_pgs);
> >          destroy_mcast_info_for_datapath(od);
> >
> > @@ -2019,6 +2022,7 @@ join_logical_ports(struct northd_context *ctx,
> >      struct ovn_datapath *od;
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (od->nbs) {
> > +            size_t allocated_localnet_ports = 0;
> >              for (size_t i = 0; i < od->nbs->n_ports; i++) {
> >                  const struct nbrec_logical_switch_port *nbsp
> >                      = od->nbs->ports[i];
> > @@ -2053,7 +2057,12 @@ join_logical_ports(struct northd_context *ctx,
> >                  }
> >
> >                  if (!strcmp(nbsp->type, "localnet")) {
> > -                   od->localnet_port = op;
> > +                   if (od->n_localnet_ports >= allocated_localnet_ports) {
> > +                       od->localnet_ports = x2nrealloc(
> > +                           od->localnet_ports, &allocated_localnet_ports,
> > +                           sizeof *od->localnet_ports);
> > +                   }
> > +                   od->localnet_ports[od->n_localnet_ports++] = op;
> >                  }
> >
> >                  op->lsp_addrs
> > @@ -3012,7 +3021,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
> >                                "reside-on-redirect-chassis", false) ||
> >                  op->peer == op->peer->od->l3dgw_port)) {
> >                  add_router_port_garp = true;
> > -            } else if (chassis && op->od->localnet_port) {
> > +            } else if (chassis && op->od->n_localnet_ports) {
> >                  add_router_port_garp = true;
> >              }
> >
> > @@ -4705,9 +4714,10 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
> >              struct ovn_port *op = od->router_ports[i];
> >              build_pre_acl_flows_for_nbsp(od, lflows, op->nbsp, op->json_key);
> >          }
> > -        if (od->localnet_port) {
> > -            build_pre_acl_flows_for_nbsp(od, lflows, od->localnet_port->nbsp,
> > -                                         od->localnet_port->json_key);
> > +        for (size_t i = 0; i < od->n_localnet_ports; i++) {
> > +            build_pre_acl_flows_for_nbsp(od, lflows,
> > +                                         od->localnet_ports[i]->nbsp,
> > +                                         od->localnet_ports[i]->json_key);
> >          }
> >
> >          /* Ingress and Egress Pre-ACL Table (Priority 110).
> > @@ -5975,9 +5985,9 @@ build_lswitch_rport_arp_req_flow_for_ip(struct sset *ips,
> >      /* Send a the packet only to the router pipeline and skip flooding it
> >       * in the broadcast domain (except for the localnet port).
> >       */
> > -    if (od->localnet_port) {
> > +    for (size_t i = 0; i < od->n_localnet_ports; i++) {
> >          ds_put_format(&actions, "clone { outport = %s; output; }; ",
> > -                      od->localnet_port->json_key);
> > +                      od->localnet_ports[i]->json_key);
> >      }
> >      ds_put_format(&actions, "outport = %s; output;", patch_op->json_key);
> >      ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP, priority,
> > @@ -6570,25 +6580,29 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >          }
> >
> >          bool is_external = lsp_is_external(op->nbsp);
> > -        if (is_external && (!op->od->localnet_port ||
> > +        if (is_external && (!op->od->n_localnet_ports ||
> >                              !op->nbsp->ha_chassis_group)) {
> > -            /* If it's an external port and there is no localnet port
> > +            /* If it's an external port and there are no localnet ports
> >               * and if it doesn't belong to an HA chassis group ignore it. */
> >              continue;
> >          }
> >
> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > -            const char *json_key;
> >              if (is_external) {
> > -                json_key = op->od->localnet_port->json_key;
> > +                for (size_t j = 0; j < op->od->n_localnet_ports; j++) {
> > +                    build_dhcpv4_options_flows(
> > +                        op, lflows, &op->lsp_addrs[i],
> > +                        op->od->localnet_ports[j]->json_key, is_external);
> > +                    build_dhcpv6_options_flows(
> > +                        op, lflows, &op->lsp_addrs[i],
> > +                        op->od->localnet_ports[j]->json_key, is_external);
> > +                }
> >              } else {
> > -                json_key = op->json_key;
> > +                build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i],
> > +                                           op->json_key, is_external);
> > +                build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i],
> > +                                           op->json_key, is_external);
> >              }
> > -            build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
> > -                                       is_external);
> > -
> > -            build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
> > -                                       is_external);
> >          }
> >      }
> >
> > @@ -6644,8 +6658,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >      }
> >
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp || !lsp_is_external(op->nbsp) ||
> > -            !op->od->localnet_port) {
> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> >             continue;
> >          }
> >
> > @@ -6653,8 +6666,10 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >           * external ports  on chassis not binding those ports.
> >           * This makes the router pipeline to be run only on the chassis
> >           * binding the external ports. */
> > -        build_drop_arp_nd_flows_for_unbound_router_ports(
> > -            lflows, op, op->od->localnet_port);
> > +        for (size_t i = 0; i < op->od->n_localnet_ports; i++) {
> > +            build_drop_arp_nd_flows_for_unbound_router_ports(
> > +                lflows, op, op->od->localnet_ports[i]);
> > +        }
> >      }
> >
> >      char *svc_check_match = xasprintf("eth.dst == %s", svc_monitor_mac);
> > @@ -6872,7 +6887,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >                                ETH_ADDR_ARGS(mac));
> >                  if (op->peer->od->l3dgw_port
> >                      && op->peer->od->l3redirect_port
> > -                    && op->od->localnet_port) {
> > +                    && op->od->n_localnet_ports) {
> >                      bool add_chassis_resident_check = false;
> >                      if (op->peer == op->peer->od->l3dgw_port) {
> >                          /* The peer of this port represents a distributed
> > @@ -8178,7 +8193,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >                            op->lrp_networks.ipv4_addrs[i].addr_s);
> >
> >              if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
> > -                && op->peer->od->localnet_port) {
> > +                && op->peer->od->n_localnet_ports) {
> >                  bool add_chassis_resident_check = false;
> >                  if (op == op->od->l3dgw_port) {
> >                      /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
> > diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
> > index 533ae716d..88edb6f32 100644
> > --- a/ovn-architecture.7.xml
> > +++ b/ovn-architecture.7.xml
> > @@ -441,9 +441,8 @@
> >
> >    <p>
> >      A <code>localnet</code> logical switch port bridges a logical switch to a
> > -    physical VLAN.  Any given logical switch should have no more than one
> > -    <code>localnet</code> port.  Such a logical switch is used in two
> > -    scenarios:
> > +    physical VLAN.  A logical switch may have one or more <code>localnet</code>
> > +    ports.  Such a logical switch is used in two scenarios:
>
> It is worth describing more details for the new scenario here when multiple localnet ports are needed.
>
> >    </p>
> >
> >    <ul>
> > @@ -1895,13 +1894,13 @@
> >    <ol>
> >      <li>
> >        The packet first enters the ingress pipeline, and then egress pipeline of
> > -      the source localnet logical switch datapath and is sent out via the
> > +      the source localnet logical switch datapath and is sent out via a
> >        localnet port of the source localnet logical switch (instead of sending
> >        it to router pipeline).
> >      </li>
> >
> >      <li>
> > -      The gateway chassis receives the packet via the localnet port of the
> > +      The gateway chassis receives the packet via a localnet port of the
> >        source localnet logical switch and sends it to the integration bridge.
> >        The packet then enters the ingress pipeline, and then egress pipeline of
> >        the source localnet logical switch datapath and enters the ingress
> > @@ -1916,11 +1915,11 @@
> >        From the router datapath, packet enters the ingress pipeline and then
> >        egress pipeline of the destination localnet logical switch datapath.
> >        It then goes out of the integration bridge to the provider bridge (
> > -      belonging to the destination logical switch) via the localnet port.
> > +      belonging to the destination logical switch) via a localnet port.
> >      </li>
> >
> >      <li>
> > -      The destination chassis receives the packet via the localnet port and
> > +      The destination chassis receives the packet via a localnet port and
> >        sends it to the integration bridge. The packet enters the
> >        ingress pipeline and then egress pipeline of the destination localnet
> >        logical switch and finally delivered to the destination VM port.
> > @@ -1935,13 +1934,13 @@
> >    <ol>
> >      <li>
> >        The packet first enters the ingress pipeline, and then egress pipeline of
> > -      the source localnet logical switch datapath and is sent out via the
> > +      the source localnet logical switch datapath and is sent out via a
> >        localnet port of the source localnet logical switch (instead of sending
> >        it to router pipeline).
> >      </li>
> >
> >      <li>
> > -      The gateway chassis receives the packet via the localnet port of the
> > +      The gateway chassis receives the packet via a localnet port of the
> >        source localnet logical switch and sends it to the integration bridge.
> >        The packet then enters the ingress pipeline, and then egress pipeline of
> >        the source localnet logical switch datapath and enters the ingress
> > @@ -1957,7 +1956,7 @@
> >        egress pipeline of the localnet logical switch datapath which provides
> >        external connectivity. It then goes out of the integration bridge to the
> >        provider bridge (belonging to the logical switch which provides external
> > -      connectivity) via the localnet port.
> > +      connectivity) via a localnet port.
> >      </li>
> >    </ol>
> >
> > @@ -1967,7 +1966,7 @@
> >
> >    <ol>
> >      <li>
> > -      The gateway chassis receives the packet from the localnet port of
> > +      The gateway chassis receives the packet from a localnet port of
> >        the logical switch which provides external connectivity. The packet then
> >        enters the ingress pipeline and then egress pipeline of the localnet
> >        logical switch (which provides external connectivity). The packet then
> > @@ -1978,12 +1977,12 @@
> >        The ingress pipeline of the logical router datapath applies the unNATting
> >        rules. The packet then enters the ingress pipeline and then egress
> >        pipeline of the source localnet logical switch. Since the source VM
> > -      doesn't reside in the gateway chassis, the packet is sent out via the
> > +      doesn't reside in the gateway chassis, the packet is sent out via a
> >        localnet port of the source logical switch.
> >      </li>
> >
> >      <li>
> > -      The source chassis receives the packet via the localnet port and
> > +      The source chassis receives the packet via a localnet port and
> >        sends it to the integration bridge. The packet enters the
> >        ingress pipeline and then egress pipeline of the source localnet
> >        logical switch and finally gets delivered to the source VM port.
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index af15c550a..7e9cd5cc6 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -244,14 +244,14 @@
> >      <p>
> >        There are two kinds of logical switches, that is, ones that fully
> >        virtualize the network (overlay logical switches) and ones that provide
> > -      simple connectivity to a physical network (bridged logical switches).
> > +      simple connectivity to physical networks (bridged logical switches).
> >        They work in the same way when providing connectivity between logical
> > -      ports on same chasis, but differently when connecting remote logical
> > +      ports on same chassis, but differently when connecting remote logical
> >        ports.  Overlay logical switches connect remote logical ports by tunnels,
> >        while bridged logical switches provide connectivity to remote ports by
> > -      bridging the packets to directly connected physical L2 segment with the
> > +      bridging the packets to directly connected physical L2 segments with the
> >        help of <code>localnet</code> ports.  Each bridged logical switch has
> > -      one and only one <code>localnet</code> port, which has only one special
> > +      one or more <code>localnet</code> ports, which have only one special
> >        address <code>unknown</code>.
> >      </p>
> >
> > @@ -527,10 +527,13 @@
> >
> >            <dt><code>localnet</code></dt>
> >            <dd>
> > -            A connection to a locally accessible network from each
> > -            <code>ovn-controller</code> instance.  A logical switch can only
> > -            have a single <code>localnet</code> port attached.  This is used
> > -            to model direct connectivity to an existing network.
> > +            A connection to a locally accessible network from
> > +            <code>ovn-controller</code> instances that have corresponding
> > +            bridge mapping.  A logical switch can have multiple
> > +            <code>localnet</code> ports attached, as long as each
> > +            <code>ovn-controller</code> is plugged to a single local network
> > +            only.  In this case, each hypervisor implements part of switch
> > +            external network connectivity.
>
> Today, there is no restriction for one ovn-controller/chassis to be plugged to multiple local networks, if each local network maps to a separate logical switch. The typical use case is VIFs on different VLAN networks residing on the same chassis. The constraint added here "each ovn-controller is plugged to a single local network only" seems confusing.
>

This is unfortunate wording that I should fix. Obviously the patch
doesn't force a chassis to be plugged to a single segment. What it
implies is that if there are multiple localnet ports per switch, only
one of them is physically wired to the fabric on any given chassis. It
should not affect regular scenarios where LSs have a single localnet
port per switch.
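To make the intended setup concrete, here is a minimal sketch of such a
configuration (hypothetical names, mirroring the test cases added by this
patch): one switch carries a localnet port per segment, and each chassis
maps only the segment it is physically attached to.

  # One logical switch, one localnet port per VLAN segment.
  ovn-nbctl ls-add ls0
  for tag in 10 20; do
      ovn-nbctl lsp-add ls0 ln-$tag "" $tag
      ovn-nbctl lsp-set-type ln-$tag localnet
      ovn-nbctl lsp-set-addresses ln-$tag unknown
      ovn-nbctl lsp-set-options ln-$tag network_name=phys-$tag
  done

  # On hypervisor hv1, physically attached to segment 10 only:
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-10:br-phys

  # On hypervisor hv2, physically attached to segment 20 only:
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-20:br-phys

With that, ln-10 is the only plugged localnet port of ls0 on hv1, and ln-20
is the only one on hv2, which is exactly the assumption described above.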

> >            </dd>
> >
> >            <dt><code>localport</code></dt>
> > @@ -721,7 +724,7 @@
> >            Required.  The name of the network to which the <code>localnet</code>
> >            port is connected.  Each hypervisor, via <code>ovn-controller</code>,
> >            uses its local configuration to determine exactly how to connect to
> > -          this locally accessible network.
> > +          this locally accessible network, if at all.
> >          </column>
> >        </group>
> >
> > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > index 3aa7cd4da..7e05aa0e1 100644
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -2626,10 +2626,13 @@ tcp.flags = RST;
> >
> >            <dt><code>localnet</code></dt>
> >            <dd>
> > -            A connection to a locally accessible network from each
> > -            <code>ovn-controller</code> instance.  A logical switch can only
> > -            have a single <code>localnet</code> port attached.  This is used
> > -            to model direct connectivity to an existing network.
> > +            A connection to a locally accessible network from some or all
> > +            <code>ovn-controller</code> instances.  This is used
> > +            to model direct connectivity to existing networks.  A logical
> > +            switch can have multiple <code>localnet</code> ports attached, as
> > +            long as each <code>ovn-controller</code> is plugged to a single
> > +            local network only.  In this case, each hypervisor implements part
> > +            of switch external network connectivity.
> >            </dd>
> >
> >            <dt><code>localport</code></dt>
> > @@ -2774,10 +2777,12 @@ tcp.flags = RST;
> >          <p>
> >            When a logical switch has a <code>localnet</code> port attached,
> >            every chassis that may have a local vif attached to that logical
> > -          switch must have a bridge mapping configured to reach that
> > -          <code>localnet</code>.  Traffic that arrives on a
> > -          <code>localnet</code> port is never forwarded over a tunnel to
> > -          another chassis.
> > +          switch that needs this external connectivity must have a bridge
> > +          mapping configured to reach that <code>localnet</code>.  If the
> > +          mapping is missing, the vif won't be plugged to this network.  It may
> > +          still reach the other network if routing is implemented by fabric.
> > +          Traffic that arrives on a <code>localnet</code> port is never
> > +          forwarded over a tunnel to another chassis.
> >          </p>
> >        </column>
> >
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index e6febd4c2..173b8d2eb 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -2475,6 +2475,243 @@ OVN_CLEANUP([hv1],[hv2])
> >
> >  AT_CLEANUP
> >
> > +AT_SETUP([ovn -- 2 HVs, 2 LS, switching between multiple localnet ports with same tags])
> > +ovn_start
> > +
> > +# In this test case we create two switches with multiple localnet ports. Only a
> > +# single localnet of the same tag is connected to fabric for each switch. Two
> > +# hypervisors have VIFs that belong to these switches. The test validates that
> > +# routing between these switches and hypervisors still works regardless of the
> > +# number of (unplugged) localnet ports.
> > +
> > +# two switches, each connected to lots of networks
> > +for i in 1 2; do
> > +    ovn-nbctl ls-add ls-$i
> > +    for tag in `seq 10 20`; do
> > +        ln_port_name=ln-$i-$tag
> > +        ovn-nbctl lsp-add ls-$i $ln_port_name "" $tag
> > +        ovn-nbctl lsp-set-addresses $ln_port_name unknown
> > +        ovn-nbctl lsp-set-type $ln_port_name localnet
> > +        ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> > +    done
> > +done
> > +
> > +# multiple networks
> > +for tag in `seq 10 20`; do
> > +    net_add n-$tag
> > +done
> > +
> > +# two hypervisors, each connected to the same network
> > +for i in 1 2; do
> > +    sim_add hv-$i
> > +    as hv-$i
> > +    ovs-vsctl add-br br-phys
> > +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> > +    ovn_attach n-10 br-phys 192.168.0.$i
> > +done
> > +
> > +# two vif ports, one per switch
> > +for i in 1 2; do
> > +    as hv-$i
> > +    ovs-vsctl add-port br-int vif-$i -- \
> > +        set Interface vif-$i external-ids:iface-id=lp-$i \
> > +                              options:tx_pcap=hv-$i/vif-$i-tx.pcap \
> > +                              options:rxq_pcap=hv-$i/vif-$i-rx.pcap \
> > +                              ofport-request=$i
> > +
> > +    lsp_name=lp-$i
> > +    ovn-nbctl lsp-add ls-$i $lsp_name
> > +    ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:00:0$i
> > +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:0$i
> > +
> > +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> > +done
> > +
> > +ovn-nbctl --wait=sb sync
> > +ovn-nbctl show
> > +ovn-sbctl dump-flows
> > +
> > +# vif ports
> > +for i in 1 2; do
> > +    : > vif-$i.expected
> > +done
> > +
> > +# localnet ports
> > +for i in 1 2; do
> > +    for tag in `seq 10 20`; do
> > +        : > $i-$tag.expected
> > +    done
> > +done
> > +
> > +test_packet() {
> > +    local inport=$1 outport=$2 dst=$3 src=$4 eth=$5 eout=$6 lout=$7
> > +
> > +    # Expect the packet cloned to all localnet ports
> > +    : > expout
> > +    for tag in `seq 10 20`; do
> > +        echo "output(\"ln-$inport-$tag\");" >> expout
> > +    done
> > +
> > +    # First try tracing the packet.
> > +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
> > +    AT_CAPTURE_FILE([trace])
> > +    AT_CHECK([ovn-trace --all ls-$inport "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
> > +
> > +    # Then actually send a packet, for an end-to-end test.
> > +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> > +    as hv-$1 ovs-appctl netdev-dummy/receive vif-$inport $packet
> > +
> > +    # Expect the packet received by the peer VIF port
> > +    echo $packet >> vif-$outport.expected
> > +
> > +    # Expect the packet to transfer through the common fabric network
> > +    local packet=$(echo $dst$src | sed 's/://g')81000014${eth}
> > +    echo $packet >> $1-10.expected
> > +}
> > +
> > +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1001 ln-1-10 ln-1-10
> > +test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1002 ln-1-10 ln-1-10
> > +
> > +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1003 ln-2-10 ln-2-10
> > +test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1004 ln-2-10 ln-2-10
> > +
> > +# Dump a bunch of info helpful for debugging if there's a failure.
> > +
> > +echo "------ OVN dump ------"
> > +ovn-nbctl show
> > +ovn-sbctl show
> > +
> > +for i in 1 2; do
> > +    hv=hv-$i
> > +    echo "------ $hv dump ------"
> > +    as $hv ovs-vsctl show
> > +    as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> > +done
> > +
> > +# Now check the packets actually received against the ones expected.
> > +for i in 1 2; do
> > +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/vif-$i-tx.pcap], [vif-$i.expected])
> > +    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/br-phys_n-10-tx.pcap], [$i-10.expected])
> > +done
> > +
> > +OVN_CLEANUP([hv-1],[hv-2])
> > +
> > +AT_CLEANUP
> > +
> > +AT_SETUP([ovn -- 2 HVs, 1 LS, no switching between multiple localnet ports with different tags])
> > +ovn_start
> > +
> > +# In this test case we create a single switch connected to two physical
> > +# networks via two localnet ports. Then we create two hypervisors, with 2
> > +# ports on each. The test validates no interconnectivity between VIF ports
> > +# located on chassis plugged to different physical networks.
> > +
> > > +# create the single switch with two localnet ports
> > +ovn-nbctl ls-add ls1
> > +for tag in 10 20; do
> > +    ln_port_name=ln-$tag
> > +    ovn-nbctl lsp-add ls1 $ln_port_name "" $tag
> > +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> > +    ovn-nbctl lsp-set-type $ln_port_name localnet
> > +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
> > +done
> > +
> > +# create fabric networks
> > +for tag in 10 20; do
> > +    net_add n-$tag
> > +done
> > +
> > +# create four chassis, each connected to one network, each with a single VIF port
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        sim_add hv-$tag-$i
> > +        as hv-$tag-$i
> > +        ovs-vsctl add-br br-phys
> > +        ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
> > +        ovn_attach n-$tag br-phys 192.168.$i.$tag
> > +
> > +        ovs-vsctl add-port br-int vif-$tag-$i -- \
> > +            set Interface vif-$tag-$i external-ids:iface-id=lp-$tag-$i \
> > +                                  options:tx_pcap=hv-$tag-$i/vif-$tag-$i-tx.pcap \
> > +                                  options:rxq_pcap=hv-$tag-$i/vif-$tag-$i-rx.pcap \
> > +                                  ofport-request=$tag$i
> > +
> > +        lsp_name=lp-$tag-$i
> > +        ovn-nbctl lsp-add ls1 $lsp_name
> > +        ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:0$i:$tag
> > +        ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:0$i:$tag
> > +
> > +        OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> > +    done
> > +done
> > +ovn-nbctl --wait=sb sync
> > +ovn-sbctl dump-flows
> > +
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        : > $tag-$i.expected
> > +    done
> > +done
> > +
> > +vif_to_hv() {
> > +    echo hv-$1
> > +}
> > +
> > +test_packet() {
> > +    local inport=$1 dst=$2 src=$3 eth=$4 eout=$5 lout=$6
> > +
> > +    # First try tracing the packet.
> > +    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
> > +    echo "output(\"$lout\");" > expout
> > +    AT_CAPTURE_FILE([trace])
> > +    AT_CHECK([ovn-trace --all ls1 "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
> > +
> > +    # Then actually send a packet, for an end-to-end test.
> > +    local packet=$(echo $dst$src | sed 's/://g')${eth}
> > +    hv=`vif_to_hv $inport`
> > +    vif=vif-$inport
> > +    as $hv ovs-appctl netdev-dummy/receive $vif $packet
> > +    if test $eth = 1002 -o $eth = 2002; then
> > +        echo $packet >> ${eout#lp-}.expected
> > +    fi
> > +}
> > +
> > +# different fabric networks -> should fail
> > +test_packet 10-1 f0:00:00:00:01:20 f0:00:00:00:01:10 1001 lp-20-1 lp-20-1
> > +test_packet 20-1 f0:00:00:00:01:10 f0:00:00:00:01:20 2001 lp-10-1 lp-10-1
> > +
> > +# same fabric networks -> should pass
> > +test_packet 10-1 f0:00:00:00:02:10 f0:00:00:00:01:10 1002 lp-10-2 lp-10-2
> > +test_packet 20-1 f0:00:00:00:02:20 f0:00:00:00:01:20 2002 lp-20-2 lp-20-2
> > +test_packet 10-2 f0:00:00:00:01:10 f0:00:00:00:02:10 1002 lp-10-1 lp-10-1
> > +test_packet 20-2 f0:00:00:00:01:20 f0:00:00:00:02:20 2002 lp-20-1 lp-20-1
> > +
> > +# Dump a bunch of info helpful for debugging if there's a failure.
> > +echo "------ OVN dump ------"
> > +ovn-nbctl show
> > +ovn-sbctl show
> > +
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        hv=hv-$tag-$i
> > +        echo "------ $hv dump ------"
> > +        as $hv ovs-vsctl show
> > +        as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
> > +    done
> > +done
> > +
> > +# Now check the packets actually received against the ones expected.
> > +for tag in 10 20; do
> > +    for i in 1 2; do
> > +        echo "hv = $tag-$i"
> > +        OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$tag-$i/vif-$tag-$i-tx.pcap], [$tag-$i.expected])
> > +    done
> > +done
> > +
> > +OVN_CLEANUP([hv-10-1],[hv-10-2],[hv-20-1],[hv-20-2])
> > +
> > +AT_CLEANUP
> > +
> >  AT_SETUP([ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS])
> >  AT_KEYWORDS([vtep])
> >  ovn_start
> > --
> > 2.26.2
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Han Zhou May 5, 2020, 6:31 a.m. UTC | #4
On Mon, May 4, 2020 at 9:52 AM Ihar Hrachyshka <ihrachys@redhat.com> wrote:
>
> On Fri, May 1, 2020 at 3:28 AM Han Zhou <hzhou@ovn.org> wrote:
> >
> >
> >
> > On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <ihrachys@redhat.com>
wrote:
> > >
> > > Assuming only a single localnet port is actually plugged mapped on
> > > each chassis, this allows to maintain disjoint networks plugged to the
> > > same switch.  This is useful to simplify resource management for
> > > OpenStack "routed provider networks" feature [1] where a single
> > > "network" (which traditionally maps to logical switches in OVN) is
> > > comprised of multiple L2 segments and assumes external L3 routing
> > > implemented between the segments.
> > >
> > > TODO: consider E-W routing between localnet vlan tagged LSs
> > >       (ovn-chassis-mac-mappings).
> > >
> > > [1]:
https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
> >
> > After reading the context and related discussions, I am a little
concerned about this approach. It changes the concept of a logical switch
in OVN, which was a L2 concept. It may be ok to change the concept if all
the aspects are understood and properly documented and implemented.
However, at least it is not clear to me how would it work on the physical
pipeline when VIF1 on chassisA sends packet to VIF2 on chassis B through
physical network when the two VIFs are on same LS that has multiple
localnet ports. Today only one localnet port is allowed, so it is handled
by sending the packet through *the* localnet port (without localnet port it
goes through tunnel). With this patch it seems would choose the last
localnet port, regardless of the physical location (L2 segment) of each
VIF. I would expect the data structure change in "local_datapath" to
maintain a list of localnet ports instead of only one, and the logic in
physical.c to pick up the desired localnet port, according to the L2
segment it physically belongs to. (it shouldn't be a mapping between
chassis and physical network/vlan, because it is natural to have multiple
VLANs connected to a single chassis and have VIFs on different VLANs).
>
> The assumption in this patch is, while the north database may have
> multiple localnet ports per switch, only one segment can actually be
> plugged per chassis, which reduces the problem of multiple localnet
> ports and selecting the right one to the current situation where
> there's a single localnet port per switch for each chassis.

OK. I think I didn't get the real meaning of this assumption earlier. It is
reasonable to have this assumption, just like the earlier assumption that
we only allowed one localnet port for an L2 LS. Now that the LS is actually
modeling an L3 fabric connected with multiple L2 segments, it is reasonable
to assume that:
1. Only one localnet port should exist for each L2 segment. (So if there
are N localnet ports in one LS, we assume they represent N L2 segments
connected by an L3 fabric.)
2. Each chassis should plug into only one L2 segment of each such L3
LS. (I think this needs to be validated in binding.c to at least warn
about such a misconfiguration.)

It seems the implementation supports these assumptions well, but the L3 LS
concept (I am not sure what a better word would be, since an LS appears to
be an L2 concept), its use cases, and the above assumptions/constraints may
need to be documented more clearly.

In addition, it is better to make it clear that a chassis can still be
plugged into multiple localnet ports as long as they belong to different L3
LSes. This is a typical routed network use case with VRFs. It should still
work with this patch, but I'd suggest adding a test case to cover such a
scenario to make sure; a rough sketch follows.

> (The
> assumption is implicit right now but we can also enforce it if
> needed.) Perhaps in the future we should consider expanding the
> feature to allow for multiple localnet ports plugged to fabric per
> chassis, but that's not the goal here.
>
For different L3 LSes, I think it is already supported (and should be). For
multiple localnet ports plugged for the same L3 LS, I am not sure whether
that will be reasonable in the future, but at least I totally agree that it
is not a goal for now.

> >
> > Furthermore, if we change the concept of LS being a single L2 domain,
we may need to consider more changes in the L2 logical pipelines, such as:
> > - L2 lookup stage should only look up destination for the ports that
are in the same segment.
> > - BUM traffic should only go to the localnet port that belong to the
same L2 segment of the LS.
> >
I thought about the BUM (broadcast, unknown unicast, multicast) traffic
again. It seems the current physical pipeline handling for localnet would
also handle it well in the case of multiple localnet ports in the same LS.
(I implemented that logic initially and never thought about this use case,
and I am surprised that it seems it would just work here, given the earlier
assumptions.) I'd suggest at least adding one or two test cases to ensure
it works and to avoid future regressions, since there are implications in
making this work and I am afraid further changes in the future could break
it. The test cases can be (a rough sketch follows the list):
- For "unknown MAC", it is sent out through the physical bridge, once and
only once.
- For broadcast traffic (e.g. ARP), it is sent to all ports that are
physically located on the same L2 segment (some on the same chassis, some
on other chassis but same L2), once and only once.
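A rough sketch of such a check, in the style of the tests added by this
patch (hypothetical names; the packet bytes are illustrative and not
verified against an actual run):

  # Broadcast from a VIF on VLAN segment 10 of the routed switch.
  bcast=ffffffffffff$(echo f0:00:00:00:01:10 | sed 's/://g')1005
  as hv-10-1 ovs-appctl netdev-dummy/receive vif-10-1 $bcast

  # Expect exactly one VLAN-10 tagged copy on the provider network of
  # the sending chassis; duplicates would make the comparison fail.
  # (Other broadcast traffic on the wire may need filtering out.)
  echo ffffffffffff$(echo f0:00:00:00:01:10 | sed 's/://g')8100000a1005 \
      > bcast.expected
  OVN_CHECK_PACKETS([hv-10-1/br-phys_n-10-tx.pcap], [bcast.expected])

  # VIFs on the other segment (VLAN 20) should not see the broadcast at
  # all, i.e. their expected files stay empty.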

> > All this are suggesting that we need to introduce a *segment*
abstraction in OVN, and multiple segments belong to same LS. I wonder if it
is even more complex to implement this in OVN than in Neutron plugin. Maybe
I missed something here and made it more complex than necessary. Please
correct me if I am wrong :)
>
> Indeed, if OVN models the OpenStack multi-segment network concept
> natively, 1) it would be easier for OpenStack and 2) it would just
> shift complexity from OpenStack to OVN. (I don't think it would be
> MORE complex than implementing it in the neutron plugin, but about the
> same.) This patch series is an attempt to facilitate neutron
> implementation of routed provider networks (multi-segment networks
> with L3-adjacent L2-domains) with as little blood on both sides.
>
Initially I was thinking about full L2 segment modeling within the LS. In
theory it would be the same level of complexity as implementing it in the
Neutron plugin, but C would usually take more effort than Python for such
things. However, I realized that the modeling change may not be necessary
in OVN to support the routed network use case, given that the
above-mentioned assumptions are strictly followed. Each localnet port is
the model of an L2 segment - we just can't tell which port belongs to which
L2 segment from the NB objects. The mapping is only reflected by the port
bindings and chassis configuration. So L2 becomes part of the physical
configuration, and the NB only describes the L3 model of the routed
network. With this design, there are still things that would look confusing
because of the modeling missing in the NB, such as:
- the logical flow generation blindly generate L2 lookup flows, ARP
response flows, etc. for all ports on the LS
- the MC_Groups (MC_Flood, MC_Unknown, which are L2 concept) generated in
SB contains all ports in the L3 network
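
In other words, the only way to tell which segment a given chassis is
plugged into today is to correlate the SB localnet port bindings with the
chassis' bridge mappings, e.g.:

    # which physical network each localnet port of the LS models
    ovn-sbctl find port_binding type=localnet
    # which of those networks this particular chassis actually plugs into
    ovs-vsctl get open . external-ids:ovn-bridge-mappings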

In most cases the tricks we did with the localnet port in the physical
pipeline would handle it well. One weird scenario I could think of is that
when VIF1 sends an ARP request for the IP of VIF2, which is on a different
chassis on a different L2 segment, VIF1 would still get the ARP resolved,
while this should never happen in a physical network. However, I don't
think this creates any real problem. I hope this would not result in any
unexpected limitations because of the "simplified" modeling in NB for the
routed network.
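
To make the quirk concrete, a trace like the following (made-up names and
addresses, assuming lp2 carries "f0:00:00:00:00:02 10.0.0.2" in its
addresses) would show the ARP request answered by the logical ARP
responder even though lp2 sits on another segment:

    ovn-trace ls1 'inport == "lp1" && eth.src == f0:00:00:00:00:01 &&
                   eth.dst == ff:ff:ff:ff:ff:ff && arp.op == 1 &&
                   arp.spa == 10.0.0.1 && arp.tpa == 10.0.0.2'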

To summarize, I agree with the solution of this patch, with the below
comments:
- Document the concept change of LS (aka L3 LS), the use case, the
assumptions and constraints more clearly, in the localnet description in
ovn-architecture.xml.
- Validate the assumption and print a warning log if it is not met.
- Add test cases to cover:
  a) The VRF use case of routed networks, i.e. M logical networks (LSes),
each with N segments (so in total there are M x N localnet ports); the M
localnet ports for the M LSes can still work on the same chassis.
  b) For an "unknown MAC", the packet is sent out through the physical
bridge, once and only once.
  c) For broadcast traffic (e.g. ARP), the packet is sent to all ports that
are physically located on the same L2 segment (some on the same chassis,
some on other chassis but the same L2), once and only once.

Thanks,
Han
Ihar Hrachyshka May 8, 2020, 4:23 p.m. UTC | #5
Hi Han,

Thanks a lot for your comments. I've just sent v3 of the patch that
should cover the issues you mentioned. More comments inline below.

Thanks!
Ihar

On 5/5/20 2:31 AM, Han Zhou wrote:
>
>
> On Mon, May 4, 2020 at 9:52 AM Ihar Hrachyshka <ihrachys@redhat.com 
> <mailto:ihrachys@redhat.com>> wrote:
> >
> > On Fri, May 1, 2020 at 3:28 AM Han Zhou <hzhou@ovn.org 
> <mailto:hzhou@ovn.org>> wrote:
> > >
> > >
> > >
> > > On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka 
> <ihrachys@redhat.com <mailto:ihrachys@redhat.com>> wrote:
> > > >
> > > > Assuming only a single localnet port is actually plugged mapped on
> > > > each chassis, this allows to maintain disjoint networks plugged 
> to the
> > > > same switch.  This is useful to simplify resource management for
> > > > OpenStack "routed provider networks" feature [1] where a single
> > > > "network" (which traditionally maps to logical switches in OVN) is
> > > > comprised of multiple L2 segments and assumes external L3 routing
> > > > implemented between the segments.
> > > >
> > > > TODO: consider E-W routing between localnet vlan tagged LSs
> > > >       (ovn-chassis-mac-mappings).
> > > >
> > > > [1]: 
> https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
> > >
> > > After reading the context and related discussions, I am a little 
> concerned about this approach. It changes the concept of a logical 
> switch in OVN, which was a L2 concept. It may be ok to change the 
> concept if all the aspects are understood and properly documented and 
> implemented. However, at least it is not clear to me how would it work 
> on the physical pipeline when VIF1 on chassisA sends packet to VIF2 on 
> chassis B through physical network when the two VIFs are on same LS 
> that has multiple localnet ports. Today only one localnet port is 
> allowed, so it is handled by sending the packet through *the* localnet 
> port (without localnet port it goes through tunnel). With this patch 
> it seems would choose the last localnet port, regardless of the 
> physical location (L2 segment) of each VIF. I would expect the data 
> structure change in "local_datapath" to maintain a list of localnet 
> ports instead of only one, and the logic in physical.c to pick up the 
> desired localnet port, according to the L2 segment it physically 
> belongs to. (it shouldn't be a mapping between chassis and physical 
> network/vlan, because it is natural to have multiple VLANs connected 
> to a single chassis and have VIFs on different VLANs).
> >
> > The assumption in this patch is, while the north database may have
> > multiple localnet ports per switch, only one segment can actually be
> > plugged per chassis. Which reduces the problem of multiple localnet
> > ports and selecting the right one to the current situation where
> > there's a single localnet port per switch for each chassis.
>
> OK. I think I didn't get the real meaning of this assumption earlier. 
> It is reasonable to have this assumption, just like the earlier 
> assumption that we only allows one localnet port for a L2 LS. Now that 
> the LS is actually modeling a L3 fabric connected with multiple L2 
> segments, it is reasonable to assume that:
> 1. Only one localnet port should exist for each L2 segment. (so if 
> there are N localnet ports in one LS, we assume they represents N L2 
> segments connected by L3 fabric)
> 2. For each chassis, it should plug to only one L2 segment for each 
> such L3 LS. (I think this needs to be validated in the binding.c to at 
> least warn such misconfiguration.)


It's already validated, and we log a warning message on finding a second
localnet port that has a bridge mapping for it. We don't fail to start
the controller or stop flow configuration; I don't think a hard failure is
needed, but please clarify if you think it is.
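
For reference, the case we warn about is a chassis whose bridge mappings
cover more than one localnet network of the same switch, e.g. with made-up
names (ls1 having ln-a on phys-a and ln-b on phys-b):

    ovs-vsctl set open . \
        external-ids:ovn-bridge-mappings=phys-a:br-phys-a,phys-b:br-phys-b
    # in the sandbox tests the warning lands in the chassis log; the exact
    # message text is omitted here
    grep localnet hv1/ovn-controller.log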


>
> It seems the implementation supported these assumptions well, but it 
> may be documented more clearly for the L3 LS concept (I am not sure 
> what can be a better word, since LS appears to be L2 concept) and use 
> cases, and the above assumptions/constraints as well.
>
> In addition, it is better to make it clear that a chassis can still be 
> plugged with multiple localnet ports for different L3 LSes. This is a 
> typical scenario of routed network use case with VRFs. It should still 
> work with this patch, but I'd suggest to add a test case to cover such 
> scenario to make sure.


I updated the documentation with - hopefully - better wording, clarifying
how this is supposed to be used in real life.


>
> > (The
> > assumption is implicit right now but we can also enforce it if
> > needed.) Perhaps in the future we should consider expanding the
> > feature to allow for multiple localnet ports plugged to fabric per
> > chassis, but that's not the goal here.
> >
> For different L3 LSes, I think it is supported (and should be) 
> already. For multiple localnet port for same L3 LS, I am not sure if 
> it is reasonable in the future, but at least I totally agree that it 
> is not a goal for now.
>
> > >
> > > Furthermore, if we change the concept of LS being a single L2 
> domain, we may need to consider more changes in the L2 logical 
> pipelines, such as:
> > > - L2 lookup stage should only look up destination for the ports 
> that are in the same segment.
> > > - BUM traffic should only go to the localnet port that belong to 
> the same L2 segment of the LS.
> > >
> I thought about the BUM (broadcast, unknown, multicast) traffic again. 
> It seems current physical pipeline handling for localnet would also
> handle that well in the case of multiple localnet ports in same LS. (I
> implemented that logic initially, and never thought about this use
> case, and I am surprised that it seems would just work for this case, 
> with the earlier assumptions). I'd suggest to at least add one or two 
> test cases to ensure it works and avoid any future regressions since 
> there are implications to make this work and I am afraid if there are 
> any further changes in the future could break it. The test cases can be:
> - For "unknown MAC", it is sent out through the physical bridge, once 
> and only once.
> - For broadcast traffic (e.g. ARP), it is sent to all ports that are 
> physically located on the same L2 segment (some on the same chassis, 
> some on other chassis but same L2), once and only once.
>
> > > All this are suggesting that we need to introduce a *segment* 
> abstraction in OVN, and multiple segments belong to same LS. I wonder 
> if it is even more complex to implement this in OVN than in Neutron 
> plugin. Maybe I missed something here and made it more complex than 
> necessary. Please correct me if I am wrong :)
> >
> > Indeed, if OVN models the OpenStack multi-segment network concept
> > natively, 1) it would be easier for OpenStack and 2) it would just
> > shift complexity from OpenStack to OVN. (I don't think it would be
> > MORE complex than implementing it in the neutron plugin, but about the
> > same.) This patch series is an attempt to facilitate neutron
> > implementation of routed provider networks (multi-segment networks
> > with L3-adjacent L2-domains) with as little blood on both sides.
> >
> Initially I was thinking about the full L2 segment modeling within LS. 
> In theory it would be same level of complexity as implementing in 
> Neutron plugin, but C would usually take more effort than python for 
> such things.
> However, I realized that the modeling change may be not necessary in 
> OVN to support the routed network use case, given that the above 
> mentioned assumptions are strictly followed. Each localnet port is the
> modeling of a L2 segment - we just can't tell which port belong to 
> which L2 segment from NB objects. The mapping is only reflected by the 
> port-bindings and chassis configuration. So L2 becomes physical part, 
> and NB only tells L3 model for the routed network. With this design, 
> there are still things that would looks confusing because of the 
> modeling missing in NB, such as:
> - the logical flow generation blindly generate L2 lookup flows, ARP 
> response flows, etc. for all ports on the LS
> - the MC_Groups (MC_Flood, MC_Unknown, which are L2 concept) generated 
> in SB contains all ports in the L3 network
>
> In most cases the tricks we did with localnet port in physical 
> pipeline would handle it well. Maybe one weird scenario I could think 
> of is that when a VIF1 sends ARP request for IP of VIF2 which is on a 
> different chassis on a different L2 segment, VIF1 would get the ARP 
> resolved, while this shall never happen in physical network. However, 
> I don't think this should create any real problem. I hope this would 
> not result in any unexpected limitations because of the "simplified" 
> modeling in NB for routed network.
>
> To summarize, I agree with the solution of this patch, with the below 
> comments:
> - Document the concept change of LS (aka L3 LS), the use case, the 
> assumptions and constraints more clearly, in the localnet description 
> in ovn-architecture.xml.
> - Validate the assumption and print warning logs if it is not met.
> - Add test cases to cover:
>   a) The VRF use case of routed network. I.e. M logical networks 
> (LSes), each has N segments (so in total there are M x N localnet 
> ports), M localnet ports for the M LSes can still work on same chassis.
>   b) For "unknown MAC", it is sent out through the physical bridge, 
> once and only once.
>   c) For broadcast traffic (e.g. ARP), it is sent to all ports that 
> are physically located on the same L2 segment (some on the same 
> chassis, some on other chassis but same L2), once and only once.


I've updated the test suite with more test scenarios to - hopefully - 
cover your list where it was not already covered.


>
> Thanks,
> Han
>

Patch

diff --git a/controller/binding.c b/controller/binding.c
index 20a89d07d..c88c4ece8 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -692,12 +692,28 @@  add_localnet_egress_interface_mappings(
     }
 }
 
+static bool
+is_network_plugged(const struct sbrec_port_binding *binding_rec,
+                   struct shash *bridge_mappings)
+{
+    const char *network = smap_get(&binding_rec->options, "network_name");
+    if (!network) {
+        return false;
+    }
+    return shash_find_data(bridge_mappings, network);
+}
+
 static void
 consider_localnet_port(const struct sbrec_port_binding *binding_rec,
                        struct shash *bridge_mappings,
                        struct sset *egress_ifaces,
                        struct hmap *local_datapaths)
 {
+    /* Ignore localnet ports for unplugged networks. */
+    if (!is_network_plugged(binding_rec, bridge_mappings)) {
+        return;
+    }
+
     add_localnet_egress_interface_mappings(binding_rec,
             bridge_mappings, egress_ifaces);
 
diff --git a/controller/patch.c b/controller/patch.c
index 349faae17..52255cc3a 100644
--- a/controller/patch.c
+++ b/controller/patch.c
@@ -198,9 +198,9 @@  add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
             continue;
         }
 
-        const char *patch_port_id;
+        bool is_localnet = false;
         if (!strcmp(binding->type, "localnet")) {
-            patch_port_id = "ovn-localnet-port";
+            is_localnet = true;
         } else if (!strcmp(binding->type, "l2gateway")) {
             if (!binding->chassis
                 || strcmp(chassis->name, binding->chassis->name)) {
@@ -208,7 +208,6 @@  add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
                  * so we should not create any patch ports for it. */
                 continue;
             }
-            patch_port_id = "ovn-l2gateway-port";
         } else {
             /* not a localnet or L2 gateway port. */
             continue;
@@ -224,12 +223,25 @@  add_bridge_mappings(struct ovsdb_idl_txn *ovs_idl_txn,
         struct ovsrec_bridge *br_ln = shash_find_data(&bridge_mappings, network);
         if (!br_ln) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-            VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
-                    "with network name '%s'",
-                    binding->type, binding->logical_port, network);
+            if (!is_localnet) {
+                VLOG_ERR_RL(&rl, "bridge not found for %s port '%s' "
+                        "with network name '%s'",
+                        binding->type, binding->logical_port, network);
+            } else {
+                VLOG_INFO_RL(&rl, "bridge not found for localnet port '%s' "
+                        "with network name '%s'; skipping",
+                        binding->logical_port, network);
+            }
             continue;
         }
 
+        const char *patch_port_id;
+        if (is_localnet) {
+            patch_port_id = "ovn-localnet-port";
+        } else {
+            patch_port_id = "ovn-l2gateway-port";
+        }
+
         char *name1 = patch_port_name(br_int->name, binding->logical_port);
         char *name2 = patch_port_name(binding->logical_port, br_int->name);
         create_patch_port(ovs_idl_txn, patch_port_id, binding->logical_port,
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 048c28ca6..91ab4449e 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -543,7 +543,9 @@  struct ovn_datapath {
     /* The "derived" OVN port representing the instance of l3dgw_port on
      * the "redirect-chassis". */
     struct ovn_port *l3redirect_port;
-    struct ovn_port *localnet_port;
+
+    struct ovn_port **localnet_ports;
+    size_t n_localnet_ports;
 
     struct ovs_list lr_list; /* In list of logical router datapaths. */
     /* The logical router group to which this datapath belongs.
@@ -611,6 +613,7 @@  ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
         ovn_destroy_tnlids(&od->port_tnlids);
         bitmap_free(od->ipam_info.allocated_ipv4s);
         free(od->router_ports);
+        free(od->localnet_ports);
         ovn_ls_port_group_destroy(&od->nb_pgs);
         destroy_mcast_info_for_datapath(od);
 
@@ -2019,6 +2022,7 @@  join_logical_ports(struct northd_context *ctx,
     struct ovn_datapath *od;
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (od->nbs) {
+            size_t allocated_localnet_ports = 0;
             for (size_t i = 0; i < od->nbs->n_ports; i++) {
                 const struct nbrec_logical_switch_port *nbsp
                     = od->nbs->ports[i];
@@ -2053,7 +2057,12 @@  join_logical_ports(struct northd_context *ctx,
                 }
 
                 if (!strcmp(nbsp->type, "localnet")) {
-                   od->localnet_port = op;
+                   if (od->n_localnet_ports >= allocated_localnet_ports) {
+                       od->localnet_ports = x2nrealloc(
+                           od->localnet_ports, &allocated_localnet_ports,
+                           sizeof *od->localnet_ports);
+                   }
+                   od->localnet_ports[od->n_localnet_ports++] = op;
                 }
 
                 op->lsp_addrs
@@ -3012,7 +3021,7 @@  ovn_port_update_sbrec(struct northd_context *ctx,
                               "reside-on-redirect-chassis", false) ||
                 op->peer == op->peer->od->l3dgw_port)) {
                 add_router_port_garp = true;
-            } else if (chassis && op->od->localnet_port) {
+            } else if (chassis && op->od->n_localnet_ports) {
                 add_router_port_garp = true;
             }
 
@@ -4705,9 +4714,10 @@  build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
             struct ovn_port *op = od->router_ports[i];
             build_pre_acl_flows_for_nbsp(od, lflows, op->nbsp, op->json_key);
         }
-        if (od->localnet_port) {
-            build_pre_acl_flows_for_nbsp(od, lflows, od->localnet_port->nbsp,
-                                         od->localnet_port->json_key);
+        for (size_t i = 0; i < od->n_localnet_ports; i++) {
+            build_pre_acl_flows_for_nbsp(od, lflows,
+                                         od->localnet_ports[i]->nbsp,
+                                         od->localnet_ports[i]->json_key);
         }
 
         /* Ingress and Egress Pre-ACL Table (Priority 110).
@@ -5975,9 +5985,9 @@  build_lswitch_rport_arp_req_flow_for_ip(struct sset *ips,
     /* Send a the packet only to the router pipeline and skip flooding it
      * in the broadcast domain (except for the localnet port).
      */
-    if (od->localnet_port) {
+    for (size_t i = 0; i < od->n_localnet_ports; i++) {
         ds_put_format(&actions, "clone { outport = %s; output; }; ",
-                      od->localnet_port->json_key);
+                      od->localnet_ports[i]->json_key);
     }
     ds_put_format(&actions, "outport = %s; output;", patch_op->json_key);
     ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP, priority,
@@ -6570,25 +6580,29 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
         }
 
         bool is_external = lsp_is_external(op->nbsp);
-        if (is_external && (!op->od->localnet_port ||
+        if (is_external && (!op->od->n_localnet_ports ||
                             !op->nbsp->ha_chassis_group)) {
-            /* If it's an external port and there is no localnet port
+            /* If it's an external port and there are no localnet ports
              * and if it doesn't belong to an HA chassis group ignore it. */
             continue;
         }
 
         for (size_t i = 0; i < op->n_lsp_addrs; i++) {
-            const char *json_key;
             if (is_external) {
-                json_key = op->od->localnet_port->json_key;
+                for (size_t j = 0; j < op->od->n_localnet_ports; j++) {
+                    build_dhcpv4_options_flows(
+                        op, lflows, &op->lsp_addrs[i],
+                        op->od->localnet_ports[j]->json_key, is_external);
+                    build_dhcpv6_options_flows(
+                        op, lflows, &op->lsp_addrs[i],
+                        op->od->localnet_ports[j]->json_key, is_external);
+                }
             } else {
-                json_key = op->json_key;
+                build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i],
+                                           op->json_key, is_external);
+                build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i],
+                                           op->json_key, is_external);
             }
-            build_dhcpv4_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
-                                       is_external);
-
-            build_dhcpv6_options_flows(op, lflows, &op->lsp_addrs[i], json_key,
-                                       is_external);
         }
     }
 
@@ -6644,8 +6658,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
     }
 
     HMAP_FOR_EACH (op, key_node, ports) {
-        if (!op->nbsp || !lsp_is_external(op->nbsp) ||
-            !op->od->localnet_port) {
+        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
            continue;
         }
 
@@ -6653,8 +6666,10 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
          * external ports  on chassis not binding those ports.
          * This makes the router pipeline to be run only on the chassis
          * binding the external ports. */
-        build_drop_arp_nd_flows_for_unbound_router_ports(
-            lflows, op, op->od->localnet_port);
+        for (size_t i = 0; i < op->od->n_localnet_ports; i++) {
+            build_drop_arp_nd_flows_for_unbound_router_ports(
+                lflows, op, op->od->localnet_ports[i]);
+        }
     }
 
     char *svc_check_match = xasprintf("eth.dst == %s", svc_monitor_mac);
@@ -6872,7 +6887,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
                               ETH_ADDR_ARGS(mac));
                 if (op->peer->od->l3dgw_port
                     && op->peer->od->l3redirect_port
-                    && op->od->localnet_port) {
+                    && op->od->n_localnet_ports) {
                     bool add_chassis_resident_check = false;
                     if (op->peer == op->peer->od->l3dgw_port) {
                         /* The peer of this port represents a distributed
@@ -8178,7 +8193,7 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                           op->lrp_networks.ipv4_addrs[i].addr_s);
 
             if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
-                && op->peer->od->localnet_port) {
+                && op->peer->od->n_localnet_ports) {
                 bool add_chassis_resident_check = false;
                 if (op == op->od->l3dgw_port) {
                     /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index 533ae716d..88edb6f32 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -441,9 +441,8 @@ 
 
   <p>
     A <code>localnet</code> logical switch port bridges a logical switch to a
-    physical VLAN.  Any given logical switch should have no more than one
-    <code>localnet</code> port.  Such a logical switch is used in two
-    scenarios:
+    physical VLAN.  A logical switch may have one or more <code>localnet</code>
+    ports.  Such a logical switch is used in two scenarios:
   </p>
 
   <ul>
@@ -1895,13 +1894,13 @@ 
   <ol>
     <li>
       The packet first enters the ingress pipeline, and then egress pipeline of
-      the source localnet logical switch datapath and is sent out via the
+      the source localnet logical switch datapath and is sent out via a
       localnet port of the source localnet logical switch (instead of sending
       it to router pipeline).
     </li>
 
     <li>
-      The gateway chassis receives the packet via the localnet port of the
+      The gateway chassis receives the packet via a localnet port of the
       source localnet logical switch and sends it to the integration bridge.
       The packet then enters the ingress pipeline, and then egress pipeline of
       the source localnet logical switch datapath and enters the ingress
@@ -1916,11 +1915,11 @@ 
       From the router datapath, packet enters the ingress pipeline and then
       egress pipeline of the destination localnet logical switch datapath.
       It then goes out of the integration bridge to the provider bridge (
-      belonging to the destination logical switch) via the localnet port.
+      belonging to the destination logical switch) via a localnet port.
     </li>
 
     <li>
-      The destination chassis receives the packet via the localnet port and
+      The destination chassis receives the packet via a localnet port and
       sends it to the integration bridge. The packet enters the
       ingress pipeline and then egress pipeline of the destination localnet
       logical switch and finally delivered to the destination VM port.
@@ -1935,13 +1934,13 @@ 
   <ol>
     <li>
       The packet first enters the ingress pipeline, and then egress pipeline of
-      the source localnet logical switch datapath and is sent out via the
+      the source localnet logical switch datapath and is sent out via a
       localnet port of the source localnet logical switch (instead of sending
       it to router pipeline).
     </li>
 
     <li>
-      The gateway chassis receives the packet via the localnet port of the
+      The gateway chassis receives the packet via a localnet port of the
       source localnet logical switch and sends it to the integration bridge.
       The packet then enters the ingress pipeline, and then egress pipeline of
       the source localnet logical switch datapath and enters the ingress
@@ -1957,7 +1956,7 @@ 
       egress pipeline of the localnet logical switch datapath which provides
       external connectivity. It then goes out of the integration bridge to the
       provider bridge (belonging to the logical switch which provides external
-      connectivity) via the localnet port.
+      connectivity) via a localnet port.
     </li>
   </ol>
 
@@ -1967,7 +1966,7 @@ 
 
   <ol>
     <li>
-      The gateway chassis receives the packet from the localnet port of
+      The gateway chassis receives the packet from a localnet port of
       the logical switch which provides external connectivity. The packet then
       enters the ingress pipeline and then egress pipeline of the localnet
       logical switch (which provides external connectivity). The packet then
@@ -1978,12 +1977,12 @@ 
       The ingress pipeline of the logical router datapath applies the unNATting
       rules. The packet then enters the ingress pipeline and then egress
       pipeline of the source localnet logical switch. Since the source VM
-      doesn't reside in the gateway chassis, the packet is sent out via the
+      doesn't reside in the gateway chassis, the packet is sent out via a
       localnet port of the source logical switch.
     </li>
 
     <li>
-      The source chassis receives the packet via the localnet port and
+      The source chassis receives the packet via a localnet port and
       sends it to the integration bridge. The packet enters the
       ingress pipeline and then egress pipeline of the source localnet
       logical switch and finally gets delivered to the source VM port.
diff --git a/ovn-nb.xml b/ovn-nb.xml
index af15c550a..7e9cd5cc6 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -244,14 +244,14 @@ 
     <p>
       There are two kinds of logical switches, that is, ones that fully
       virtualize the network (overlay logical switches) and ones that provide
-      simple connectivity to a physical network (bridged logical switches).
+      simple connectivity to physical networks (bridged logical switches).
       They work in the same way when providing connectivity between logical
-      ports on same chasis, but differently when connecting remote logical
+      ports on same chassis, but differently when connecting remote logical
       ports.  Overlay logical switches connect remote logical ports by tunnels,
       while bridged logical switches provide connectivity to remote ports by
-      bridging the packets to directly connected physical L2 segment with the
+      bridging the packets to directly connected physical L2 segments with the
       help of <code>localnet</code> ports.  Each bridged logical switch has
-      one and only one <code>localnet</code> port, which has only one special
+      one or more <code>localnet</code> ports, which have only one special
       address <code>unknown</code>.
     </p>
 
@@ -527,10 +527,13 @@ 
 
           <dt><code>localnet</code></dt>
           <dd>
-            A connection to a locally accessible network from each
-            <code>ovn-controller</code> instance.  A logical switch can only
-            have a single <code>localnet</code> port attached.  This is used
-            to model direct connectivity to an existing network.
+            A connection to a locally accessible network from
+            <code>ovn-controller</code> instances that have corresponding
+            bridge mapping.  A logical switch can have multiple
+            <code>localnet</code> ports attached, as long as each
+            <code>ovn-controller</code> is plugged to a single local network
+            only.  In this case, each hypervisor implements part of switch
+            external network connectivity.
           </dd>
 
           <dt><code>localport</code></dt>
@@ -721,7 +724,7 @@ 
           Required.  The name of the network to which the <code>localnet</code>
           port is connected.  Each hypervisor, via <code>ovn-controller</code>,
           uses its local configuration to determine exactly how to connect to
-          this locally accessible network.
+          this locally accessible network, if at all.
         </column>
       </group>
 
diff --git a/ovn-sb.xml b/ovn-sb.xml
index 3aa7cd4da..7e05aa0e1 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -2626,10 +2626,13 @@  tcp.flags = RST;
 
           <dt><code>localnet</code></dt>
           <dd>
-            A connection to a locally accessible network from each
-            <code>ovn-controller</code> instance.  A logical switch can only
-            have a single <code>localnet</code> port attached.  This is used
-            to model direct connectivity to an existing network.
+            A connection to a locally accessible network from some or all
+            <code>ovn-controller</code> instances.  This is used
+            to model direct connectivity to existing networks.  A logical
+            switch can have multiple <code>localnet</code> ports attached, as
+            long as each <code>ovn-controller</code> is plugged to a single
+            local network only.  In this case, each hypervisor implements part
+            of switch external network connectivity.
           </dd>
 
           <dt><code>localport</code></dt>
@@ -2774,10 +2777,12 @@  tcp.flags = RST;
         <p>
           When a logical switch has a <code>localnet</code> port attached,
           every chassis that may have a local vif attached to that logical
-          switch must have a bridge mapping configured to reach that
-          <code>localnet</code>.  Traffic that arrives on a
-          <code>localnet</code> port is never forwarded over a tunnel to
-          another chassis.
+          switch that needs this external connectivity must have a bridge
+          mapping configured to reach that <code>localnet</code>.  If the
+          mapping is missing, the vif won't be plugged to this network.  It may
+          still reach the other network if routing is implemented by fabric.
+          Traffic that arrives on a <code>localnet</code> port is never
+          forwarded over a tunnel to another chassis.
         </p>
       </column>
 
diff --git a/tests/ovn.at b/tests/ovn.at
index e6febd4c2..173b8d2eb 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -2475,6 +2475,243 @@  OVN_CLEANUP([hv1],[hv2])
 
 AT_CLEANUP
 
+AT_SETUP([ovn -- 2 HVs, 2 LS, switching between multiple localnet ports with same tags])
+ovn_start
+
+# In this test case we create two switches with multiple localnet ports. Only a
+# single localnet of the same tag is connected to fabric for each switch. Two
+# hypervisors have VIFs that belong to these switches. The test validates that
+# routing between these switches and hypervisors still works regardless of the
+# number of (unplugged) localnet ports.
+
+# two switches, each connected to lots of networks
+for i in 1 2; do
+    ovn-nbctl ls-add ls-$i
+    for tag in `seq 10 20`; do
+        ln_port_name=ln-$i-$tag
+        ovn-nbctl lsp-add ls-$i $ln_port_name "" $tag
+        ovn-nbctl lsp-set-addresses $ln_port_name unknown
+        ovn-nbctl lsp-set-type $ln_port_name localnet
+        ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
+    done
+done
+
+# multiple networks
+for tag in `seq 10 20`; do
+    net_add n-$tag
+done
+
+# two hypervisors, each connected to the same network
+for i in 1 2; do
+    sim_add hv-$i
+    as hv-$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
+    ovn_attach n-10 br-phys 192.168.0.$i
+done
+
+# two vif ports, one per switch
+for i in 1 2; do
+    as hv-$i
+    ovs-vsctl add-port br-int vif-$i -- \
+        set Interface vif-$i external-ids:iface-id=lp-$i \
+                              options:tx_pcap=hv-$i/vif-$i-tx.pcap \
+                              options:rxq_pcap=hv-$i/vif-$i-rx.pcap \
+                              ofport-request=$i
+
+    lsp_name=lp-$i
+    ovn-nbctl lsp-add ls-$i $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:00:0$i
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:0$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+done
+
+ovn-nbctl --wait=sb sync
+ovn-nbctl show
+ovn-sbctl dump-flows
+
+# vif ports
+for i in 1 2; do
+    : > vif-$i.expected
+done
+
+# localnet ports
+for i in 1 2; do
+    for tag in `seq 10 20`; do
+        : > $i-$tag.expected
+    done
+done
+
+test_packet() {
+    local inport=$1 outport=$2 dst=$3 src=$4 eth=$5 eout=$6 lout=$7
+
+    # Expect the packet cloned to all localnet ports
+    : > expout
+    for tag in `seq 10 20`; do
+        echo "output(\"ln-$inport-$tag\");" >> expout
+    done
+
+    # First try tracing the packet.
+    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
+    AT_CAPTURE_FILE([trace])
+    AT_CHECK([ovn-trace --all ls-$inport "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
+
+    # Then actually send a packet, for an end-to-end test.
+    local packet=$(echo $dst$src | sed 's/://g')${eth}
+    as hv-$1 ovs-appctl netdev-dummy/receive vif-$inport $packet
+
+    # Expect the packet received by the peer VIF port
+    echo $packet >> vif-$outport.expected
+
+    # Expect the packet to transfer through the common fabric network
+    local packet=$(echo $dst$src | sed 's/://g')81000014${eth}
+    echo $packet >> $1-10.expected
+}
+
+test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1001 ln-1-10 ln-1-10
+test_packet 1 2 f0:00:00:00:00:02 f0:00:00:00:00:01 1002 ln-1-10 ln-1-10
+
+test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1003 ln-2-10 ln-2-10
+test_packet 2 1 f0:00:00:00:00:01 f0:00:00:00:00:02 1004 ln-2-10 ln-2-10
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+for i in 1 2; do
+    hv=hv-$i
+    echo "------ $hv dump ------"
+    as $hv ovs-vsctl show
+    as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
+done
+
+# Now check the packets actually received against the ones expected.
+for i in 1 2; do
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/vif-$i-tx.pcap], [vif-$i.expected])
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$i/br-phys_n-10-tx.pcap], [$i-10.expected])
+done
+
+OVN_CLEANUP([hv-1],[hv-2])
+
+AT_CLEANUP
+
+AT_SETUP([ovn -- 2 HVs, 1 LS, no switching between multiple localnet ports with different tags])
+ovn_start
+
+# In this test case we create a single switch connected to two physical
+# networks via two localnet ports. Then we create two hypervisors, with 2
+# ports on each. The test validates no interconnectivity between VIF ports
+# located on chassis plugged to different physical networks.
+
+# create the single switch with two locanet ports
+ovn-nbctl ls-add ls1
+for tag in 10 20; do
+    ln_port_name=ln-$tag
+    ovn-nbctl lsp-add ls1 $ln_port_name "" $tag
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
+done
+
+# create fabric networks
+for tag in 10 20; do
+    net_add n-$tag
+done
+
+# create four chassis, each connected to one network, each with a single VIF port
+for tag in 10 20; do
+    for i in 1 2; do
+        sim_add hv-$tag-$i
+        as hv-$tag-$i
+        ovs-vsctl add-br br-phys
+        ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-$tag:br-phys
+        ovn_attach n-$tag br-phys 192.168.$i.$tag
+
+        ovs-vsctl add-port br-int vif-$tag-$i -- \
+            set Interface vif-$tag-$i external-ids:iface-id=lp-$tag-$i \
+                                  options:tx_pcap=hv-$tag-$i/vif-$tag-$i-tx.pcap \
+                                  options:rxq_pcap=hv-$tag-$i/vif-$tag-$i-rx.pcap \
+                                  ofport-request=$tag$i
+
+        lsp_name=lp-$tag-$i
+        ovn-nbctl lsp-add ls1 $lsp_name
+        ovn-nbctl lsp-set-addresses $lsp_name f0:00:00:00:0$i:$tag
+        ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:0$i:$tag
+
+        OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+    done
+done
+ovn-nbctl --wait=sb sync
+ovn-sbctl dump-flows
+
+for tag in 10 20; do
+    for i in 1 2; do
+        : > $tag-$i.expected
+    done
+done
+
+vif_to_hv() {
+    echo hv-$1
+}
+
+test_packet() {
+    local inport=$1 dst=$2 src=$3 eth=$4 eout=$5 lout=$6
+
+    # First try tracing the packet.
+    uflow="inport==\"lp-$inport\" && eth.dst==$dst && eth.src==$src && eth.type==0x$eth"
+    echo "output(\"$lout\");" > expout
+    AT_CAPTURE_FILE([trace])
+    AT_CHECK([ovn-trace --all ls1 "$uflow" | tee trace | sed '1,/Minimal trace/d'], [0], [expout])
+
+    # Then actually send a packet, for an end-to-end test.
+    local packet=$(echo $dst$src | sed 's/://g')${eth}
+    hv=`vif_to_hv $inport`
+    vif=vif-$inport
+    as $hv ovs-appctl netdev-dummy/receive $vif $packet
+    if test $eth = 1002 -o $eth = 2002; then
+        echo $packet >> ${eout#lp-}.expected
+    fi
+}
+
+# different fabric networks -> should fail
+test_packet 10-1 f0:00:00:00:01:20 f0:00:00:00:01:10 1001 lp-20-1 lp-20-1
+test_packet 20-1 f0:00:00:00:01:10 f0:00:00:00:01:20 2001 lp-10-1 lp-10-1
+
+# same fabric networks -> should pass
+test_packet 10-1 f0:00:00:00:02:10 f0:00:00:00:01:10 1002 lp-10-2 lp-10-2
+test_packet 20-1 f0:00:00:00:02:20 f0:00:00:00:01:20 2002 lp-20-2 lp-20-2
+test_packet 10-2 f0:00:00:00:01:10 f0:00:00:00:02:10 1002 lp-10-1 lp-10-1
+test_packet 20-2 f0:00:00:00:01:20 f0:00:00:00:02:20 2002 lp-20-1 lp-20-1
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+for tag in 10 20; do
+    for i in 1 2; do
+        hv=hv-$tag-$i
+        echo "------ $hv dump ------"
+        as $hv ovs-vsctl show
+        as $hv ovs-ofctl -O OpenFlow13 dump-flows br-int
+    done
+done
+
+# Now check the packets actually received against the ones expected.
+for tag in 10 20; do
+    for i in 1 2; do
+        echo "hv = $tag-$i"
+        OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv-$tag-$i/vif-$tag-$i-tx.pcap], [$tag-$i.expected])
+    done
+done
+
+OVN_CLEANUP([hv-10-1],[hv-10-2],[hv-20-1],[hv-20-2])
+
+AT_CLEANUP
+
 AT_SETUP([ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS])
 AT_KEYWORDS([vtep])
 ovn_start