[ovs-dev,v5] ovn: Support a new Logical_Switch_Port.type - 'external'

Message ID 20190115193511.14765-1-nusiddiq@redhat.com
State Changes Requested

Commit Message

Numan Siddique Jan. 15, 2019, 7:35 p.m. UTC
From: Numan Siddique <nusiddiq@redhat.com>

In the case of OpenStack + OVN, when the VMs are booted on
hypervisors supporting SR-IOV NICs, there are no OVS ports
for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
Router Solicitation requests, the local ovn-controller
cannot reply to these packets. The OpenStack Neutron DHCP agent
service needs to be run to serve these requests.

With the new logical port type - 'external', OVN itself can
handle these requests avoiding the need to deploy any
external services like neutron dhcp agent.

To make use of this feature, CMS has to
 - create a logical port for such VMs
 - set the type to 'external'
 - set requested-chassis="<chassis-name>" in the options
   column.
 - create a localnet port for the logical switch
 - configure the ovn-bridge-mappings option in the OVS db.

When the ovn-controller running on that chassis detects
the Port_Binding row, it adds the necessary DHCPv4/v6 OF
flows. Since the packet enters the logical switch pipeline
via the localnet port, the inport register (reg14) is set
to the tunnel key of the localnet port in the match conditions.
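
The port resolution described above can be modelled with a minimal,
self-contained sketch. The struct layouts and the name resolve_inport()
are hypothetical stand-ins for illustration, not the OVN data structures;
only the decision logic mirrors the lookup_port_cb() hunk in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the southbound records. */
struct chassis {
    const char *name;
    const char *hostname;
};

struct port {
    const char *type;              /* "" for a VIF, "external", "localnet" */
    const char *requested_chassis; /* NULL when the option is unset */
    unsigned int tunnel_key;
};

/* Computes the value that a match on 'pb' as inport should compare against.
 * 'localnet' is the logical switch's localnet port, or NULL if it has none.
 * Returns false when the port cannot be resolved on chassis 'ch'. */
static bool
resolve_inport(const struct port *pb, const struct port *localnet,
               const struct chassis *ch, unsigned int *portp)
{
    if (strcmp(pb->type, "external")) {
        /* Any non-external port matches on its own tunnel key. */
        *portp = pb->tunnel_key;
        return true;
    }

    /* An external port resolves only on the requested chassis, and then
     * to the localnet port's tunnel key, since that is where its traffic
     * enters the pipeline. */
    const char *id = pb->requested_chassis;
    if (id && (!strcmp(id, ch->name) || !strcmp(id, ch->hostname))
        && localnet) {
        *portp = localnet->tunnel_key;
        return true;
    }
    return false;
}
```

In this model, an external port with requested-chassis "hv1" resolves to
the localnet key on hv1 and fails to resolve on every other chassis, so
no flows matching it are installed there.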

In case the chassis goes down for some reason, it is the
responsibility of CMS to change the 'requested-chassis'
option to some other active chassis, so that it can serve
these requests.

When the VM with the external port sends an ARP request for
the router IPs, only the chassis which has claimed the port
will reply to the ARP requests. The rest of the chassis, on
receiving these packets, drop them in the ingress switch
datapath stage S_SWITCH_IN_EXTERNAL_PORT, which is just
before S_SWITCH_IN_L2_LKUP.

This would guarantee that only the chassis which has claimed
the external ports will run the router datapath pipeline.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---

v4 -> v5
------
  * Addressed review comments from Han Zhou.

v3 -> v4
------
  * Updated the documentation as per Han Zhou's suggestion.

v2 -> v3
-------
  * Rebased 

 ovn/controller/binding.c        |  12 +
 ovn/controller/lflow.c          |  41 ++-
 ovn/controller/lflow.h          |   2 +
 ovn/controller/lport.c          |  26 ++
 ovn/controller/lport.h          |   5 +
 ovn/controller/ovn-controller.c |   6 +
 ovn/lib/ovn-util.c              |   1 +
 ovn/northd/ovn-northd.8.xml     |  37 ++-
 ovn/northd/ovn-northd.c         |  85 ++++-
 ovn/ovn-architecture.7.xml      |  78 +++++
 ovn/ovn-nb.xml                  |  47 +++
 tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
 12 files changed, 848 insertions(+), 22 deletions(-)

Comments

Han Zhou Jan. 17, 2019, 6:50 p.m. UTC | #1
Hi Numan,

With v5 the new test case "external logical port" fails.
Please also see more comments inline.

On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>
> From: Numan Siddique <nusiddiq@redhat.com>
>
> In the case of OpenStack + OVN, when the VMs are booted on
> hypervisors supporting SR-IOV nics, there are no OVS ports
> for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> Router Solicitation requests, the local ovn-controller
> cannot reply to these packets. OpenStack Neutron dhcp agent
> service needs to be run to serve these requests.
>
> With the new logical port type - 'external', OVN itself can
> handle these requests avoiding the need to deploy any
> external services like neutron dhcp agent.
>
> To make use of this feature, CMS has to
>  - create a logical port for such VMs
>  - set the type to 'external'
>  - set requested-chassis="<chassis-name>" in the options
>    column.
>  - create a localnet port for the logical switch
>  - configure the ovn-bridge-mappings option in the OVS db.
>
> When the ovn-controller running in that 'chassis', detects
> the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> flows. Since the packet enters the logical switch pipeline
> via the localnet port, the inport register (reg14) is set
> to the tunnel key of localnet port in the match conditions.
>
> In case the chassis goes down for some reason, it is the
> responsibility of CMS to change the 'requested-chassis'
> option to some other active chassis, so that it can serve
> these requests.
>
> When the VM with the external port, sends an ARP request for
> the router ips, only the chassis which has claimed the port,
> will reply to the ARP requests. Rest of the chassis on
> receiving these packets drop them in the ingress switch
> datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
> before S_SWITCH_IN_L2_LKUP.
>
> This would guarantee that only the chassis which has claimed
> the external ports will run the router datapath pipeline.
>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> ---
>
> v4 -> v5
> ------
>   * Addressed review comments from Han Zhou.
>
> v3 -> v4
> ------
>   * Updated the documentation as per Han Zhou's suggestion.
>
> v2 -> v3
> -------
>   * Rebased
>
>  ovn/controller/binding.c        |  12 +
>  ovn/controller/lflow.c          |  41 ++-
>  ovn/controller/lflow.h          |   2 +
>  ovn/controller/lport.c          |  26 ++
>  ovn/controller/lport.h          |   5 +
>  ovn/controller/ovn-controller.c |   6 +
>  ovn/lib/ovn-util.c              |   1 +
>  ovn/northd/ovn-northd.8.xml     |  37 ++-
>  ovn/northd/ovn-northd.c         |  85 ++++-
>  ovn/ovn-architecture.7.xml      |  78 +++++
>  ovn/ovn-nb.xml                  |  47 +++
>  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
>  12 files changed, 848 insertions(+), 22 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index 021ecddcf..64e605b92 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
>           * for them. */
>          sset_add(local_lports, binding_rec->logical_port);
>          our_chassis = false;
> +    } else if (!strcmp(binding_rec->type, "external")) {
> +        const char *chassis_id = smap_get(&binding_rec->options,
> +                                          "requested-chassis");
> +        our_chassis = chassis_id && (
> +            !strcmp(chassis_id, chassis_rec->name) ||
> +            !strcmp(chassis_id, chassis_rec->hostname));
> +        if (our_chassis) {
> +            add_local_datapath(sbrec_datapath_binding_by_key,
> +                               sbrec_port_binding_by_datapath,
> +                               sbrec_port_binding_by_name,
> +                               binding_rec->datapath, true, local_datapaths);
> +        }
>      }
>
>      if (our_chassis
> diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> index 8db81927e..98e8ed3b9 100644
> --- a/ovn/controller/lflow.c
> +++ b/ovn/controller/lflow.c
> @@ -52,7 +52,10 @@ lflow_init(void)
>  struct lookup_port_aux {
>      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>      const struct sbrec_datapath_binding *dp;
> +    const struct sbrec_chassis *chassis;
>  };
>
>  struct condition_aux {
> @@ -66,6 +69,8 @@ static void consider_logical_flow(
>      struct ovsdb_idl_index *sbrec_chassis_by_name,
>      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>      const struct sbrec_logical_flow *,
>      const struct hmap *local_datapaths,
>      const struct sbrec_chassis *,
> @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
>      const struct sbrec_port_binding *pb
>          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>      if (pb && pb->datapath == aux->dp) {
> -        *portp = pb->tunnel_key;
> -        return true;
> +        if (strcmp(pb->type, "external")) {
> +            *portp = pb->tunnel_key;
> +            return true;
> +        }
> +        const char *chassis_id = smap_get(&pb->options,
> +                                          "requested-chassis");
> +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> +                           !strcmp(chassis_id, aux->chassis->hostname))) {
> +            const struct sbrec_port_binding *localnet_pb
> +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> +                                       aux->sbrec_port_binding_by_type,
> +                                       aux->dp->tunnel_key, "localnet");
> +            if (localnet_pb) {
> +                *portp = localnet_pb->tunnel_key;
> +                return true;
> +            }
> +        }
> +        return false;
>      }
>
>      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
> @@ -144,6 +165,8 @@ add_logical_flows(
>      struct ovsdb_idl_index *sbrec_chassis_by_name,
>      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>      const struct sbrec_dhcp_options_table *dhcp_options_table,
>      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>      const struct sbrec_logical_flow_table *logical_flow_table,
> @@ -183,6 +206,8 @@ add_logical_flows(
>          consider_logical_flow(sbrec_chassis_by_name,
>                                sbrec_multicast_group_by_name_datapath,
>                                sbrec_port_binding_by_name,
> +                              sbrec_port_binding_by_type,
> +                              sbrec_datapath_binding_by_key,
>                                lflow, local_datapaths,
>                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
>                                addr_sets, port_groups, active_tunnels,
> @@ -200,6 +225,8 @@ consider_logical_flow(
>      struct ovsdb_idl_index *sbrec_chassis_by_name,
>      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>      const struct sbrec_logical_flow *lflow,
>      const struct hmap *local_datapaths,
>      const struct sbrec_chassis *chassis,
> @@ -292,7 +319,10 @@ consider_logical_flow(
>          .sbrec_multicast_group_by_name_datapath
>              = sbrec_multicast_group_by_name_datapath,
>          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
> -        .dp = lflow->logical_datapath
> +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
> +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
> +        .dp = lflow->logical_datapath,
> +        .chassis = chassis
>      };
>      struct condition_aux cond_aux = {
>          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> @@ -463,6 +493,8 @@ void
>  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>            const struct sbrec_dhcp_options_table *dhcp_options_table,
>            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>            const struct sbrec_logical_flow_table *logical_flow_table,
> @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>
>      add_logical_flows(sbrec_chassis_by_name,
>                        sbrec_multicast_group_by_name_datapath,
> -                      sbrec_port_binding_by_name, dhcp_options_table,
> +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
> +                      sbrec_datapath_binding_by_key, dhcp_options_table,
>                        dhcpv6_options_table, logical_flow_table,
>                        local_datapaths, chassis, addr_sets, port_groups,
>                        active_tunnels, local_lport_ids, flow_table, group_table,
> diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> index d19338140..b2911e0eb 100644
> --- a/ovn/controller/lflow.h
> +++ b/ovn/controller/lflow.h
> @@ -68,6 +68,8 @@ void lflow_init(void);
>  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>                 const struct sbrec_dhcp_options_table *,
>                 const struct sbrec_dhcpv6_options_table *,
>                 const struct sbrec_logical_flow_table *,
> diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> index cc5c5fbb2..9c827d9b0 100644
> --- a/ovn/controller/lport.c
> +++ b/ovn/controller/lport.c
> @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>      return retval;
>  }
>
> +const struct sbrec_port_binding *
> +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +                     uint64_t dp_key, const char *port_type)
> +{
> +    /* Lookup datapath corresponding to dp_key. */
> +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
> +        sbrec_datapath_binding_by_key, dp_key);
> +    if (!db) {
> +        return NULL;
> +    }
> +
> +    /* Build key for an indexed lookup. */
> +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
> +            sbrec_port_binding_by_type);
> +    sbrec_port_binding_index_set_datapath(pb, db);
> +    sbrec_port_binding_index_set_type(pb, port_type);
> +
> +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
> +            sbrec_port_binding_by_type, pb);
> +
> +    sbrec_port_binding_index_destroy_row(pb);
> +
> +    return retval;
> +}
> +
>  const struct sbrec_datapath_binding *
>  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>                         uint64_t dp_key)
> diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> index 7dcd5bee0..2d49792f6 100644
> --- a/ovn/controller/lport.h
> +++ b/ovn/controller/lport.h
> @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
>      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>      uint64_t dp_key, uint64_t port_key);
>
> +const struct sbrec_port_binding *lport_lookup_by_type(
> +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> +    uint64_t dp_key, const char *port_type);
> +
>  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
>
> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> index 4e9a5865f..5aab9142f 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>       * ports that have a Gateway_Chassis that point's to our own
>       * chassis */
>      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
> +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
>      if (chassis) {
>          /* This should be mostly redundant with the other clauses for port
>           * bindings, but it allows us to catch any ports that are assigned to
> @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>                                    &sbrec_port_binding_col_datapath);
> +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> +                                  &sbrec_port_binding_col_type);

This index is looked up with two columns, datapath and type (see
lport_lookup_by_type()), so it should be created over both columns
using ovsdb_idl_index_create2().
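
To illustrate why the key columns matter, here is a toy model in plain C.
It is not the OVSDB IDL API: struct pb_row and both find functions are
hypothetical. It only shows that a lookup keyed on type alone cannot tell
apart localnet ports of different datapaths.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened Port_Binding row. */
struct pb_row {
    unsigned int datapath;
    const char *type;
    unsigned int tunnel_key;
};

/* Models a lookup keyed on 'type' only: the datapath set on the search
 * row is ignored, so the first row of that type wins. */
static const struct pb_row *
find_by_type(const struct pb_row *rows, size_t n, const char *type)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(rows[i].type, type)) {
            return &rows[i];
        }
    }
    return NULL;
}

/* Models a lookup keyed on both 'datapath' and 'type'. */
static const struct pb_row *
find_by_datapath_type(const struct pb_row *rows, size_t n,
                      unsigned int datapath, const char *type)
{
    for (size_t i = 0; i < n; i++) {
        if (rows[i].datapath == datapath && !strcmp(rows[i].type, type)) {
            return &rows[i];
        }
    }
    return NULL;
}
```

With one localnet port on datapath 1 and another on datapath 2, the
type-only lookup returns the datapath 1 row even when datapath 2 was
intended. In the patch, the fix would presumably be to pass both
&sbrec_port_binding_col_datapath and &sbrec_port_binding_col_type to
ovsdb_idl_index_create2().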

>      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>                                    &sbrec_datapath_binding_col_tunnel_key);
> @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>                              sbrec_chassis_by_name,
>                              sbrec_multicast_group_by_name_datapath,
>                              sbrec_port_binding_by_name,
> +                            sbrec_port_binding_by_type,
> +                            sbrec_datapath_binding_by_key,
>                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> index aa03919bb..a9d4b8736 100644
> --- a/ovn/lib/ovn-util.c
> +++ b/ovn/lib/ovn-util.c
> @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>      "localport",
>      "router",
>      "vtep",
> +    "external",
>  };
>
>  bool
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 392a5efc9..c8883d60d 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -626,7 +626,8 @@ nd_na_router {
>      <p>
>        This table adds the DHCPv4 options to a DHCPv4 packet from the
>        logical ports configured with IPv4 address(es) and DHCPv4 options,
> -      and similarly for DHCPv6 options.
> +      and similarly for DHCPv6 options. This table also adds flows for the
> +      logical ports of type <code>external</code>.
>      </p>
>
>      <ul>
> @@ -827,7 +828,39 @@ output;
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 16 Destination Lookup</h3>
> +    <h3>Ingress table 16 External ports</h3>
> +
> +    <p>
> +      Traffic from the <code>external</code> logical ports enter the ingress
> +      datapath pipeline via the <code>localnet</code> port. This table adds the
> +      below logical flows to handle the traffic from these ports.
> +    </p>
> +
> +    <ul>
> +      <li>
> +        <p>
> +          A priority-100 flow is added for each <code>external</code> logical
> +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
> +          request to the router IP(s) (of the logical switch) which matches
> +          on the <code>inport</code> of the <code>external</code> logical port
> +          and the valid <code>eth.src</code> address(es) of the
> +          <code>external</code> logical port.
> +        </p>
> +
> +        <p>
> +          This flow guarantees that the ARP/NS request to the router IP
> +          address from the external ports is responded by only the chassis
> +          which has claimed these external ports. All the other chassis,
> +          drops these packets.
> +        </p>
> +      </li>
> +
> +      <li>
> +        A priority-0 flow that matches all packets to advances to table 17.
> +      </li>
> +    </ul>
> +
> +    <h3>Ingress Table 17 Destination Lookup</h3>
>
>      <p>
>        This table implements switching behavior.  It contains these logical
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 3fd8a8757..87208c6c1 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -119,7 +119,8 @@ enum ovn_stage {
>      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
>      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
>      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
> -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
> +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
> +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
>                                                                            \
>      /* Logical switch egress stages. */                                   \
>      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
> @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
>      return !lsp->up || *lsp->up;
>  }
>
> +static bool
> +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> +{
> +    return !strcmp(nbsp->type, "external");
> +}
> +
>  static bool
>  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>                      struct ds *options_action, struct ds *response_action,
> @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>           *  - port type is localport
>           */
>          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> -            strcmp(op->nbsp->type, "localport")) {
> +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {

Sorry that I missed this in the last review. The && condition has a
problem: it will cause ARP responder flows to be added for all lports
that are not external. I think it should be || here.
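
A small self-contained sketch of the two predicates makes the difference
concrete. struct lsp and the function names are hypothetical; "skip" means
the loop hits continue and no ARP responder flow is added for the port.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced logical switch port. */
struct lsp {
    const char *type; /* "" for a VIF, "router", "localport", "external" */
    bool up;
};

static bool
is_external(const struct lsp *p)
{
    return !strcmp(p->type, "external");
}

/* Condition before this patch: skip ports that are down, unless they are
 * router or localport ports. */
static bool
skip_base(const struct lsp *p)
{
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport");
}

/* Condition as written in v5: only down external ports are skipped. */
static bool
skip_v5(const struct lsp *p)
{
    return skip_base(p) && is_external(p);
}

/* Condition with || as suggested: external ports are always skipped and
 * the behavior for all other ports is unchanged. */
static bool
skip_suggested(const struct lsp *p)
{
    return skip_base(p) || is_external(p);
}
```

For a VIF that is down, skip_base() is true but skip_v5() is false, so v5
would add ARP responder flows for it. With ||, that port is skipped again
and every external port is skipped regardless of its state.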

>              continue;
>          }
>
> @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>              continue;
>          }
>
> +        bool is_external = lsp_is_external(op->nbsp);
> +        if (is_external && !op->od->localnet_port) {
> +            /* If it's an external port and there is no localnet port
> +             * ignore it. */
> +            continue;
> +        }
> +
>          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>                  struct ds options_action = DS_EMPTY_INITIALIZER;
> @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>                      ds_put_format(
>                          &match, "inport == %s && eth.src == %s && "
>                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> -                        "udp.src == 68 && udp.dst == 67", op->json_key,
> -                        op->lsp_addrs[i].ea_s);
> +                        "udp.src == 68 && udp.dst == 67",
> +                        op->json_key, op->lsp_addrs[i].ea_s);

No functional change here? This hunk only re-wraps the arguments.
>
>                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
>                                    100, ds_cstr(&match),
> @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>      /* Ingress table 12 and 13: DHCP options and response, by default goto
>       * next. (priority 0).
>       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
> -     * (priority 0).*/
> +     * (priority 0).
> +     * Ingress table 16 - External port handling, by default goto next.
> +     * (priority 0). */
>
>      HMAP_FOR_EACH (od, key_node, datapaths) {
>          if (!od->nbs) {
> @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
>          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
> +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
>      }
>
> -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
> +    HMAP_FOR_EACH (op, key_node, ports) {
> +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> +           continue;
> +        }
> +
> +        /* Table 16: External port. Drop ARP request for router ips from
> +         * external ports  on chassis not binding those ports.
> +         * This makes the router pipeline to be run only on the chassis
> +         * binding the external ports. */
> +
> +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
> +                struct ovn_port *rp = op->od->router_ports[j];
> +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
> +                         l++) {
> +                        ds_clear(&match);
> +                        ds_put_cstr(&match, "ip4");
> +                        ds_put_format(
> +                            &match, "inport == %s && eth.src == %s"
> +                            " && !is_chassis_resident(%s)"
> +                            " && arp.tpa == %s && arp.op == 1",
> +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,

I believe the inport should match the localnet port's json_key here,
since the packet comes in from the localnet port.
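
As a rough sketch of the resulting match, assuming the localnet port's
json_key is available to the caller, the string could be built as below.
format_arp_drop_match() and its parameters are hypothetical and use
snprintf() instead of the ds_* helpers to keep the example self-contained.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the ARP drop match for one external port. 'localnet_key' and
 * 'port_key' are already-quoted port names (as json_key is in the patch).
 * inport uses the localnet port, while is_chassis_resident() still refers
 * to the external port itself. Returns what snprintf() returns. */
static int
format_arp_drop_match(char *buf, size_t size, const char *localnet_key,
                      const char *port_key, const char *eth_src,
                      const char *router_ip)
{
    return snprintf(buf, size,
                    "inport == %s && eth.src == %s"
                    " && !is_chassis_resident(%s)"
                    " && arp.tpa == %s && arp.op == 1",
                    localnet_key, eth_src, port_key, router_ip);
}
```

The eth.src term is what ties the flow to a specific external port once
inport no longer identifies it.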

> +                            rp->lsp_addrs[k].ipv4_addrs[l].addr_s);
> +                        ovn_lflow_add(lflows, op->od,
> +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> +                                      ds_cstr(&match), "drop;");
> +                    }
> +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv6_addrs;
> +                         l++) {
> +                        ds_clear(&match);
> +                        ds_put_format(
> +                            &match, "inport == %s && eth.src == %s"
> +                            " && !is_chassis_resident(%s)"
> +                            " && nd_ns && ip6.dst == {%s, %s} && "
> +                            "nd.target == %s",
> +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,

Same as above: inport should match the localnet port's json_key.

> +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s,
> +                            rp->lsp_addrs[k].ipv6_addrs[l].sn_addr_s,
> +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s);
> +                        ovn_lflow_add(lflows, op->od,
> +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> +                                      ds_cstr(&match), "drop;");
> +                    }
> +                }
> +            }
> +        }
> +    }
> +    /* Ingress table 17: Destination lookup, broadcast and multicast handling
>       * (priority 100). */
>      HMAP_FOR_EACH (op, key_node, ports) {
>          if (!op->nbsp) {
> @@ -4448,9 +4513,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>                        "outport = \""MC_FLOOD"\"; output;");
>      }
>
> -    /* Ingress table 16: Destination lookup, unicast handling (priority 50), */
> +    /* Ingress table 17: Destination lookup, unicast handling (priority 50), */
>      HMAP_FOR_EACH (op, key_node, ports) {
> -        if (!op->nbsp) {
> +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
>              continue;
>          }
>
> @@ -4567,7 +4632,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>          }
>      }
>
> -    /* Ingress table 16: Destination lookup for unknown MACs (priority 0). */
> +    /* Ingress table 17: Destination lookup for unknown MACs (priority 0). */
>      HMAP_FOR_EACH (od, key_node, datapaths) {
>          if (!od->nbs) {
>              continue;
> @@ -4602,7 +4667,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>       * Priority 150 rules drop packets to disabled logical ports, so that they
>       * don't even receive multicast or broadcast packets. */
>      HMAP_FOR_EACH (op, key_node, ports) {
> -        if (!op->nbsp) {
> +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
>              continue;
>          }
>
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> index 3936e6016..405975b7b 100644
> --- a/ovn/ovn-architecture.7.xml
> +++ b/ovn/ovn-architecture.7.xml
> @@ -1678,6 +1678,84 @@
>      </li>
>    </ol>
>
> +  <h2>Native OVN services for external logical ports</h2>
> +
> +  <p>
> +    To support OVN native services (like DHCP/IPv6 RA/DNS lookup) to the
> +    cloud resources which are external, OVN supports <code>external</code>
> +    logical ports.
> +  </p>
> +
> +  <p>
> +    Below are some of the use cases where <code>external</code> ports can be
> +    used.
> +  </p>
> +
> +  <ul>
> +    <li>
> +      VMs connected to SR-IOV nics - Traffic from these VMs by passes the
> +      kernel stack and local <code>ovn-controller</code> do not bind these
> +      ports and cannot serve the native services.
> +    </li>
> +    <li>
> +      When CMS supports provisioning baremetal servers.
> +    </li>
> +  </ul>
> +
> +  <p>
> +    OVN will provide the native services if CMS has done the below
> +    configuration in the <dfn>OVN Northbound Database</dfn>.
> +  </p>
> +
> +  <ul>
> +    <li>
> +      A row is created in <code>Logical_Switch_Port</code>, configuring the
> +      <ref column="addresses" table="Logical_Switch_Port" db="OVN_NB"/> column
> +      and setting the <ref column="type" table="Logical_Switch_Port"
> +      db="OVN_NB"/> to <code>external</code>.
> +    </li>
> +
> +    <li>
> +      <ref column="options:requested-chassis" table="Logical_Switch_Port"
> +      db="OVN_NB"/> column is configured to a desired chassis.
> +    </li>
> +
> +    <li>
> +      The chassis on which this logical port is requested has the
> +      <code>ovn-bridge-mappings</code> configured and has proper L2
> +      connectivity so that it can receive the DHCP and other related request
> +      packets from these external resources.
> +    </li>
> +
> +    <li>
> +      The Logical_Switch of this port has a <code>localnet</code> port.
> +    </li>
> +
> +    <li>
> +      Native OVN services are enabled by configuring the DHCP and other
> +      options like the way it is done for the normal logical ports.
> +    </li>
> +  </ul>
> +
> +  <p>
> +    OVN doesn't support HA for these <code>external</code> ports. In case
> +    the <code>ovn-controller</code> running on the requested chassis goes down,
> +    it is the responsiblity of CMS, to reschedule these <code>external</code>
> +    ports to other active chassis.
> +  </p>
> +
> +  <p>
> +    It is recommended to request the same chassis for all the external ports
> +    of a logical switch. Otherwise, the physical switch might see MAC flap
> +    issue when different chassis provide the native services. For example when
> +    supporting native DHCPv4 service, DHCPv4 server mac (configured in
> +    <ref column="options:server_mac" table="DHCP_Options" db="OVN_NB"/> column
> +    in table <ref table="DHCP_Options"/>)
> +    originating from different ports can cause MAC flap issue. The MAC of the
> +    logical router IP(s) can also flap if the same chassis is not requested for
> +    all the external ports of a logical switch.
> +  </p>
> +
>    <h1>Security</h1>
>
>    <h2>Role-Based Access Controls for the Soutbound DB</h2>
> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> index 6d6fb055a..fdf9adbfa 100644
> --- a/ovn/ovn-nb.xml
> +++ b/ovn/ovn-nb.xml
> @@ -353,6 +353,53 @@
>            <dd>
>              A port to a logical switch on a VTEP gateway.
>            </dd>
> +
> +          <dt><code>external</code></dt>
> +          <dd>
> +            <p>
> +              Represents a logical port which is external and not having
> +              an OVS port in the integration bridge.
> +              <code>OVN</code> will never receive any traffic from this port or
> +              send any traffic to this port. <code>OVN</code> can support
> +              native services like DHCPv4/DHCPv6/DNS for this port.
> +              If <ref column="options:requested-chassis"/> is defined,
> +              <code>ovn-controller</code> running in that chassis will bind
> +              this port to provide these native services. It is expected that
> +              this port belongs to a bridged logical switch
> +              (with a <code>localnet</code> port).
> +            </p>
> +
> +            <p>
> +              It is recommended to request the same chassis for all the
> +              external ports of a logical switch. Otherwise, the physical
> +              switch might see MAC flap issues when different chassis provide
> +              the native services. For example, when supporting the native DHCPv4
> +              service, the DHCPv4 server MAC (configured in
> +              <ref column="options:server_mac" table="DHCP_Options"
> +              db="OVN_NB"/> column in table <ref table="DHCP_Options"/>)
> +              originating from different ports can cause a MAC flap issue.
> +              The MAC of the logical router IP(s) can also flap if the
> +              same chassis is not requested for all the external ports
> +              of a logical switch.
> +            </p>
> +
> +            <p>
> +              Below are some of the use cases where <code>external</code>
> +              ports can be used.
> +            </p>
> +
> +            <ul>
> +              <li>
> +                VMs connected to SR-IOV NICs: traffic from these VMs bypasses
> +                the kernel stack, so the local <code>ovn-controller</code> does
> +                not bind these ports and cannot serve the native services.
> +              </li>
> +
> +              <li>
> +                When CMS supports provisioning baremetal servers.
> +              </li>
> +            </ul>
> +          </dd>
>          </dl>
>        </column>
>      </group>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 8bada3241..94c774e8b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -9594,9 +9594,9 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | gre
>  sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
>
>  # make sure that flows for handling the outside router port reside on gw1
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # make sure ARP responder flows for outside router port reside on gw1 too
> @@ -9686,9 +9686,9 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
>  sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
>
>  # make sure that flows for handling the outside router port reside on gw2 now
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # disconnect GW2 from the network, GW1 should take over
> @@ -9700,9 +9700,9 @@ sleep 4
>  bfd_dump
>
>  # make sure that flows for handling the outside router port reside on gw2 now
> -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
> +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>  ]])
> -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
> +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>  ]])
>
>  # check that the chassis redirect port has been reclaimed by the gw1 chassis
> @@ -11619,6 +11619,524 @@ as hv2 start_daemon ovn-controller
>  OVN_CLEANUP([hv1],[hv2])
>  AT_CLEANUP
>
> +AT_SETUP([ovn -- external logical port])
> +AT_SKIP_IF([test $HAVE_PYTHON = no])
> +ovn_start
> +
> +net_add n1
> +sim_add hv1
> +sim_add hv2
> +
> +ovn-nbctl ls-add ls1
> +ovn-nbctl lsp-add ls1 ls1-lp1 \
> +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4"
> +
> +# Add a couple of external logical ports
> +ovn-nbctl lsp-add ls1 ls1-lp_ext1 \
> +-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
> +ovn-nbctl lsp-set-port-security ls1-lp_ext1 \
> +"f0:00:00:00:00:03 10.0.0.6 ae70::6"
> +ovn-nbctl lsp-set-type ls1-lp_ext1 external
> +
> +ovn-nbctl lsp-add ls1 ls1-lp_ext2 \
> +-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7"
> +ovn-nbctl lsp-set-port-security ls1-lp_ext2 \
> +"f0:00:00:00:00:04 10.0.0.7 ae70::8"
> +ovn-nbctl lsp-set-type ls1-lp_ext2 external
> +
> +d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
> +options="\"server_id\"=\"10.0.0.1\" \"server_mac\"=\"ff:10:00:00:00:01\" \
> +\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")"
> +
> +d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \
> +options="\"server_id\"=\"00:00:00:10:00:01\"")"
> +
> +ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1}
> +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1}
> +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1}
> +
> +ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2}
> +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2}
> +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2}
> +
> +# Create a logical router and connect it to ls1
> +ovn-nbctl lr-add lr0
> +ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24
> +ovn-nbctl lsp-add ls1 ls1-lr0
> +ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \
> +    options:router-port=lr0-ls1 addresses=router
> +
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +ovs-vsctl -- add-port br-phys hv1-ext1 -- \
> +    set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \
> +    options:rxq_pcap=hv1/ext1-rx.pcap \
> +    ofport-request=2
> +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +
> +as hv2
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.2
> +ovs-vsctl -- add-port br-phys hv2-ext2 -- \
> +    set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \
> +    options:rxq_pcap=hv2/ext2-rx.pcap \
> +    ofport-request=2
> +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +
> +ovn-sbctl dump-flows > lflows_n.txt
> +
> +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and
> +# hv2 as requested-chassis option is not set and no localnet port added to ls1.
> +AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
> +wc -l], [0], [0
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> +])
> +
> +hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}')
> +
> +# The port_binding row for ls1-lp_ext1 should have empty chassis
> +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> +grep -v requested | grep chassis | awk '{print $3}')
> +
> +AT_CHECK([test $chassis == "[[]]"], [0], [])
> +
> +# Set the requested-chassis option for ls1-lp_ext1
> +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> +
> +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
> +# as no localnet port added to ls1 yet.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> +])
> +
> +# Add the localnet port to the logical switch ls1
> +ovn-nbctl lsp-add ls1 ln-public
> +ovn-nbctl lsp-set-addresses ln-public unknown
> +ovn-nbctl lsp-set-type ln-public localnet
> +ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys
> +
> +ln_public_key=$(ovn-sbctl list port_binding ln-public | grep  tunnel_key | \
> +awk '{print $3}')
> +
> +# The ls1-lp_ext1 should be bound to hv1
> +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> +grep -v requested | grep chassis | awk '{print $3}')
> +AT_CHECK([test $chassis == "$hv1_uuid"], [0], [])
> +
> +# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> +wc -l], [0], [3
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> +grep reg14=0x$ln_public_key | wc -l], [0], [1
> +])
> +
> +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> +])
> +
> +# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and
> +# hv2 as requested-chassis option is not set.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> +])
> +
> +as hv1
> +ovs-vsctl show
> +
> +# This shell function sends a DHCP request packet
> +# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP ...
> +test_dhcp() {
> +    local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5
> +    shift; shift; shift; shift; shift;
> +    if test $use_ip != 0; then
> +        src_ip=$1
> +        dst_ip=$2
> +        shift; shift;
> +    else
> +        src_ip=`ip_to_hex 0 0 0 0`
> +        dst_ip=`ip_to_hex 255 255 255 255`
> +    fi
> +    local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip}
> +    # udp header and dhcp header
> +    request=${request}0044004300fc0000
> +    request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac}
> +    # client hardware padding
> +    request=${request}00000000000000000000
> +    # server hostname
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    # boot file name
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> +    # dhcp magic cookie
> +    request=${request}63825363
> +    # dhcp message type
> +    request=${request}3501${dhcp_type}ff
> +
> +    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
> +    # total IP length will be the IP length of the request packet
> +    # (which is 272 in our case) + 8 (padding bytes) + (expected_dhcp_opts / 2)
> +    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
> +    udp_len=`expr $ip_len - 20`
> +    ip_len=$(printf "%x" $ip_len)
> +    udp_len=$(printf "%x" $udp_len)
> +    # $ip_len var will be in 3 digits i.e 134. So adding a '0' before $ip_len
> +    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
> +    # udp header and dhcp header.
> +    # $udp_len var will be in 3 digits. So adding a '0' before $udp_len
> +    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
> +    # your ip address
> +    reply=${reply}${offer_ip}
> +    # next server ip address, relay agent ip address, client mac address
> +    reply=${reply}0000000000000000${src_mac}
> +    # client hardware padding
> +    reply=${reply}00000000000000000000
> +    # server hostname
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    # boot file name
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> +    # dhcp magic cookie
> +    reply=${reply}63825363
> +    # dhcp message type
> +    local dhcp_reply_type=02
> +    if test $dhcp_type = 03; then
> +        dhcp_reply_type=05
> +    fi
> +    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
> +    echo $reply >> ext1_v4.expected
> +
> +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> +}
> +
> +
> +trim_zeros() {
> +    sed 's/\(00\)\{1,\}$//'
> +}
> +
> +# This shell function sends a DHCPv6 request packet
> +# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
> +# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
> +# packet should be received twice (one from ovn-controller and the other
> +# from the "ovs-ofctl monitor br-int resume").
> +test_dhcpv6() {
> +    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
> +    local req_pkt_in_expected=$6
> +    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
> +    # dst ip ff02::1:2
> +    request=${request}ff020000000000000000000000010002
> +    # udp header and dhcpv6 header
> +    request=${request}02220223002affff${msg_code}010203
> +    # Client identifier
> +    request=${request}0001000a00030001${src_mac}
> +    # IA-NA (Identity Association for Non Temporary Address)
> +    request=${request}0003000c0102030400000e1000001518
> +    shift; shift; shift; shift; shift;
> +
> +    local server_mac=000000100001
> +    local server_lla=fe80000000000000020000fffe100001
> +    local reply_code=07
> +    if test $msg_code = 01; then
> +        reply_code=02
> +    fi
> +    local msg_len=54
> +    if test $offer_ip = 1; then
> +        msg_len=28
> +    fi
> +    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
> +    reply=${reply}${server_lla}${src_lla}
> +
> +    # udp header and dhcpv6 header
> +    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
> +    # Client identifier
> +    reply=${reply}0001000a00030001${src_mac}
> +    # IA-NA
> +    if test $offer_ip != 1; then
> +        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
> +        reply=${reply}ffffffffffffffff
> +    fi
> +    # Server identifier
> +    reply=${reply}0002000a00030001${server_mac}
> +
> +    echo $reply | trim_zeros >> ext${inport}_v6.expected
> +    # The inport also receives the request packet since it is connected
> +    # to the br-phys.
> +    #echo $request >> ext${inport}_v6.expected
> +
> +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> +}
> +
> +reset_pcap_file() {
> +    local iface=$1
> +    local pcap_file=$2
> +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
> +options:rxq_pcap=dummy-rx.pcap
> +    rm -f ${pcap_file}*.pcap
> +    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
> +options:rxq_pcap=${pcap_file}-rx.pcap
> +}
> +
> +ip_to_hex() {
> +    printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
> +as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
> +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
> +
> +AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
> +as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
> +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
> +
> +# Send DHCPDISCOVER.
> +offer_ip=`ip_to_hex 10 0 0 6`
> +server_ip=`ip_to_hex 10 0 0 1`
> +server_mac=ff1000000001
> +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> +$expected_dhcp_opts
> +
> +# NXT_RESUMEs should be 1 in hv1.
> +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> +
> +# NXT_RESUMEs should be 0 in hv2.
> +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> +
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> +cat ext1_v4.expected | cut -c -48 > expout
> +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> +# Skipping the IPv4 checksum.
> +cat ext1_v4.expected | cut -c 53- > expout
> +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> +
> +# ovs-ofctl also resumes the packets and this causes other ports to receive
> +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> +reset_pcap_file hv1-ext1 hv1/ext1
> +rm -f ext1_v4.expected
> +rm -f ext1_v4.packets
> +
> +# Send DHCPv6 request
> +src_mac=f00000000003
> +src_lla=fe80000000000000f20000fffe000003
> +offer_ip=ae700000000000000000000000000006
> +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
> +
> +# NXT_RESUMEs should be 2 in hv1.
> +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> +
> +# NXT_RESUMEs should be 0 in hv2.
> +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> +
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> +sort > ext1_v6.packets
> +cat ext1_v6.expected | cut -c -120 > expout
> +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> +# Skipping the UDP checksum
> +cat ext1_v6.expected | cut -c 125- > expout
> +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> +
> +rm -f ext1_v6.expected
> +rm -f ext1_v6.packets
> +reset_pcap_file hv1-ext1 hv1/ext1
> +
> +# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
> +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
> +
> +hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
> +
> +# The ls1-lp_ext1 should be bound to hv2
> +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> +grep -v requested | grep chassis | awk '{print $3}')
> +AT_CHECK([test $chassis == "$hv2_uuid"], [0], [])
> +
> +# There should be OF flows for DHCPv4/v6 for the ls1-lp_ext1 port in hv2
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> +wc -l], [0], [3
> +])
> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> +grep reg14=0x$ln_public_key | wc -l], [0], [1
> +])
> +
> +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> +])
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> +grep controller | grep tp_src=546 | grep \
> +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> +grep reg14=0x$ln_public_key | wc -l], [0], [0
> +])
> +
> +# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
> +# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
> +# receiving the expected packet.
> +offer_ip=`ip_to_hex 10 0 0 6`
> +server_ip=`ip_to_hex 10 0 0 1`
> +server_mac=ff1000000001
> +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> +$expected_dhcp_opts
> +
> +# NXT_RESUMEs should be 2 in hv1.
> +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> +
> +# NXT_RESUMEs should be 1 in hv2.
> +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> +
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> +cat ext1_v4.expected | cut -c -48 > expout
> +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> +# Skipping the IPv4 checksum.
> +cat ext1_v4.expected | cut -c 53- > expout
> +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> +
> +# ovs-ofctl also resumes the packets and this causes other ports to receive
> +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> +reset_pcap_file hv1-ext1 hv1/ext1
> +rm -f ext1_v4.expected
> +
> +# Send DHCPv6 request again
> +src_mac=f00000000003
> +src_lla=fe80000000000000f20000fffe000003
> +offer_ip=ae700000000000000000000000000006
> +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
> +
> +# NXT_RESUMEs should be 2 in hv1.
> +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> +
> +# NXT_RESUMEs should be 2 in hv2.
> +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> +
> +as hv1
> +ovs-vsctl show
> +ovs-ofctl dump-flows br-int
> +
> +as hv2
> +ovs-vsctl show
> +ovs-ofctl dump-flows br-int
> +
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> +sort > ext1_v6.packets
> +cat ext1_v6.expected | cut -c -120 > expout
> +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> +# Skipping the UDP checksum
> +cat ext1_v6.expected | cut -c 125- > expout
> +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> +
> +rm -f ext1_v6.expected
> +rm -f ext1_v6.packets
> +
> +as hv1
> +ovs-vsctl show
> +reset_pcap_file hv1-ext1 hv1/ext1
> +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> +reset_pcap_file br-phys hv1/br-phys
> +
> +as hv2
> +ovs-vsctl show
> +reset_pcap_file hv2-ext2 hv2/ext2
> +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> +reset_pcap_file br-phys hv2/br-phys
> +
> +# From  ls1-lp_ext1, send ARP request for the router ip. The ARP
> +# response should come from the router pipeline of hv2.
> +ext1_mac=f00000000003
> +router_mac=a01000000001
> +ext1_ip=`ip_to_hex 10 0 0 6`
> +router_ip=`ip_to_hex 10 0 0 1`
> +arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
> +
> +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> +expected_response=${src_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
> +echo $expected_response > expout
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> +
> +# Verify that the response came from hv2
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> +
> +
> +# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
> +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> +
> +as hv1
> +ovs-vsctl show
> +reset_pcap_file hv1-ext1 hv1/ext1
> +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> +reset_pcap_file br-phys hv1/br-phys
> +
> +as hv2
> +ovs-vsctl show
> +reset_pcap_file hv2-ext2 hv2/ext2
> +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> +reset_pcap_file br-phys hv2/br-phys
> +
> +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> +
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> +
> +# Verify that the response didn't come from hv2
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> +AT_CHECK([cat ext1_arp_resp], [0], [])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +AT_CLEANUP
> +
>  AT_SETUP([ovn -- ovn-controller restart])
>  AT_SKIP_IF([test $HAVE_PYTHON = no])
>  ovn_start
> --
> 2.20.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
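
The ARP check near the end of the new test builds its request frame as a raw hex string. As an illustration only (this is not part of the patch), the construction can be sketched standalone. The field breakdown in the comments is an annotation added here; the helper and the values are the ones the test uses.

```shell
# Convert four decimal octets (passed as separate arguments) into
# 8 hex digits, as the ip_to_hex helper in the test does.
ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

ext1_mac=f00000000003
ext1_ip=$(ip_to_hex 10 0 0 6)
router_ip=$(ip_to_hex 10 0 0 1)

# Ethernet header: broadcast destination, source MAC, EtherType 0x0806 (ARP).
eth_hdr=ffffffffffff${ext1_mac}0806
# ARP header: htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4,
# opcode=1 (request).
arp_hdr=0001080006040001
# ARP body: sender MAC/IP, zeroed target MAC, target IP (the router IP).
arp_body=${ext1_mac}${ext1_ip}000000000000${router_ip}

arp_request=${eth_hdr}${arp_hdr}${arp_body}
echo "$arp_request"
```

The resulting string is what the test passes to "ovs-appctl netdev-dummy/receive" as the injected frame. Splitting it into eth_hdr, arp_hdr and arp_body is only for readability; the test writes it as a single assignment.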
Numan Siddique Jan. 17, 2019, 6:54 p.m. UTC | #2
Thanks for the review Han.
I will address them in v6.


On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:

> Hi Numan,
>
> With v5 the new test case "external logical port" fails.
> And please see more comments inlined.
>

Can you please share the testsuite.log? It's passing locally for me.

Thanks
Numan


>
> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> >
> > From: Numan Siddique <nusiddiq@redhat.com>
> >
> > In the case of OpenStack + OVN, when the VMs are booted on
> > hypervisors supporting SR-IOV nics, there are no OVS ports
> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> > Router Solicitation requests, the local ovn-controller
> > cannot reply to these packets. OpenStack Neutron dhcp agent
> > service needs to be run to serve these requests.
> >
> > With the new logical port type - 'external', OVN itself can
> > handle these requests avoiding the need to deploy any
> > external services like neutron dhcp agent.
> >
> > To make use of this feature, CMS has to
> >  - create a logical port for such VMs
> >  - set the type to 'external'
> >  - set requested-chassis="<chassis-name>" in the options
> >    column.
> >  - create a localnet port for the logical switch
> >  - configure the ovn-bridge-mappings option in the OVS db.
> >
> > When the ovn-controller running in that 'chassis', detects
> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> > flows. Since the packet enters the logical switch pipeline
> > via the localnet port, the inport register (reg14) is set
> > to the tunnel key of localnet port in the match conditions.
> >
> > In case the chassis goes down for some reason, it is the
> > responsibility of CMS to change the 'requested-chassis'
> > option to some other active chassis, so that it can serve
> > these requests.
> >
> > When the VM with the external port, sends an ARP request for
> > the router ips, only the chassis which has claimed the port,
> > will reply to the ARP requests. Rest of the chassis on
> > receiving these packets drop them in the ingress switch
> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
> > before S_SWITCH_IN_L2_LKUP.
> >
> > This would guarantee that only the chassis which has claimed
> > the external ports will run the router datapath pipeline.
> >
> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> > ---
> >
> > v4 -> v5
> > ------
> >   * Addressed review comments from Han Zhou.
> >
> > v3 -> v4
> > ------
> >   * Updated the documentation as per Han Zhou's suggestion.
> >
> > v2 -> v3
> > -------
> >   * Rebased
> >
> >  ovn/controller/binding.c        |  12 +
> >  ovn/controller/lflow.c          |  41 ++-
> >  ovn/controller/lflow.h          |   2 +
> >  ovn/controller/lport.c          |  26 ++
> >  ovn/controller/lport.h          |   5 +
> >  ovn/controller/ovn-controller.c |   6 +
> >  ovn/lib/ovn-util.c              |   1 +
> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> >  ovn/northd/ovn-northd.c         |  85 ++++-
> >  ovn/ovn-architecture.7.xml      |  78 +++++
> >  ovn/ovn-nb.xml                  |  47 +++
> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
> >  12 files changed, 848 insertions(+), 22 deletions(-)
> >
> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> > index 021ecddcf..64e605b92 100644
> > --- a/ovn/controller/binding.c
> > +++ b/ovn/controller/binding.c
> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >           * for them. */
> >          sset_add(local_lports, binding_rec->logical_port);
> >          our_chassis = false;
> > +    } else if (!strcmp(binding_rec->type, "external")) {
> > +        const char *chassis_id = smap_get(&binding_rec->options,
> > +                                          "requested-chassis");
> > +        our_chassis = chassis_id && (
> > +            !strcmp(chassis_id, chassis_rec->name) ||
> > +            !strcmp(chassis_id, chassis_rec->hostname));
> > +        if (our_chassis) {
> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> > +                               sbrec_port_binding_by_datapath,
> > +                               sbrec_port_binding_by_name,
> > +                               binding_rec->datapath, true, local_datapaths);
> > +        }
> >      }
> >
> >      if (our_chassis
> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> > index 8db81927e..98e8ed3b9 100644
> > --- a/ovn/controller/lflow.c
> > +++ b/ovn/controller/lflow.c
> > @@ -52,7 +52,10 @@ lflow_init(void)
> >  struct lookup_port_aux {
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> >      const struct sbrec_datapath_binding *dp;
> > +    const struct sbrec_chassis *chassis;
> >  };
> >
> >  struct condition_aux {
> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_logical_flow *,
> >      const struct hmap *local_datapaths,
> >      const struct sbrec_chassis *,
> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
> >      const struct sbrec_port_binding *pb
> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
> >      if (pb && pb->datapath == aux->dp) {
> > -        *portp = pb->tunnel_key;
> > -        return true;
> > +        if (strcmp(pb->type, "external")) {
> > +            *portp = pb->tunnel_key;
> > +            return true;
> > +        }
> > +        const char *chassis_id = smap_get(&pb->options,
> > +                                          "requested-chassis");
> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
> > +            const struct sbrec_port_binding *localnet_pb
> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> > +                                       aux->sbrec_port_binding_by_type,
> > +                                       aux->dp->tunnel_key, "localnet");
> > +            if (localnet_pb) {
> > +                *portp = localnet_pb->tunnel_key;
> > +                return true;
> > +            }
> > +        }
> > +        return false;
> >      }
> >
> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
> > @@ -144,6 +165,8 @@ add_logical_flows(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >      const struct sbrec_logical_flow_table *logical_flow_table,
> > @@ -183,6 +206,8 @@ add_logical_flows(
> >          consider_logical_flow(sbrec_chassis_by_name,
> >                                sbrec_multicast_group_by_name_datapath,
> >                                sbrec_port_binding_by_name,
> > +                              sbrec_port_binding_by_type,
> > +                              sbrec_datapath_binding_by_key,
> >                                lflow, local_datapaths,
> >                                chassis, &dhcp_opts, &dhcpv6_opts,
> &nd_ra_opts,
> >                                addr_sets, port_groups, active_tunnels,
> > @@ -200,6 +225,8 @@ consider_logical_flow(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_logical_flow *lflow,
> >      const struct hmap *local_datapaths,
> >      const struct sbrec_chassis *chassis,
> > @@ -292,7 +319,10 @@ consider_logical_flow(
> >          .sbrec_multicast_group_by_name_datapath
> >              = sbrec_multicast_group_by_name_datapath,
> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
> > -        .dp = lflow->logical_datapath
> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
> > +        .dp = lflow->logical_datapath,
> > +        .chassis = chassis
> >      };
> >      struct condition_aux cond_aux = {
> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> > @@ -463,6 +493,8 @@ void
> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >            struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >            const struct sbrec_logical_flow_table *logical_flow_table,
> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> >
> >      add_logical_flows(sbrec_chassis_by_name,
> >                        sbrec_multicast_group_by_name_datapath,
> > -                      sbrec_port_binding_by_name, dhcp_options_table,
> > +                      sbrec_port_binding_by_name,
> sbrec_port_binding_by_type,
> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
> >                        dhcpv6_options_table, logical_flow_table,
> >                        local_datapaths, chassis, addr_sets, port_groups,
> >                        active_tunnels, local_lport_ids, flow_table,
> group_table,
> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> > index d19338140..b2911e0eb 100644
> > --- a/ovn/controller/lflow.h
> > +++ b/ovn/controller/lflow.h
> > @@ -68,6 +68,8 @@ void lflow_init(void);
> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >                 struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >                 const struct sbrec_dhcp_options_table *,
> >                 const struct sbrec_dhcpv6_options_table *,
> >                 const struct sbrec_logical_flow_table *,
> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> > index cc5c5fbb2..9c827d9b0 100644
> > --- a/ovn/controller/lport.c
> > +++ b/ovn/controller/lport.c
> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >      return retval;
> >  }
> >
> > +const struct sbrec_port_binding *
> > +lport_lookup_by_type(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +                     uint64_t dp_key, const char *port_type)
> > +{
> > +    /* Lookup datapath corresponding to dp_key. */
> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
> > +        sbrec_datapath_binding_by_key, dp_key);
> > +    if (!db) {
> > +        return NULL;
> > +    }
> > +
> > +    /* Build key for an indexed lookup. */
> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
> > +            sbrec_port_binding_by_type);
> > +    sbrec_port_binding_index_set_datapath(pb, db);
> > +    sbrec_port_binding_index_set_type(pb, port_type);
> > +
> > +    const struct sbrec_port_binding *retval =
> sbrec_port_binding_index_find(
> > +            sbrec_port_binding_by_type, pb);
> > +
> > +    sbrec_port_binding_index_destroy_row(pb);
> > +
> > +    return retval;
> > +}
> > +
> >  const struct sbrec_datapath_binding *
> >  datapath_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >                         uint64_t dp_key)
> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> > index 7dcd5bee0..2d49792f6 100644
> > --- a/ovn/controller/lport.h
> > +++ b/ovn/controller/lport.h
> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> >      uint64_t dp_key, uint64_t port_key);
> >
> > +const struct sbrec_port_binding *lport_lookup_by_type(
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    uint64_t dp_key, const char *port_type);
> > +
> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t
> dp_key);
> >
> > diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> > index 4e9a5865f..5aab9142f 100644
> > --- a/ovn/controller/ovn-controller.c
> > +++ b/ovn/controller/ovn-controller.c
> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> >       * ports that have a Gateway_Chassis that point's to our own
> >       * chassis */
> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "chassisredirect");
> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
> >      if (chassis) {
> >          /* This should be mostly redundant with the other clauses for
> port
> >           * bindings, but it allows us to catch any ports that are
> assigned to
> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >                                    &sbrec_port_binding_col_datapath);
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > +                                  &sbrec_port_binding_col_type);
>
> This index is used with two columns: datapath_binding and type, so it
> should be created with both columns using create2.
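
A toy model of the point above, not OVS code: the struct and both helper names are hypothetical, and a linear scan stands in for the IDL index. It assumes that a lookup only compares the columns the index was created with, so a type-only index can return a localnet port from a different datapath. In the patch itself the fix would presumably be the two-column variant, ovsdb_idl_index_create2(), passed the datapath and type columns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a Port_Binding row; not the real sbrec struct. */
struct toy_port_binding {
    uint64_t datapath_key;   /* Logical datapath the port belongs to. */
    const char *type;        /* For example "localnet" or "external". */
    uint64_t tunnel_key;
};

/* Lookup keyed on 'type' only: the datapath is never compared, so the
 * first row of that type wins, whichever datapath it is on. */
static const struct toy_port_binding *
toy_lookup_by_type_only(const struct toy_port_binding *rows, size_t n,
                        const char *type)
{
    assert(rows != NULL && type != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(rows[i].type, type)) {
            return &rows[i];
        }
    }
    return NULL;
}

/* Lookup keyed on (datapath, type): both columns must match. */
static const struct toy_port_binding *
toy_lookup_by_dp_and_type(const struct toy_port_binding *rows, size_t n,
                          uint64_t dp_key, const char *type)
{
    assert(rows != NULL && type != NULL);
    for (size_t i = 0; i < n; i++) {
        if (rows[i].datapath_key == dp_key && !strcmp(rows[i].type, type)) {
            return &rows[i];
        }
    }
    return NULL;
}
```

With two logical switches that each have a localnet port, the type-only lookup cannot tell them apart, while the compound lookup returns the port on the requested datapath or NULL.
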
>
> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >
> &sbrec_datapath_binding_col_tunnel_key);
> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> >                              sbrec_chassis_by_name,
> >                              sbrec_multicast_group_by_name_datapath,
> >                              sbrec_port_binding_by_name,
> > +                            sbrec_port_binding_by_type,
> > +                            sbrec_datapath_binding_by_key,
> >
> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> >
> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> >
> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> > index aa03919bb..a9d4b8736 100644
> > --- a/ovn/lib/ovn-util.c
> > +++ b/ovn/lib/ovn-util.c
> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
> >      "localport",
> >      "router",
> >      "vtep",
> > +    "external",
> >  };
> >
> >  bool
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index 392a5efc9..c8883d60d 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> > @@ -626,7 +626,8 @@ nd_na_router {
> >      <p>
> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
> > -      and similarly for DHCPv6 options.
> > +      and similarly for DHCPv6 options. This table also adds flows for
> the
> > +      logical ports of type <code>external</code>.
> >      </p>
> >
> >      <ul>
> > @@ -827,7 +828,39 @@ output;
> >        </li>
> >      </ul>
> >
> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> > +    <h3>Ingress table 16 External ports</h3>
> > +
> > +    <p>
> > +      Traffic from the <code>external</code> logical ports enters the ingress
> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
> > +      below logical flows to handle the traffic from these ports.
> > +    </p>
> > +
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          A priority-100 flow is added for each <code>external</code>
> logical
> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
> > +          request to the router IP(s) (of the logical switch) which
> matches
> > +          on the <code>inport</code> of the <code>external</code>
> logical port
> > +          and the valid <code>eth.src</code> address(es) of the
> > +          <code>external</code> logical port.
> > +        </p>
> > +
> > +        <p>
> > +          This flow guarantees that the ARP/NS request to the router IP
> > +          address from the external ports is answered only by the chassis
> > +          which has claimed these external ports. All the other chassis
> > +          drop these packets.
> > +        </p>
> > +      </li>
> > +
> > +      <li>
> > +        A priority-0 flow that matches all packets and advances to table 17.
> > +      </li>
> > +    </ul>
> > +
> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> >
> >      <p>
> >        This table implements switching behavior.  It contains these
> logical
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index 3fd8a8757..87208c6c1 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -119,7 +119,8 @@ enum ovn_stage {
> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
> "ls_in_dhcp_response") \
> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")
>   \
> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
> "ls_in_dns_response")  \
> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")
>    \
> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
> "ls_in_external_port") \
> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")
>    \
> >
>   \
> >      /* Logical switch egress stages. */
>    \
> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")
>    \
> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port
> *lsp)
> >      return !lsp->up || *lsp->up;
> >  }
> >
> > +static bool
> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> > +{
> > +    return !strcmp(nbsp->type, "external");
> > +}
> > +
> >  static bool
> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >                      struct ds *options_action, struct ds
> *response_action,
> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >           *  - port type is localport
> >           */
> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> > -            strcmp(op->nbsp->type, "localport")) {
> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
>
> Sorry that I missed this in the last review. The && condition is a problem:
> it causes ARP responder flows to be added for all lports that are not
> external. I think it should be || here.
>
> >              continue;
> >          }
> >
> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths,
> struct hmap *ports,
> >              continue;
> >          }
> >
> > +        bool is_external = lsp_is_external(op->nbsp);
> > +        if (is_external && !op->od->localnet_port) {
> > +            /* If it's an external port and there is no localnet port
> > +             * ignore it. */
> > +            continue;
> > +        }
> > +
> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >                      ds_put_format(
> >                          &match, "inport == %s && eth.src == %s && "
> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
> 255.255.255.255 && "
> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
> > -                        op->lsp_addrs[i].ea_s);
> > +                        "udp.src == 68 && udp.dst == 67",
> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>
> No change here?
> >
> >                      ovn_lflow_add(lflows, op->od,
> S_SWITCH_IN_DHCP_OPTIONS,
> >                                    100, ds_cstr(&match),
> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >      /* Ingress table 12 and 13: DHCP options and response, by default
> goto
> >       * next. (priority 0).
> >       * Ingress table 14 and 15: DNS lookup and response, by default
> goto next.
> > -     * (priority 0).*/
> > +     * (priority 0).
> > +     * Ingress table 16 - External port handling, by default goto next.
> > +     * (priority 0). */
> >
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (!od->nbs) {
> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths,
> struct hmap *ports,
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1",
> "next;");
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1",
> "next;");
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1",
> "next;");
> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1",
> "next;");
> >      }
> >
> > -    /* Ingress table 16: Destination lookup, broadcast and multicast
> handling
> > +    HMAP_FOR_EACH (op, key_node, ports) {
> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> > +           continue;
> > +        }
> > +
> > +        /* Table 16: External port. Drop ARP request for router ips from
> > +         * external ports on chassis not binding those ports.
> > +         * This makes the router pipeline run only on the chassis
> > +         * binding the external ports. */
> > +
> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
> > +                struct ovn_port *rp = op->od->router_ports[j];
> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv4_addrs;
> > +                         l++) {
> > +                        ds_clear(&match);
> > +                        ds_put_cstr(&match, "ip4");
> > +                        ds_put_format(
> > +                            &match, "inport == %s && eth.src == %s"
> > +                            " && !is_chassis_resident(%s)"
> > +                            " && arp.tpa == %s && arp.op == 1",
> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>
> I believe the inport should match the localnet port's json_key here,
> since it is coming from a localnet port.
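
A minimal sketch of what that could look like, using snprintf() in place of the ds_put_format() call so that it stands alone. The helper name and its parameters are hypothetical; the assumption is that the localnet port's json_key goes into "inport" while is_chassis_resident() keeps referring to the external port:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats the ARP drop match for an external port.  'inport' uses the
 * localnet port's key, since that is where the packet enters the
 * pipeline; is_chassis_resident() still names the external port.
 * Returns the snprintf() result. */
static int
toy_format_arp_drop_match(char *buf, size_t size,
                          const char *localnet_json_key,
                          const char *ext_json_key,
                          const char *eth_src, const char *router_ip)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size,
                    "inport == %s && eth.src == %s"
                    " && !is_chassis_resident(%s)"
                    " && arp.tpa == %s && arp.op == 1",
                    localnet_json_key, eth_src, ext_json_key, router_ip);
}
```

The IPv6 NS match would change in the same way: only the first argument, the inport key, is swapped.
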
>
> > +                            rp->lsp_addrs[k].ipv4_addrs[l].addr_s);
> > +                        ovn_lflow_add(lflows, op->od,
> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> > +                                      ds_cstr(&match), "drop;");
> > +                    }
> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv6_addrs;
> > +                         l++) {
> > +                        ds_clear(&match);
> > +                        ds_put_format(
> > +                            &match, "inport == %s && eth.src == %s"
> > +                            " && !is_chassis_resident(%s)"
> > +                            " && nd_ns && ip6.dst == {%s, %s} && "
> > +                            "nd.target == %s",
> > +                            op->json_key, op->lsp_addrs[i].ea_s,
> op->json_key,
>
> Same as above.
>
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s,
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].sn_addr_s,
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s);
> > +                        ovn_lflow_add(lflows, op->od,
> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> > +                                      ds_cstr(&match), "drop;");
> > +                    }
> > +                }
> > +            }
> > +        }
> > +    }
> > +    /* Ingress table 17: Destination lookup, broadcast and multicast
> handling
> >       * (priority 100). */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> >          if (!op->nbsp) {
> > @@ -4448,9 +4513,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >                        "outport = \""MC_FLOOD"\"; output;");
> >      }
> >
> > -    /* Ingress table 16: Destination lookup, unicast handling (priority
> 50), */
> > +    /* Ingress table 17: Destination lookup, unicast handling (priority
> 50), */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp) {
> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
> >              continue;
> >          }
> >
> > @@ -4567,7 +4632,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >          }
> >      }
> >
> > -    /* Ingress table 16: Destination lookup for unknown MACs (priority
> 0). */
> > +    /* Ingress table 17: Destination lookup for unknown MACs (priority
> 0). */
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (!od->nbs) {
> >              continue;
> > @@ -4602,7 +4667,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >       * Priority 150 rules drop packets to disabled logical ports, so
> that they
> >       * don't even receive multicast or broadcast packets. */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp) {
> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
> >              continue;
> >          }
> >
> > diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> > index 3936e6016..405975b7b 100644
> > --- a/ovn/ovn-architecture.7.xml
> > +++ b/ovn/ovn-architecture.7.xml
> > @@ -1678,6 +1678,84 @@
> >      </li>
> >    </ol>
> >
> > +  <h2>Native OVN services for external logical ports</h2>
> > +
> > +  <p>
> > +    To support OVN native services (like DHCP/IPv6 RA/DNS lookup) to the
> > +    cloud resources which are external, OVN supports
> <code>external</code>
> > +    logical ports.
> > +  </p>
> > +
> > +  <p>
> > +    Below are some of the use cases where <code>external</code> ports
> can be
> > +    used.
> > +  </p>
> > +
> > +  <ul>
> > +    <li>
> > +      VMs connected to SR-IOV nics - Traffic from these VMs bypasses the
> > +      kernel stack and local <code>ovn-controller</code> does not bind these
> > +      ports and cannot serve the native services.
> > +    </li>
> > +    <li>
> > +      When CMS supports provisioning baremetal servers.
> > +    </li>
> > +  </ul>
> > +
> > +  <p>
> > +    OVN will provide the native services if CMS has done the below
> > +    configuration in the <dfn>OVN Northbound Database</dfn>.
> > +  </p>
> > +
> > +  <ul>
> > +    <li>
> > +      A row is created in <code>Logical_Switch_Port</code>, configuring
> the
> > +      <ref column="addresses" table="Logical_Switch_Port" db="OVN_NB"/>
> column
> > +      and setting the <ref column="type" table="Logical_Switch_Port"
> > +      db="OVN_NB"/> to <code>external</code>.
> > +    </li>
> > +
> > +    <li>
> > +      <ref column="options:requested-chassis"
> table="Logical_Switch_Port"
> > +      db="OVN_NB"/> column is configured to a desired chassis.
> > +    </li>
> > +
> > +    <li>
> > +      The chassis on which this logical port is requested has the
> > +      <code>ovn-bridge-mappings</code> configured and has proper L2
> > +      connectivity so that it can receive the DHCP and other related
> request
> > +      packets from these external resources.
> > +    </li>
> > +
> > +    <li>
> > +      The Logical_Switch of this port has a <code>localnet</code> port.
> > +    </li>
> > +
> > +    <li>
> > +      Native OVN services are enabled by configuring the DHCP and other
> > +      options like the way it is done for the normal logical ports.
> > +    </li>
> > +  </ul>
> > +
> > +  <p>
> > +    OVN doesn't support HA for these <code>external</code> ports. In case
> > +    the <code>ovn-controller</code> running on the requested chassis goes down,
> > +    it is the responsibility of CMS to reschedule these <code>external</code>
> > +    ports to other active chassis.
> > +  </p>
> > +
> > +  <p>
> > +    It is recommended to request the same chassis for all the external
> ports
> > +    of a logical switch. Otherwise, the physical switch might see MAC
> flap
> > +    issue when different chassis provide the native services. For
> example when
> > +    supporting native DHCPv4 service, DHCPv4 server mac (configured in
> > +    <ref column="options:server_mac" table="DHCP_Options" db="OVN_NB"/>
> column
> > +    in table <ref table="DHCP_Options"/>)
> > +    originating from different ports can cause MAC flap issue. The MAC
> of the
> > +    logical router IP(s) can also flap if the same chassis is not
> requested for
> > +    all the external ports of a logical switch.
> > +  </p>
> > +
> >    <h1>Security</h1>
> >
> >    <h2>Role-Based Access Controls for the Soutbound DB</h2>
> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> > index 6d6fb055a..fdf9adbfa 100644
> > --- a/ovn/ovn-nb.xml
> > +++ b/ovn/ovn-nb.xml
> > @@ -353,6 +353,53 @@
> >            <dd>
> >              A port to a logical switch on a VTEP gateway.
> >            </dd>
> > +
> > +          <dt><code>external</code></dt>
> > +          <dd>
> > +            <p>
> > +              Represents a logical port which is external and not having
> > +              an OVS port in the integration bridge.
> > +              <code>OVN</code> will never receive any traffic from this
> port or
> > +              send any traffic to this port. <code>OVN</code> can
> support
> > +              native services like DHCPv4/DHCPv6/DNS for this port.
> > +              If <ref column="options:requested-chassis"/> is defined,
> > +              <code>ovn-controller</code> running in that chassis will
> bind
> > +              this port to provide these native services. It is
> expected that
> > +              this port belong to a bridged logical switch
> > +              (with a <code>localnet</code> port).
> > +            </p>
> > +
> > +            <p>
> > +              It is recommended to request the same chassis for all the
> > +              external ports of a logical switch. Otherwise, the
> physical
> > +              switch might see MAC flap issue when different chassis
> provide
> > +              the native services. For example when supporting native
> DHCPv4
> > +              service, DHCPv4 server mac (configured in
> > +              <ref column="options:server_mac" table="DHCP_Options"
> > +              db="OVN_NB"/> column in table <ref table="DHCP_Options"/>)
> > +              originating from different ports can cause MAC flap issue.
> > +              The MAC of the logical router IP(s) can also flap if the
> > +              same chassis is not requested for all the external ports
> > +              of a logical switch.
> > +            </p>
> > +
> > +            <p>
> > +              Below are some of the use cases where
> <code>external</code>
> > +              ports can be used.
> > +            </p>
> > +
> > +            <ul>
> > +              <li>
> > +                VMs connected to SR-IOV nics - Traffic from these VMs bypasses
> > +                the kernel stack and local <code>ovn-controller</code> does not
> > +                bind these ports and cannot serve the native services.
> > +              </li>
> > +
> > +              <li>
> > +                When CMS supports provisioning baremetal servers.
> > +              </li>
> > +            </ul>
> > +          </dd>
> >          </dl>
> >        </column>
> >      </group>
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 8bada3241..94c774e8b 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -9594,9 +9594,9 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int
> table=32 | grep active_backup | gre
> >  sleep 3 # let BFD sessions settle so we get the right flows on the
> right chassis
> >
> >  # make sure that flows for handling the outside router port reside on
> gw1
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # make sure ARP responder flows for outside router port reside on gw1
> too
> > @@ -9686,9 +9686,9 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find
> Interface name=ovn-hv1-0],[0],
> >  sleep 3  # let BFD sessions settle so we get the right flows on the
> right chassis
> >
> >  # make sure that flows for handling the outside router port reside on
> gw2 now
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # disconnect GW2 from the network, GW1 should take over
> > @@ -9700,9 +9700,9 @@ sleep 4
> >  bfd_dump
> >
> >  # make sure that flows for handling the outside router port reside on
> gw2 now
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # check that the chassis redirect port has been reclaimed by the gw1
> chassis
> > @@ -11619,6 +11619,524 @@ as hv2 start_daemon ovn-controller
> >  OVN_CLEANUP([hv1],[hv2])
> >  AT_CLEANUP
> >
> > +AT_SETUP([ovn -- external logical port])
> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > +ovn_start
> > +
> > +net_add n1
> > +sim_add hv1
> > +sim_add hv2
> > +
> > +ovn-nbctl ls-add ls1
> > +ovn-nbctl lsp-add ls1 ls1-lp1 \
> > +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4"
> > +
> > +# Add a couple of external logical ports
> > +ovn-nbctl lsp-add ls1 ls1-lp_ext1 \
> > +-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
> > +ovn-nbctl lsp-set-port-security ls1-lp_ext1 \
> > +"f0:00:00:00:00:03 10.0.0.6 ae70::6"
> > +ovn-nbctl lsp-set-type ls1-lp_ext1 external
> > +
> > +ovn-nbctl lsp-add ls1 ls1-lp_ext2 \
> > +-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7"
> > +ovn-nbctl lsp-set-port-security ls1-lp_ext2 \
> > +"f0:00:00:00:00:04 10.0.0.7 ae70::8"
> > +ovn-nbctl lsp-set-type ls1-lp_ext2 external
> > +
> > +d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
> > +options="\"server_id\"=\"10.0.0.1\"
> \"server_mac\"=\"ff:10:00:00:00:01\" \
> > +\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")"
> > +
> > +d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \
> > +options="\"server_id\"=\"00:00:00:10:00:01\"")"
> > +
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1}
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1}
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1}
> > +
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2}
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2}
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2}
> > +
> > +# Create a logical router and connect it to ls1
> > +ovn-nbctl lr-add lr0
> > +ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24
> > +ovn-nbctl lsp-add ls1 ls1-lr0
> > +ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \
> > +    options:router-port=lr0-ls1 addresses=router
> > +
> > +as hv1
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.1
> > +ovs-vsctl -- add-port br-phys hv1-ext1 -- \
> > +    set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \
> > +    options:rxq_pcap=hv1/ext1-rx.pcap \
> > +    ofport-request=2
> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +
> > +as hv2
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.2
> > +ovs-vsctl -- add-port br-phys hv2-ext2 -- \
> > +    set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \
> > +    options:rxq_pcap=hv2/ext2-rx.pcap \
> > +    ofport-request=2
> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +
> > +ovn-sbctl dump-flows > lflows_n.txt
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in
> hv1 and
> > +# hv2 as requested-chassis option is not set and no localnet port added
> to ls1.
> > +AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
> > +wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}')
> > +
> > +# The port_binding row for ls1-lp_ext1 should have empty chassis
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +
> > +AT_CHECK([test $chassis == "[[]]"], [0], [])
> > +
> > +# Set the requested-chassis option for ls1-lp_ext1
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
> > +# as no localnet port added to ls1 yet.
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +# Add the localnet port to the logical switch ls1
> > +ovn-nbctl lsp-add ls1 ln-public
> > +ovn-nbctl lsp-set-addresses ln-public unknown
> > +ovn-nbctl lsp-set-type ln-public localnet
> > +ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys
> > +
> > +ln_public_key=$(ovn-sbctl list port_binding ln-public | grep tunnel_key | \
> > +awk '{print $3}')
> > +
> > +# The ls1-lp_ext1 should be bound to hv1
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +AT_CHECK([test $chassis == "$hv1_uuid"], [0], [])
> > +
> > +# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> > +wc -l], [0], [3
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
> > +])
> > +
> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and
> > +# hv2 as requested-chassis option is not set.
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> > +])
> > +
> > +as hv1
> > +ovs-vsctl show
> > +
> > +# This shell function sends a DHCP request packet
> > +# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP ...
> > +test_dhcp() {
> > +    local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5
> > +    shift; shift; shift; shift; shift;
> > +    if test $use_ip != 0; then
> > +        src_ip=$1
> > +        dst_ip=$2
> > +        shift; shift;
> > +    else
> > +        src_ip=`ip_to_hex 0 0 0 0`
> > +        dst_ip=`ip_to_hex 255 255 255 255`
> > +    fi
> > +    local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip}
> > +    # udp header and dhcp header
> > +    request=${request}0044004300fc0000
> > +    request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac}
> > +    # client hardware padding
> > +    request=${request}00000000000000000000
> > +    # server hostname
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    # boot file name
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    # dhcp magic cookie
> > +    request=${request}63825363
> > +    # dhcp message type
> > +    request=${request}3501${dhcp_type}ff
> > +
> > +    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
> > +    # total IP length will be the IP length of the request packet
> > +    # (which is 272 in our case) + 8 (padding bytes) + (expected_dhcp_opts / 2)
> > +    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
> > +    udp_len=`expr $ip_len - 20`
> > +    ip_len=$(printf "%x" $ip_len)
> > +    udp_len=$(printf "%x" $udp_len)
> > +    # $ip_len var will be in 3 digits i.e 134. So adding a '0' before $ip_len
> > +    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
> > +    # udp header and dhcp header.
> > +    # $udp_len var will be in 3 digits. So adding a '0' before $udp_len
> > +    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
> > +    # your ip address
> > +    reply=${reply}${offer_ip}
> > +    # next server ip address, relay agent ip address, client mac address
> > +    reply=${reply}0000000000000000${src_mac}
> > +    # client hardware padding
> > +    reply=${reply}00000000000000000000
> > +    # server hostname
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    # boot file name
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    # dhcp magic cookie
> > +    reply=${reply}63825363
> > +    # dhcp message type
> > +    local dhcp_reply_type=02
> > +    if test $dhcp_type = 03; then
> > +        dhcp_reply_type=05
> > +    fi
> > +    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
> > +    echo $reply >> ext1_v4.expected
> > +
> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> > +}
> > +
> > +
> > +trim_zeros() {
> > +    sed 's/\(00\)\{1,\}$//'
> > +}
> > +
> > +# This shell function sends a DHCPv6 request packet
> > +# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
> > +# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
> > +# packet should be received twice (one from ovn-controller and the other
> > +# from the "ovs-ofctl monitor br-int resume"
> > +test_dhcpv6() {
> > +    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
> > +    local req_pkt_in_expected=$6
> > +    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
> > +    # dst ip ff02::1:2
> > +    request=${request}ff020000000000000000000000010002
> > +    # udp header and dhcpv6 header
> > +    request=${request}02220223002affff${msg_code}010203
> > +    # Client identifier
> > +    request=${request}0001000a00030001${src_mac}
> > +    # IA-NA (Identity Association for Non Temporary Address)
> > +    request=${request}0003000c0102030400000e1000001518
> > +    shift; shift; shift; shift; shift;
> > +
> > +    local server_mac=000000100001
> > +    local server_lla=fe80000000000000020000fffe100001
> > +    local reply_code=07
> > +    if test $msg_code = 01; then
> > +        reply_code=02
> > +    fi
> > +    local msg_len=54
> > +    if test $offer_ip = 1; then
> > +        msg_len=28
> > +    fi
> > +    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
> > +    reply=${reply}${server_lla}${src_lla}
> > +
> > +    # udp header and dhcpv6 header
> > +    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
> > +    # Client identifier
> > +    reply=${reply}0001000a00030001${src_mac}
> > +    # IA-NA
> > +    if test $offer_ip != 1; then
> > +        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
> > +        reply=${reply}ffffffffffffffff
> > +    fi
> > +    # Server identifier
> > +    reply=${reply}0002000a00030001${server_mac}
> > +
> > +    echo $reply | trim_zeros >> ext${inport}_v6.expected
> > +    # The inport also receives the request packet since it is connected
> > +    # to the br-phys.
> > +    #echo $request >> ext${inport}_v6.expected
> > +
> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> > +}
> > +
> > +reset_pcap_file() {
> > +    local iface=$1
> > +    local pcap_file=$2
> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
> > +options:rxq_pcap=dummy-rx.pcap
> > +    rm -f ${pcap_file}*.pcap
> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
> > +options:rxq_pcap=${pcap_file}-rx.pcap
> > +}
> > +
> > +ip_to_hex() {
> > +    printf "%02x%02x%02x%02x" "$@"
> > +}
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
> > +as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
> > +as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
> > +
> > +# Send DHCPDISCOVER.
> > +offer_ip=`ip_to_hex 10 0 0 6`
> > +server_ip=`ip_to_hex 10 0 0 1`
> > +server_mac=ff1000000001
> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> > +$expected_dhcp_opts
> > +
> > +# NXT_RESUMEs should be 1 in hv1.
> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 0 in hv2.
> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> > +cat ext1_v4.expected | cut -c -48 > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> > +# Skipping the IPv4 checksum.
> > +cat ext1_v4.expected | cut -c 53- > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> > +
> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +rm -f ext1_v4.expected
> > +rm -f ext1_v4.packets
> > +
> > +# Send DHCPv6 request
> > +src_mac=f00000000003
> > +src_lla=fe80000000000000f20000fffe000003
> > +offer_ip=ae700000000000000000000000000006
> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 0 in hv2.
> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> > +sort > ext1_v6.packets
> > +cat ext1_v6.expected | cut -c -120 > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> > +# Skipping the UDP checksum
> > +cat ext1_v6.expected | cut -c 125- > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> > +
> > +rm -f ext1_v6.expected
> > +rm -f ext1_v6.packets
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +
> > +# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
> > +
> > +hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
> > +
> > +# The ls1-lp_ext1 should be bound to hv2
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +AT_CHECK([test $chassis == "$hv2_uuid"], [0], [])
> > +
> > +# There should be OF flows for DHCP4/v6 for the ls1-lp_ext1 port in hv2
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> > +wc -l], [0], [3
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
> > +])
> > +
> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [0
> > +])
> > +
> > +# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
> > +# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
> > +# receiving the expected packet.
> > +offer_ip=`ip_to_hex 10 0 0 6`
> > +server_ip=`ip_to_hex 10 0 0 1`
> > +server_mac=ff1000000001
> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> > +$expected_dhcp_opts
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 1 in hv2.
> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> > +cat ext1_v4.expected | cut -c -48 > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> > +# Skipping the IPv4 checksum.
> > +cat ext1_v4.expected | cut -c 53- > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> > +
> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +rm -f ext1_v4.expected
> > +
> > +# Send DHCPv6 request again
> > +src_mac=f00000000003
> > +src_lla=fe80000000000000f20000fffe000003
> > +offer_ip=ae700000000000000000000000000006
> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 2 in hv2.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +as hv1
> > +ovs-vsctl show
> > +ovs-ofctl dump-flows br-int
> > +
> > +as hv2
> > +ovs-vsctl show
> > +ovs-ofctl dump-flows br-int
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> > +sort > ext1_v6.packets
> > +cat ext1_v6.expected | cut -c -120 > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> > +# Skipping the UDP checksum
> > +cat ext1_v6.expected | cut -c 125- > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> > +
> > +rm -f ext1_v6.expected
> > +rm -f ext1_v6.packets
> > +
> > +as hv1
> > +ovs-vsctl show
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> > +reset_pcap_file br-phys hv1/br-phys
> > +
> > +as hv2
> > +ovs-vsctl show
> > +reset_pcap_file hv2-ext2 hv2/ext2
> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> > +reset_pcap_file br-phys hv2/br-phys
> > +
> > +# From  ls1-lp_ext1, send ARP request for the router ip. The ARP
> > +# response should come from the router pipeline of hv2.
> > +ext1_mac=f00000000003
> > +router_mac=a01000000001
> > +ext1_ip=`ip_to_hex 10 0 0 6`
> > +router_ip=`ip_to_hex 10 0 0 1`
> > +arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
> > +
> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> > +expected_response=${src_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
> > +echo $expected_response > expout
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +# Verify that the response came from hv2
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +
> > +# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> > +
> > +as hv1
> > +ovs-vsctl show
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> > +reset_pcap_file br-phys hv1/br-phys
> > +
> > +as hv2
> > +ovs-vsctl show
> > +reset_pcap_file hv2-ext2 hv2/ext2
> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> > +reset_pcap_file br-phys hv2/br-phys
> > +
> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +# Verify that the response didn't come from hv2
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [])
> > +
> > +OVN_CLEANUP([hv1],[hv2])
> > +AT_CLEANUP
> > +
> >  AT_SETUP([ovn -- ovn-controller restart])
> >  AT_SKIP_IF([test $HAVE_PYTHON = no])
> >  ovn_start
> > --
> > 2.20.1
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
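For readers following the quoted test, the expected_dhcp_opts string passed to
test_dhcp is a plain sequence of DHCPv4 options, each encoded as one byte of
option code, one byte of length, then the value. The sketch below decodes that
string and reproduces the length arithmetic from the reply-building code.
decode_dhcp_opts and reply_lengths are hypothetical helper names used only for
illustration; they are not part of tests/ovn.at.

```shell
# Hypothetical helper (not in tests/ovn.at): walk a hex string of DHCPv4
# options and print each (code, length, value) triple.
decode_dhcp_opts() {
    rest=$1
    while test -n "$rest"; do
        code=$(printf '%d' "0x$(echo "$rest" | cut -c 1-2)")
        len=$(printf '%d' "0x$(echo "$rest" | cut -c 3-4)")
        end=$((4 + len * 2))          # 2 hex digits per byte
        value=$(echo "$rest" | cut -c 5-"$end")
        echo "option $code len $len value $value"
        rest=$(echo "$rest" | cut -c $((end + 1))-)
    done
}

# Hypothetical helper: mirror the ip_len/udp_len computation in test_dhcp.
# $1 is the number of hex characters in the options string.
reply_lengths() {
    ip_len=$((280 + $1 / 2))
    printf '%x %x\n' "$ip_len" $((ip_len - 20))
}

expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
decode_dhcp_opts "$expected_dhcp_opts"
# option 51 len 4 value 00000e10   (lease time, 0xe10 = 3600 seconds)
# option 1 len 4 value ffffff00    (netmask 255.255.255.0)
# option 3 len 4 value 0a000001    (router 10.0.0.1)
# option 54 len 4 value 0a000001   (server identifier 10.0.0.1)

reply_lengths "${#expected_dhcp_opts}"
# 130 11c
```

Both lengths come out as three hex digits, which is why the quoted code
prepends a '0' to ${ip_len} and ${udp_len} to fill the two-byte header fields.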
Han Zhou Jan. 17, 2019, 7:01 p.m. UTC | #3
On Thu, Jan 17, 2019 at 10:54 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
> Thanks for the review Han.
> I will address them in v6.
>
>
> On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> Hi Numan,
>>
>> With v5 the new test case "external logical port" fails.
>> And please see more comments inlined.
>
>
> Can you please share the testsuite.log ? It's passing locally for me.
>
> Thanks
> Numan
>
Here it is:
----------- 8>< ------------------------------------------------ ><8 ----------
#                             -*- compilation -*-
2761. ovn.at:11622: testing ovn -- external logical port ...
creating ovn-sb database
creating ovn-nb database
starting ovn-northd
starting backup ovn-northd
adding simulator 'main'
adding simulator 'hv1'
adding simulator 'hv2'
./ovn.at:11691: ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
wc -l
./ovn.at:11694: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | wc -l
./ovn.at:11697: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | wc -l
./ovn.at:11700: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l
./ovn.at:11704: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l
./ovn.at:11715: test $chassis == "[]"
./ovn.at:11722: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | wc -l
./ovn.at:11725: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | wc -l
./ovn.at:11728: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l
./ovn.at:11732: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l
./ovn.at:11749: test $chassis == "$hv1_uuid"
./ovn.at:11752: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
wc -l
./ovn.at:11756: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
grep reg14=0x$ln_public_key | wc -l
./ovn.at:11763: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.06" | wc -l
./ovn.at:11766: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l
./ovn.at:11773: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.07" | wc -l
./ovn.at:11776: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep "0a.00.00.07" | wc -l
./ovn.at:11779: as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l
./ovn.at:11783: as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
grep controller | grep tp_src=546 | grep \
"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l
693ddea5-e6ec-437c-9310-18431f29ff5c
    Bridge br-int
        fail_mode: secure
        Port "ovn-hv2-0"
            Interface "ovn-hv2-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="192.168.0.2"}
        Port patch-br-int-to-ln-public
            Interface patch-br-int-to-ln-public
                type: patch
                options: {peer=patch-ln-public-to-br-int}
        Port br-int
            Interface br-int
                type: internal
    Bridge br-phys
        Port br-phys
            Interface br-phys
                type: internal
                options: {rxq_pcap="/home/hzhou/src/ovs/tests/testsuite.dir/2761/hv1/br-phys-rx.pcap", tx_pcap="/home/hzhou/src/ovs/tests/testsuite.dir/2761/hv1/br-phys-tx.pcap"}
        Port "hv1-ext1"
            Interface "hv1-ext1"
                options: {rxq_pcap="hv1/ext1-rx.pcap", tx_pcap="hv1/ext1-tx.pcap"}
        Port patch-ln-public-to-br-int
            Interface patch-ln-public-to-br-int
                type: patch
                options: {peer=patch-br-int-to-ln-public}
        Port "br-phys_n1"
            Interface "br-phys_n1"
                options: {rxq_pcap="/home/hzhou/src/ovs/tests/testsuite.dir/2761/hv1/br-phys_n1-rx.pcap", stream="unix:/home/hzhou/src/ovs/tests/testsuite.dir/2761/main/hv1_br-phys.sock", tx_pcap="/home/hzhou/src/ovs/tests/testsuite.dir/2761/hv1/br-phys_n1-tx.pcap"}
ovn.at:11950: waiting until test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`...
ovn.at:11950: wait succeeded immediately
ovn.at:11953: waiting until test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`...
ovn.at:11953: wait succeeded immediately
./ovn.at:11957: cat ext1_v4.packets | cut -c -48
./ovn.at:11960: cat ext1_v4.packets | cut -c 53-
ovn.at:11975: waiting until test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`...
ovn.at:11975: wait succeeded immediately
ovn.at:11978: waiting until test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`...
ovn.at:11978: wait succeeded immediately
./ovn.at:11983: cat ext1_v6.packets | cut -c -120
./ovn.at:11986: cat ext1_v6.packets | cut -c 125-
./ovn.at:12000: test $chassis == "$hv2_uuid"
./ovn.at:12000: exit code was 1, expected 0
ofctl_monitor0_hv1.log:
> NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=286 reg11=0x2,reg12=0x1,reg13=0x5,reg14=0x5,metadata=0x1,in_port=2 (via action) data_len=286 (unbuffered)
>  userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.63.0a.00.00.06.33.04.00.00.0e.10.01.04.ff.ff.ff.00.03.04.0a.00.00.01.36.04.0a.00.00.01
>  continuation.bridge=2c65dbbc-a986-4343-a987-d79737ced287
>  continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,21)
> udp,vlan_tci=0x0000,dl_src=f0:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff,nw_src=0.0.0.0,nw_dst=255.255.255.255,nw_tos=16,nw_ecn=0,nw_ttl=128,tp_src=68,tp_dst=67 udp_csum:0
> send: NXT_RESUME (xid=0x0): cookie=0x0 total_len=286 reg11=0x2,reg12=0x1,reg13=0x5,reg14=0x5,metadata=0x1,in_port=2 (via action) data_len=286 (unbuffered)
>  continuation.bridge=2c65dbbc-a986-4343-a987-d79737ced287
>  continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,21)
> udp,vlan_tci=0x0000,dl_src=f0:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff,nw_src=0.0.0.0,nw_dst=255.255.255.255,nw_tos=16,nw_ecn=0,nw_ttl=128,tp_src=68,tp_dst=67 udp_csum:0
> NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=96 reg11=0x2,reg12=0x1,reg13=0x5,reg14=0x5,metadata=0x1,in_port=2 (via action) data_len=96 (unbuffered)
>  userdata=00.00.00.05.00.00.00.00.00.01.de.10.00.00.00.63.00.05.00.10.ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06.00.02.00.06.00.00.00.10.00.01
>  continuation.bridge=2c65dbbc-a986-4343-a987-d79737ced287
>  continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,21)
> udp6,vlan_tci=0x0000,dl_src=f0:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff,ipv6_src=fe80::f200:ff:fe00:3,ipv6_dst=ff02::1:2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=1,tp_src=546,tp_dst=547 udp_csum:ffff
> send: NXT_RESUME (xid=0x0): cookie=0x0 total_len=96 reg11=0x2,reg12=0x1,reg13=0x5,reg14=0x5,metadata=0x1,in_port=2 (via action) data_len=96 (unbuffered)
>  continuation.bridge=2c65dbbc-a986-4343-a987-d79737ced287
>  continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,21)
> udp6,vlan_tci=0x0000,dl_src=f0:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff,ipv6_src=fe80::f200:ff:fe00:3,ipv6_dst=ff02::1:2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=1,tp_src=546,tp_dst=547 udp_csum:ffff
ofctl_monitor0_hv2.log:
> OFPT_PORT_STATUS (xid=0x0): ADD: 2(patch-br-int-to): addr:f2:9f:47:50:ee:4b
>      config:     0
>      state:      0
>      speed: 0 Mbps now, 0 Mbps max
2761. ovn.at:11622: 2761. ovn -- external logical port (ovn.at:11622): FAILED (ovn.at:12000)
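
The log above stops at ovn.at:12000, the check that ls1-lp_ext1 is bound to
hv2 right after requested-chassis is changed. The thread does not establish
why it fails for one reviewer and passes for the author. If the cause is
timing (an assumption, not something the log proves), one way to find out is
to poll for the binding instead of reading it once. The sketch below shows the
idea; wait_for, port_chassis and chassis_is are hypothetical helpers, and
inside ovn.at the existing OVS_WAIT_UNTIL macro would normally play the role
of wait_for.

```shell
# Hypothetical helper: retry a command until it succeeds or the attempt
# budget ($1) is used up. Returns 0 on success, 1 on giving up.
wait_for() {
    attempts=$1; shift
    while test "$attempts" -gt 0; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Hypothetical helper: read only the chassis column of one Port_Binding
# row, rather than grepping the full "list" output.
port_chassis() {
    ovn-sbctl --bare --columns chassis find port_binding logical_port="$1"
}

# Hypothetical helper: succeed when port $1 is bound to chassis UUID $2.
# Quoting both sides keeps "test" well-formed when the column is empty.
chassis_is() {
    test "$(port_chassis "$1")" = "$2"
}

# Usage in the test would look roughly like:
#   wait_for 30 chassis_is ls1-lp_ext1 "$hv2_uuid"
```

If the binding never changes even with polling, timing is ruled out and the
problem lies elsewhere.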
Numan Siddique Jan. 17, 2019, 7:24 p.m. UTC | #4
On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:

> Hi Numan,
>
> With v5 the new test case "external logical port" fails.
> And please see more comments inlined.
>
> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> >
> > From: Numan Siddique <nusiddiq@redhat.com>
> >
> > In the case of OpenStack + OVN, when the VMs are booted on
> > hypervisors supporting SR-IOV NICs, there are no OVS ports
> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> > Router Solicitation requests, the local ovn-controller
> > cannot reply to these packets. OpenStack Neutron dhcp agent
> > service needs to be run to serve these requests.
> >
> > With the new logical port type - 'external', OVN itself can
> > handle these requests avoiding the need to deploy any
> > external services like neutron dhcp agent.
> >
> > To make use of this feature, CMS has to
> >  - create a logical port for such VMs
> >  - set the type to 'external'
> >  - set requested-chassis="<chassis-name>" in the options
> >    column.
> >  - create a localnet port for the logical switch
> >  - configure the ovn-bridge-mappings option in the OVS db.
> >
> > When the ovn-controller running in that 'chassis', detects
> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> > flows. Since the packet enters the logical switch pipeline
> > via the localnet port, the inport register (reg14) is set
> > to the tunnel key of localnet port in the match conditions.
> >
> > In case the chassis goes down for some reason, it is the
> > responsibility of CMS to change the 'requested-chassis'
> > option to some other active chassis, so that it can serve
> > these requests.
> >
> > When the VM with the external port sends an ARP request for
> > the router IPs, only the chassis which has claimed the port
> > will reply to the ARP requests. The rest of the chassis, on
> > receiving these packets, drop them in the ingress switch
> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
> > before S_SWITCH_IN_L2_LKUP.
> >
> > This would guarantee that only the chassis which has claimed
> > the external ports will run the router datapath pipeline.
> >
> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> > ---
> >
> > v4 -> v5
> > ------
> >   * Addressed review comments from Han Zhou.
> >
> > v3 -> v4
> > ------
> >   * Updated the documentation as per Han Zhou's suggestion.
> >
> > v2 -> v3
> > -------
> >   * Rebased
> >
> >  ovn/controller/binding.c        |  12 +
> >  ovn/controller/lflow.c          |  41 ++-
> >  ovn/controller/lflow.h          |   2 +
> >  ovn/controller/lport.c          |  26 ++
> >  ovn/controller/lport.h          |   5 +
> >  ovn/controller/ovn-controller.c |   6 +
> >  ovn/lib/ovn-util.c              |   1 +
> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> >  ovn/northd/ovn-northd.c         |  85 ++++-
> >  ovn/ovn-architecture.7.xml      |  78 +++++
> >  ovn/ovn-nb.xml                  |  47 +++
> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
> >  12 files changed, 848 insertions(+), 22 deletions(-)
> >
> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> > index 021ecddcf..64e605b92 100644
> > --- a/ovn/controller/binding.c
> > +++ b/ovn/controller/binding.c
> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >           * for them. */
> >          sset_add(local_lports, binding_rec->logical_port);
> >          our_chassis = false;
> > +    } else if (!strcmp(binding_rec->type, "external")) {
> > +        const char *chassis_id = smap_get(&binding_rec->options,
> > +                                          "requested-chassis");
> > +        our_chassis = chassis_id && (
> > +            !strcmp(chassis_id, chassis_rec->name) ||
> > +            !strcmp(chassis_id, chassis_rec->hostname));
> > +        if (our_chassis) {
> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> > +                               sbrec_port_binding_by_datapath,
> > +                               sbrec_port_binding_by_name,
> > +                               binding_rec->datapath, true, local_datapaths);
> > +        }
> >      }
> >
> >      if (our_chassis
> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> > index 8db81927e..98e8ed3b9 100644
> > --- a/ovn/controller/lflow.c
> > +++ b/ovn/controller/lflow.c
> > @@ -52,7 +52,10 @@ lflow_init(void)
> >  struct lookup_port_aux {
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> >      const struct sbrec_datapath_binding *dp;
> > +    const struct sbrec_chassis *chassis;
> >  };
> >
> >  struct condition_aux {
> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_logical_flow *,
> >      const struct hmap *local_datapaths,
> >      const struct sbrec_chassis *,
> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char
> *port_name, unsigned int *portp)
> >      const struct sbrec_port_binding *pb
> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name,
> port_name);
> >      if (pb && pb->datapath == aux->dp) {
> > -        *portp = pb->tunnel_key;
> > -        return true;
> > +        if (strcmp(pb->type, "external")) {
> > +            *portp = pb->tunnel_key;
> > +            return true;
> > +        }
> > +        const char *chassis_id = smap_get(&pb->options,
> > +                                          "requested-chassis");
> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> > +                           !strcmp(chassis_id,
> aux->chassis->hostname))) {
> > +            const struct sbrec_port_binding *localnet_pb
> > +                =
> lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> > +                                       aux->sbrec_port_binding_by_type,
> > +                                       aux->dp->tunnel_key, "localnet");
> > +            if (localnet_pb) {
> > +                *portp = localnet_pb->tunnel_key;
> > +                return true;
> > +            }
> > +        }
> > +        return false;
> >      }
> >
> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
> > @@ -144,6 +165,8 @@ add_logical_flows(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >      const struct sbrec_logical_flow_table *logical_flow_table,
> > @@ -183,6 +206,8 @@ add_logical_flows(
> >          consider_logical_flow(sbrec_chassis_by_name,
> >                                sbrec_multicast_group_by_name_datapath,
> >                                sbrec_port_binding_by_name,
> > +                              sbrec_port_binding_by_type,
> > +                              sbrec_datapath_binding_by_key,
> >                                lflow, local_datapaths,
> >                                chassis, &dhcp_opts, &dhcpv6_opts,
> &nd_ra_opts,
> >                                addr_sets, port_groups, active_tunnels,
> > @@ -200,6 +225,8 @@ consider_logical_flow(
> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >      const struct sbrec_logical_flow *lflow,
> >      const struct hmap *local_datapaths,
> >      const struct sbrec_chassis *chassis,
> > @@ -292,7 +319,10 @@ consider_logical_flow(
> >          .sbrec_multicast_group_by_name_datapath
> >              = sbrec_multicast_group_by_name_datapath,
> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
> > -        .dp = lflow->logical_datapath
> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
> > +        .dp = lflow->logical_datapath,
> > +        .chassis = chassis
> >      };
> >      struct condition_aux cond_aux = {
> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> > @@ -463,6 +493,8 @@ void
> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >            struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >            const struct sbrec_logical_flow_table *logical_flow_table,
> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> >
> >      add_logical_flows(sbrec_chassis_by_name,
> >                        sbrec_multicast_group_by_name_datapath,
> > -                      sbrec_port_binding_by_name, dhcp_options_table,
> > +                      sbrec_port_binding_by_name,
> sbrec_port_binding_by_type,
> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
> >                        dhcpv6_options_table, logical_flow_table,
> >                        local_datapaths, chassis, addr_sets, port_groups,
> >                        active_tunnels, local_lport_ids, flow_table,
> group_table,
> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> > index d19338140..b2911e0eb 100644
> > --- a/ovn/controller/lflow.h
> > +++ b/ovn/controller/lflow.h
> > @@ -68,6 +68,8 @@ void lflow_init(void);
> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >                 struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >                 const struct sbrec_dhcp_options_table *,
> >                 const struct sbrec_dhcpv6_options_table *,
> >                 const struct sbrec_logical_flow_table *,
> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> > index cc5c5fbb2..9c827d9b0 100644
> > --- a/ovn/controller/lport.c
> > +++ b/ovn/controller/lport.c
> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >      return retval;
> >  }
> >
> > +const struct sbrec_port_binding *
> > +lport_lookup_by_type(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +                     uint64_t dp_key, const char *port_type)
> > +{
> > +    /* Lookup datapath corresponding to dp_key. */
> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
> > +        sbrec_datapath_binding_by_key, dp_key);
> > +    if (!db) {
> > +        return NULL;
> > +    }
> > +
> > +    /* Build key for an indexed lookup. */
> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
> > +            sbrec_port_binding_by_type);
> > +    sbrec_port_binding_index_set_datapath(pb, db);
> > +    sbrec_port_binding_index_set_type(pb, port_type);
> > +
> > +    const struct sbrec_port_binding *retval =
> sbrec_port_binding_index_find(
> > +            sbrec_port_binding_by_type, pb);
> > +
> > +    sbrec_port_binding_index_destroy_row(pb);
> > +
> > +    return retval;
> > +}
> > +
> >  const struct sbrec_datapath_binding *
> >  datapath_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >                         uint64_t dp_key)
> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> > index 7dcd5bee0..2d49792f6 100644
> > --- a/ovn/controller/lport.h
> > +++ b/ovn/controller/lport.h
> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> >      uint64_t dp_key, uint64_t port_key);
> >
> > +const struct sbrec_port_binding *lport_lookup_by_type(
> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > +    uint64_t dp_key, const char *port_type);
> > +
> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t
> dp_key);
> >
> > diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> > index 4e9a5865f..5aab9142f 100644
> > --- a/ovn/controller/ovn-controller.c
> > +++ b/ovn/controller/ovn-controller.c
> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> >       * ports that have a Gateway_Chassis that point's to our own
> >       * chassis */
> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "chassisredirect");
> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
> >      if (chassis) {
> >          /* This should be mostly redundant with the other clauses for
> port
> >           * bindings, but it allows us to catch any ports that are
> assigned to
> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >                                    &sbrec_port_binding_col_datapath);
> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > +                                  &sbrec_port_binding_col_type);
>
> This index is looked up with two columns, datapath and type, so it
> should be created with both columns using ovsdb_idl_index_create2().
>
> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >
> &sbrec_datapath_binding_col_tunnel_key);
> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> >                              sbrec_chassis_by_name,
> >                              sbrec_multicast_group_by_name_datapath,
> >                              sbrec_port_binding_by_name,
> > +                            sbrec_port_binding_by_type,
> > +                            sbrec_datapath_binding_by_key,
> >
> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> >
> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> >
> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> > index aa03919bb..a9d4b8736 100644
> > --- a/ovn/lib/ovn-util.c
> > +++ b/ovn/lib/ovn-util.c
> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
> >      "localport",
> >      "router",
> >      "vtep",
> > +    "external",
> >  };
> >
> >  bool
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index 392a5efc9..c8883d60d 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> > @@ -626,7 +626,8 @@ nd_na_router {
> >      <p>
> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
> > -      and similarly for DHCPv6 options.
> > +      and similarly for DHCPv6 options. This table also adds flows for
> the
> > +      logical ports of type <code>external</code>.
> >      </p>
> >
> >      <ul>
> > @@ -827,7 +828,39 @@ output;
> >        </li>
> >      </ul>
> >
> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> > +    <h3>Ingress table 16 External ports</h3>
> > +
> > +    <p>
> > +      Traffic from the <code>external</code> logical ports enter the
> ingress
> > +      datapath pipeline via the <code>localnet</code> port. This table
> adds the
> > +      below logical flows to handle the traffic from these ports.
> > +    </p>
> > +
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          A priority-100 flow is added for each <code>external</code>
> logical
> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
> > +          request to the router IP(s) (of the logical switch) which
> matches
> > +          on the <code>inport</code> of the <code>external</code>
> logical port
> > +          and the valid <code>eth.src</code> address(es) of the
> > +          <code>external</code> logical port.
> > +        </p>
> > +
> > +        <p>
> > +          This flow guarantees that the ARP/NS request to the router IP
> > +          address from the external ports is responded by only the
> chassis
> > +          which has claimed these external ports. All the other chassis,
> > +          drops these packets.
> > +        </p>
> > +      </li>
> > +
> > +      <li>
> > +        A priority-0 flow that matches all packets to advances to table
> 17.
> > +      </li>
> > +    </ul>
> > +
> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> >
> >      <p>
> >        This table implements switching behavior.  It contains these
> logical
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index 3fd8a8757..87208c6c1 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -119,7 +119,8 @@ enum ovn_stage {
> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
> "ls_in_dhcp_response") \
> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")
>   \
> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
> "ls_in_dns_response")  \
> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")
>    \
> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
> "ls_in_external_port") \
> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")
>    \
> >
>   \
> >      /* Logical switch egress stages. */
>    \
> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")
>    \
> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port
> *lsp)
> >      return !lsp->up || *lsp->up;
> >  }
> >
> > +static bool
> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> > +{
> > +    return !strcmp(nbsp->type, "external");
> > +}
> > +
> >  static bool
> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >                      struct ds *options_action, struct ds
> *response_action,
> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >           *  - port type is localport
> >           */
> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> > -            strcmp(op->nbsp->type, "localport")) {
> > +            strcmp(op->nbsp->type, "localport") &&
> lsp_is_external(op->nbsp)) {
>
> Sorry that I missed this in the last review. The && condition has a problem:
> it will cause ARP responder flows to be added for all lports that are not
> external. I think it should be || here.
>

Agreed. To make it easier to read, I will add a separate "if" with a
"continue" below this one for external port types.
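
For reference, the difference between the posted condition and the
restructured one can be sketched as a standalone model. This is only an
illustration: struct lsp_model and the skip_arp_responder_*() helpers are
hypothetical names, the "up" column is simplified to a plain bool, and it
assumes the intent is to skip ARP responder flows for every external port.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the Logical_Switch_Port fields used by the
 * check; not the real nbrec_logical_switch_port structure. */
struct lsp_model {
    bool up;
    const char *type;   /* "" for a regular VIF port. */
};

static bool
lsp_model_is_external(const struct lsp_model *lsp)
{
    assert(lsp && lsp->type);
    return !strcmp(lsp->type, "external");
}

/* Condition as posted in v5: because of the trailing "&&" term, a port is
 * skipped only when it is down AND external. */
static bool
skip_arp_responder_v5(const struct lsp_model *lsp)
{
    return !lsp->up && strcmp(lsp->type, "router")
           && strcmp(lsp->type, "localport")
           && lsp_model_is_external(lsp);
}

/* Restructured form: keep the original check untouched and add a second,
 * separate check for external ports (equivalent to using "||"). */
static bool
skip_arp_responder_fixed(const struct lsp_model *lsp)
{
    if (!lsp->up && strcmp(lsp->type, "router")
        && strcmp(lsp->type, "localport")) {
        return true;
    }
    if (lsp_model_is_external(lsp)) {
        return true;
    }
    return false;
}
```

With the v5 condition a regular port that is down is no longer skipped,
which is the regression pointed out above; the restructured form skips it
again and additionally skips external ports whether they are up or not.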



>
> >              continue;
> >          }
> >
> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths,
> struct hmap *ports,
> >              continue;
> >          }
> >
> > +        bool is_external = lsp_is_external(op->nbsp);
> > +        if (is_external && !op->od->localnet_port) {
> > +            /* If it's an external port and there is no localnet port
> > +             * ignore it. */
> > +            continue;
> > +        }
> > +
> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >                      ds_put_format(
> >                          &match, "inport == %s && eth.src == %s && "
> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
> 255.255.255.255 && "
> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
> > -                        op->lsp_addrs[i].ea_s);
> > +                        "udp.src == 68 && udp.dst == 67",
> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>
> No change here?
>

I think it's an unwanted and unrelated change. I will correct it.

> >
> >                      ovn_lflow_add(lflows, op->od,
> S_SWITCH_IN_DHCP_OPTIONS,
> >                                    100, ds_cstr(&match),
> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >      /* Ingress table 12 and 13: DHCP options and response, by default
> goto
> >       * next. (priority 0).
> >       * Ingress table 14 and 15: DNS lookup and response, by default
> goto next.
> > -     * (priority 0).*/
> > +     * (priority 0).
> > +     * Ingress table 16 - External port handling, by default goto next.
> > +     * (priority 0). */
> >
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (!od->nbs) {
> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths,
> struct hmap *ports,
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1",
> "next;");
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1",
> "next;");
> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1",
> "next;");
> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1",
> "next;");
> >      }
> >
> > -    /* Ingress table 16: Destination lookup, broadcast and multicast
> handling
> > +    HMAP_FOR_EACH (op, key_node, ports) {
> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> > +           continue;
> > +        }
> > +
> > +        /* Table 16: External port. Drop ARP request for router ips from
> > +         * external ports  on chassis not binding those ports.
> > +         * This makes the router pipeline to be run only on the chassis
> > +         * binding the external ports. */
> > +
> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
> > +                struct ovn_port *rp = op->od->router_ports[j];
> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv4_addrs;
> > +                         l++) {
> > +                        ds_clear(&match);
> > +                        ds_put_cstr(&match, "ip4");
> > +                        ds_put_format(
> > +                            &match, "inport == %s && eth.src == %s"
> > +                            " && !is_chassis_resident(%s)"
> > +                            " && arp.tpa == %s && arp.op == 1",
> > +                            op->json_key, op->lsp_addrs[i].ea_s,
> op->json_key,
>
> I believe the inport should match the localnet port's json_key here,
> since it is coming from a localnet port.
>

Both would work. If you look at the code in lflow.c in this patch, it uses the
tunnel key of the localnet port when the port_binding type is "external" and
the port is requested on the local chassis.

That's also how the DHCP requests are handled. ovn-controller will translate
the logical flows with the action "put_dhcp_opts" only on the chassis claiming
the external ports.
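
A simplified, standalone model of that lookup may make the behaviour easier
to follow. It is a sketch only: struct pb_model, find_by_type() and
resolve_inport() are hypothetical names, all ports are assumed to belong to
one datapath, and only the chassis name is compared (the patch also accepts
the chassis hostname).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a Port_Binding row. */
struct pb_model {
    const char *name;
    const char *type;               /* "", "localnet", "external", ... */
    const char *requested_chassis;  /* NULL when the option is unset. */
    unsigned int tunnel_key;
};

/* Returns the first port of the given type, or NULL if there is none. */
static const struct pb_model *
find_by_type(const struct pb_model *ports, size_t n, const char *type)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(ports[i].type, type)) {
            return &ports[i];
        }
    }
    return NULL;
}

/* Resolves 'port_name' to the key that "inport == <port_name>" would match
 * on 'chassis'.  Non-external ports resolve to their own tunnel key.  An
 * external port resolves to the localnet port's tunnel key, and only on the
 * chassis it is requested on; everywhere else the lookup fails, so no flow
 * is produced for that match. */
static bool
resolve_inport(const struct pb_model *ports, size_t n,
               const char *port_name, const char *chassis,
               unsigned int *portp)
{
    assert(ports && port_name && chassis && portp);

    for (size_t i = 0; i < n; i++) {
        const struct pb_model *pb = &ports[i];
        if (strcmp(pb->name, port_name)) {
            continue;
        }
        if (strcmp(pb->type, "external")) {
            *portp = pb->tunnel_key;
            return true;
        }
        if (pb->requested_chassis
            && !strcmp(pb->requested_chassis, chassis)) {
            const struct pb_model *localnet
                = find_by_type(ports, n, "localnet");
            if (localnet) {
                *portp = localnet->tunnel_key;
                return true;
            }
        }
        return false;
    }
    return false;
}
```

In this model a match on the external port's name is rewritten to the
localnet port's key on the requested chassis, which is why matching on
either json_key yields the same OpenFlow match there.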

Thanks
Numan



>
> > +                            rp->lsp_addrs[k].ipv4_addrs[l].addr_s);
> > +                        ovn_lflow_add(lflows, op->od,
> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> > +                                      ds_cstr(&match), "drop;");
> > +                    }
> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv6_addrs;
> > +                         l++) {
> > +                        ds_clear(&match);
> > +                        ds_put_format(
> > +                            &match, "inport == %s && eth.src == %s"
> > +                            " && !is_chassis_resident(%s)"
> > +                            " && nd_ns && ip6.dst == {%s, %s} && "
> > +                            "nd.target == %s",
> > +                            op->json_key, op->lsp_addrs[i].ea_s,
> op->json_key,
>
> same as above.
>
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s,
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].sn_addr_s,
> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s);
> > +                        ovn_lflow_add(lflows, op->od,
> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
> > +                                      ds_cstr(&match), "drop;");
> > +                    }
> > +                }
> > +            }
> > +        }
> > +    }
> > +    /* Ingress table 17: Destination lookup, broadcast and multicast
> handling
> >       * (priority 100). */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> >          if (!op->nbsp) {
> > @@ -4448,9 +4513,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >                        "outport = \""MC_FLOOD"\"; output;");
> >      }
> >
> > -    /* Ingress table 16: Destination lookup, unicast handling (priority
> 50), */
> > +    /* Ingress table 17: Destination lookup, unicast handling (priority
> 50), */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp) {
> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
> >              continue;
> >          }
> >
> > @@ -4567,7 +4632,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >          }
> >      }
> >
> > -    /* Ingress table 16: Destination lookup for unknown MACs (priority
> 0). */
> > +    /* Ingress table 17: Destination lookup for unknown MACs (priority
> 0). */
> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >          if (!od->nbs) {
> >              continue;
> > @@ -4602,7 +4667,7 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
> >       * Priority 150 rules drop packets to disabled logical ports, so
> that they
> >       * don't even receive multicast or broadcast packets. */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > -        if (!op->nbsp) {
> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
> >              continue;
> >          }
> >
> > diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> > index 3936e6016..405975b7b 100644
> > --- a/ovn/ovn-architecture.7.xml
> > +++ b/ovn/ovn-architecture.7.xml
> > @@ -1678,6 +1678,84 @@
> >      </li>
> >    </ol>
> >
> > +  <h2>Native OVN services for external logical ports</h2>
> > +
> > +  <p>
> > +    To support OVN native services (like DHCP/IPv6 RA/DNS lookup) to the
> > +    cloud resources which are external, OVN supports
> <code>external</code>
> > +    logical ports.
> > +  </p>
> > +
> > +  <p>
> > +    Below are some of the use cases where <code>external</code> ports
> can be
> > +    used.
> > +  </p>
> > +
> > +  <ul>
> > +    <li>
> > +      VMs connected to SR-IOV nics - Traffic from these VMs by passes
> the
> > +      kernel stack and local <code>ovn-controller</code> do not bind
> these
> > +      ports and cannot serve the native services.
> > +    </li>
> > +    <li>
> > +      When CMS supports provisioning baremetal servers.
> > +    </li>
> > +  </ul>
> > +
> > +  <p>
> > +    OVN will provide the native services if CMS has done the below
> > +    configuration in the <dfn>OVN Northbound Database</dfn>.
> > +  </p>
> > +
> > +  <ul>
> > +    <li>
> > +      A row is created in <code>Logical_Switch_Port</code>, configuring
> the
> > +      <ref column="addresses" table="Logical_Switch_Port" db="OVN_NB"/>
> column
> > +      and setting the <ref column="type" table="Logical_Switch_Port"
> > +      db="OVN_NB"/> to <code>external</code>.
> > +    </li>
> > +
> > +    <li>
> > +      <ref column="options:requested-chassis"
> table="Logical_Switch_Port"
> > +      db="OVN_NB"/> column is configured to a desired chassis.
> > +    </li>
> > +
> > +    <li>
> > +      The chassis on which this logical port is requested has the
> > +      <code>ovn-bridge-mappings</code> configured and has proper L2
> > +      connectivity so that it can receive the DHCP and other related
> request
> > +      packets from these external resources.
> > +    </li>
> > +
> > +    <li>
> > +      The Logical_Switch of this port has a <code>localnet</code> port.
> > +    </li>
> > +
> > +    <li>
> > +      Native OVN services are enabled by configuring the DHCP and other
> > +      options like the way it is done for the normal logical ports.
> > +    </li>
> > +  </ul>
> > +
> > +  <p>
> > +    OVN doesn't support HA for these <code>external</code> ports. In
> case
> > +    the <code>ovn-controller</code> running on the requested chassis
> goes down,
> > +    it is the responsiblity of CMS, to reschedule these
> <code>external</code>
> > +    ports to other active chassis.
> > +  </p>
> > +
> > +  <p>
> > +    It is recommended to request the same chassis for all the external
> ports
> > +    of a logical switch. Otherwise, the physical switch might see MAC
> flap
> > +    issue when different chassis provide the native services. For
> example when
> > +    supporting native DHCPv4 service, DHCPv4 server mac (configured in
> > +    <ref column="options:server_mac" table="DHCP_Options" db="OVN_NB"/>
> column
> > +    in table <ref table="DHCP_Options"/>)
> > +    originating from different ports can cause MAC flap issue. The MAC
> of the
> > +    logical router IP(s) can also flap if the same chassis is not
> requested for
> > +    all the external ports of a logical switch.
> > +  </p>
> > +
> >    <h1>Security</h1>
> >
> >    <h2>Role-Based Access Controls for the Soutbound DB</h2>
> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> > index 6d6fb055a..fdf9adbfa 100644
> > --- a/ovn/ovn-nb.xml
> > +++ b/ovn/ovn-nb.xml
> > @@ -353,6 +353,53 @@
> >            <dd>
> >              A port to a logical switch on a VTEP gateway.
> >            </dd>
> > +
> > +          <dt><code>external</code></dt>
> > +          <dd>
> > +            <p>
> > +              Represents a logical port which is external and not having
> > +              an OVS port in the integration bridge.
> > +              <code>OVN</code> will never receive any traffic from this
> port or
> > +              send any traffic to this port. <code>OVN</code> can
> support
> > +              native services like DHCPv4/DHCPv6/DNS for this port.
> > +              If <ref column="options:requested-chassis"/> is defined,
> > +              <code>ovn-controller</code> running in that chassis will
> bind
> > +              this port to provide these native services. It is
> expected that
> > +              this port belong to a bridged logical switch
> > +              (with a <code>localnet</code> port).
> > +            </p>
> > +
> > +            <p>
> > +              It is recommended to request the same chassis for all the
> > +              external ports of a logical switch. Otherwise, the
> physical
> > +              switch might see MAC flap issue when different chassis
> provide
> > +              the native services. For example when supporting native
> DHCPv4
> > +              service, DHCPv4 server mac (configured in
> > +              <ref column="options:server_mac" table="DHCP_Options"
> > +              db="OVN_NB"/> column in table <ref table="DHCP_Options"/>)
> > +              originating from different ports can cause MAC flap issue.
> > +              The MAC of the logical router IP(s) can also flap if the
> > +              same chassis is not requested for all the external ports
> > +              of a logical switch.
> > +            </p>
> > +
> > +            <p>
> > +              Below are some of the use cases where
> <code>external</code>
> > +              ports can be used.
> > +            </p>
> > +
> > +            <ul>
> > +              <li>
> > +                VMs connected to SR-IOV nics - Traffic from these VMs
> by passes
> > +                the kernel stack and local <code>ovn-controller</code>
> do not
> > +                bind these ports and cannot serve the native services.
> > +              </li>
> > +
> > +              <li>
> > +                When CMS supports provisioning baremetal servers.
> > +              </li>
> > +            </ul>
> > +          </dd>
> >          </dl>
> >        </column>
> >      </group>
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 8bada3241..94c774e8b 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -9594,9 +9594,9 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int
> table=32 | grep active_backup | gre
> >  sleep 3 # let BFD sessions settle so we get the right flows on the
> right chassis
> >
> >  # make sure that flows for handling the outside router port reside on
> gw1
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # make sure ARP responder flows for outside router port reside on gw1
> too
> > @@ -9686,9 +9686,9 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find
> Interface name=ovn-hv1-0],[0],
> >  sleep 3  # let BFD sessions settle so we get the right flows on the
> right chassis
> >
> >  # make sure that flows for handling the outside router port reside on
> gw2 now
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # disconnect GW2 from the network, GW1 should take over
> > @@ -9700,9 +9700,9 @@ sleep 4
> >  bfd_dump
> >
> >  # make sure that flows for handling the outside router port reside on
> gw2 now
> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[1
> >  ]])
> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep
> 00:00:02:01:02:04 | wc -l], [0], [[0
> >  ]])
> >
> >  # check that the chassis redirect port has been reclaimed by the gw1
> chassis
> > @@ -11619,6 +11619,524 @@ as hv2 start_daemon ovn-controller
> >  OVN_CLEANUP([hv1],[hv2])
> >  AT_CLEANUP
> >
> > +AT_SETUP([ovn -- external logical port])
> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > +ovn_start
> > +
> > +net_add n1
> > +sim_add hv1
> > +sim_add hv2
> > +
> > +ovn-nbctl ls-add ls1
> > +ovn-nbctl lsp-add ls1 ls1-lp1 \
> > +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4"
> > +
> > +# Add a couple of external logical port
> > +ovn-nbctl lsp-add ls1 ls1-lp_ext1 \
> > +-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
> > +ovn-nbctl lsp-set-port-security ls1-lp_ext1 \
> > +"f0:00:00:00:00:03 10.0.0.6 ae70::6"
> > +ovn-nbctl lsp-set-type ls1-lp_ext1 external
> > +
> > +ovn-nbctl lsp-add ls1 ls1-lp_ext2 \
> > +-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7"
> > +ovn-nbctl lsp-set-port-security ls1-lp_ext2 \
> > +"f0:00:00:00:00:04 10.0.0.7 ae70::8"
> > +ovn-nbctl lsp-set-type ls1-lp_ext2 external
> > +
> > +d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
> > +options="\"server_id\"=\"10.0.0.1\"
> \"server_mac\"=\"ff:10:00:00:00:01\" \
> > +\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")"
> > +
> > +d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \
> > +options="\"server_id\"=\"00:00:00:10:00:01\"")"
> > +
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1}
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1}
> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1}
> > +
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2}
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2}
> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2}
> > +
> > +# Create a logical router and connect it to ls1
> > +ovn-nbctl lr-add lr0
> > +ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24
> > +ovn-nbctl lsp-add ls1 ls1-lr0
> > +ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \
> > +    options:router-port=lr0-ls1 addresses=router
> > +
> > +as hv1
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.1
> > +ovs-vsctl -- add-port br-phys hv1-ext1 -- \
> > +    set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \
> > +    options:rxq_pcap=hv1/ext1-rx.pcap \
> > +    ofport-request=2
> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +
> > +as hv2
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.2
> > +ovs-vsctl -- add-port br-phys hv2-ext2 -- \
> > +    set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \
> > +    options:rxq_pcap=hv2/ext2-rx.pcap \
> > +    ofport-request=2
> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +
> > +ovn-sbctl dump-flows > lflows_n.txt
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and
> > +# hv2 as requested-chassis option is not set and no localnet port added to ls1.
> > +AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
> > +wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}')
> > +
> > +# The port_binding row for ls1-lp_ext1 should have empty chassis
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +
> > +AT_CHECK([test $chassis == "[[]]"], [0], [])
> > +
> > +# Set the requested-chassis option for ls1-lp_ext1
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
> > +# as no localnet port added to ls1 yet.
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +# Add the localnet port to the logical switch ls1
> > +ovn-nbctl lsp-add ls1 ln-public
> > +ovn-nbctl lsp-set-addresses ln-public unknown
> > +ovn-nbctl lsp-set-type ln-public localnet
> > +ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys
> > +
> > +ln_public_key=$(ovn-sbctl list port_binding ln-public | grep tunnel_key | \
> > +awk '{print $3}')
> > +
> > +# The ls1-lp_ext1 should be bound to hv1
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +AT_CHECK([test $chassis == "$hv1_uuid"], [0], [])
> > +
> > +# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> > +wc -l], [0], [3
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
> > +])
> > +
> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
> > +])
> > +
> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and
> > +# hv2 as requested-chassis option is not set.
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
> > +])
> > +
> > +as hv1
> > +ovs-vsctl show
> > +
> > +# This shell function sends a DHCP request packet
> > +# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP ...
> > +test_dhcp() {
> > +    local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5
> > +    shift; shift; shift; shift; shift;
> > +    if test $use_ip != 0; then
> > +        src_ip=$1
> > +        dst_ip=$2
> > +        shift; shift;
> > +    else
> > +        src_ip=`ip_to_hex 0 0 0 0`
> > +        dst_ip=`ip_to_hex 255 255 255 255`
> > +    fi
> > +    local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip}
> > +    # udp header and dhcp header
> > +    request=${request}0044004300fc0000
> > +    request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac}
> > +    # client hardware padding
> > +    request=${request}00000000000000000000
> > +    # server hostname
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    # boot file name
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
> > +    # dhcp magic cookie
> > +    request=${request}63825363
> > +    # dhcp message type
> > +    request=${request}3501${dhcp_type}ff
> > +
> > +    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
> > +    # total IP length will be the IP length of the request packet
> > +    # (which is 272 in our case) + 8 (padding bytes) + (expected_dhcp_opts / 2)
> > +    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
> > +    udp_len=`expr $ip_len - 20`
> > +    ip_len=$(printf "%x" $ip_len)
> > +    udp_len=$(printf "%x" $udp_len)
> > +    # $ip_len var will be in 3 digits i.e 134. So adding a '0' before $ip_len
> > +    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
> > +    # udp header and dhcp header.
> > +    # $udp_len var will be in 3 digits. So adding a '0' before $udp_len
> > +    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
> > +    # your ip address
> > +    reply=${reply}${offer_ip}
> > +    # next server ip address, relay agent ip address, client mac address
> > +    reply=${reply}0000000000000000${src_mac}
> > +    # client hardware padding
> > +    reply=${reply}00000000000000000000
> > +    # server hostname
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    # boot file name
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
> > +    # dhcp magic cookie
> > +    reply=${reply}63825363
> > +    # dhcp message type
> > +    local dhcp_reply_type=02
> > +    if test $dhcp_type = 03; then
> > +        dhcp_reply_type=05
> > +    fi
> > +    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
> > +    echo $reply >> ext1_v4.expected
> > +
> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> > +}
> > +
> > +
> > +trim_zeros() {
> > +    sed 's/\(00\)\{1,\}$//'
> > +}
> > +
> > +# This shell function sends a DHCPv6 request packet
> > +# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
> > +# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
> > +# packet should be received twice (one from ovn-controller and the other
> > +# from the "ovs-ofctl monitor br-int resume"
> > +test_dhcpv6() {
> > +    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
> > +    local req_pkt_in_expected=$6
> > +    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
> > +    # dst ip ff02::1:2
> > +    request=${request}ff020000000000000000000000010002
> > +    # udp header and dhcpv6 header
> > +    request=${request}02220223002affff${msg_code}010203
> > +    # Client identifier
> > +    request=${request}0001000a00030001${src_mac}
> > +    # IA-NA (Identity Association for Non Temporary Address)
> > +    request=${request}0003000c0102030400000e1000001518
> > +    shift; shift; shift; shift; shift;
> > +
> > +    local server_mac=000000100001
> > +    local server_lla=fe80000000000000020000fffe100001
> > +    local reply_code=07
> > +    if test $msg_code = 01; then
> > +        reply_code=02
> > +    fi
> > +    local msg_len=54
> > +    if test $offer_ip = 1; then
> > +        msg_len=28
> > +    fi
> > +    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
> > +    reply=${reply}${server_lla}${src_lla}
> > +
> > +    # udp header and dhcpv6 header
> > +    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
> > +    # Client identifier
> > +    reply=${reply}0001000a00030001${src_mac}
> > +    # IA-NA
> > +    if test $offer_ip != 1; then
> > +        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
> > +        reply=${reply}ffffffffffffffff
> > +    fi
> > +    # Server identifier
> > +    reply=${reply}0002000a00030001${server_mac}
> > +
> > +    echo $reply | trim_zeros >> ext${inport}_v6.expected
> > +    # The inport also receives the request packet since it is connected
> > +    # to the br-phys.
> > +    #echo $request >> ext${inport}_v6.expected
> > +
> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
> > +}
> > +
> > +reset_pcap_file() {
> > +    local iface=$1
> > +    local pcap_file=$2
> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
> > +options:rxq_pcap=dummy-rx.pcap
> > +    rm -f ${pcap_file}*.pcap
> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
> > +options:rxq_pcap=${pcap_file}-rx.pcap
> > +}
> > +
> > +ip_to_hex() {
> > +    printf "%02x%02x%02x%02x" "$@"
> > +}
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
> > +as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
> > +as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
> > +
> > +# Send DHCPDISCOVER.
> > +offer_ip=`ip_to_hex 10 0 0 6`
> > +server_ip=`ip_to_hex 10 0 0 1`
> > +server_mac=ff1000000001
> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> > +$expected_dhcp_opts
> > +
> > +# NXT_RESUMEs should be 1 in hv1.
> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 0 in hv2.
> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> > +cat ext1_v4.expected | cut -c -48 > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> > +# Skipping the IPv4 checksum.
> > +cat ext1_v4.expected | cut -c 53- > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> > +
> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +rm -f ext1_v4.expected
> > +rm -f ext1_v4.packets
> > +
> > +# Send DHCPv6 request
> > +src_mac=f00000000003
> > +src_lla=fe80000000000000f20000fffe000003
> > +offer_ip=ae700000000000000000000000000006
> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 0 in hv2.
> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> > +sort > ext1_v6.packets
> > +cat ext1_v6.expected | cut -c -120 > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> > +# Skipping the UDP checksum
> > +cat ext1_v6.expected | cut -c 125- > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> > +
> > +rm -f ext1_v6.expected
> > +rm -f ext1_v6.packets
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +
> > +# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
> > +
> > +hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
> > +
> > +# The ls1-lp_ext1 should be bound to hv2
> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
> > +grep -v requested | grep chassis | awk '{print $3}')
> > +AT_CHECK([test $chassis == "$hv2_uuid"], [0], [])
> > +
> > +# There should be OF flows for DHCP4/v6 for the ls1-lp_ext1 port in hv2
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
> > +wc -l], [0], [3
> > +])
> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
> > +])
> > +
> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
> > +])
> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
> > +grep controller | grep tp_src=546 | grep \
> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
> > +grep reg14=0x$ln_public_key | wc -l], [0], [0
> > +])
> > +
> > +# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
> > +# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
> > +# receiving the expected packet.
> > +offer_ip=`ip_to_hex 10 0 0 6`
> > +server_ip=`ip_to_hex 10 0 0 1`
> > +server_mac=ff1000000001
> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
> > +$expected_dhcp_opts
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 1 in hv2.
> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
> > +cat ext1_v4.expected | cut -c -48 > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
> > +# Skipping the IPv4 checksum.
> > +cat ext1_v4.expected | cut -c 53- > expout
> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
> > +
> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +rm -f ext1_v4.expected
> > +
> > +# Send DHCPv6 request again
> > +src_mac=f00000000003
> > +src_lla=fe80000000000000f20000fffe000003
> > +offer_ip=ae700000000000000000000000000006
> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
> > +
> > +# NXT_RESUMEs should be 2 in hv1.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
> > +
> > +# NXT_RESUMEs should be 2 in hv2.
> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
> > +
> > +as hv1
> > +ovs-vsctl show
> > +ovs-ofctl dump-flows br-int
> > +
> > +as hv2
> > +ovs-vsctl show
> > +ovs-ofctl dump-flows br-int
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
> > +sort > ext1_v6.packets
> > +cat ext1_v6.expected | cut -c -120 > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
> > +# Skipping the UDP checksum
> > +cat ext1_v6.expected | cut -c 125- > expout
> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
> > +
> > +rm -f ext1_v6.expected
> > +rm -f ext1_v6.packets
> > +
> > +as hv1
> > +ovs-vsctl show
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> > +reset_pcap_file br-phys hv1/br-phys
> > +
> > +as hv2
> > +ovs-vsctl show
> > +reset_pcap_file hv2-ext2 hv2/ext2
> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> > +reset_pcap_file br-phys hv2/br-phys
> > +
> > +# From ls1-lp_ext1, send ARP request for the router ip. The ARP
> > +# response should come from the router pipeline of hv2.
> > +ext1_mac=f00000000003
> > +router_mac=a01000000001
> > +ext1_ip=`ip_to_hex 10 0 0 6`
> > +router_ip=`ip_to_hex 10 0 0 1`
> > +arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
> > +
> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> > +expected_response=${src_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
> > +echo $expected_response > expout
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +# Verify that the response came from hv2
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +
> > +# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
> > +
> > +as hv1
> > +ovs-vsctl show
> > +reset_pcap_file hv1-ext1 hv1/ext1
> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
> > +reset_pcap_file br-phys hv1/br-phys
> > +
> > +as hv2
> > +ovs-vsctl show
> > +reset_pcap_file hv2-ext2 hv2/ext2
> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
> > +reset_pcap_file br-phys hv2/br-phys
> > +
> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
> > +
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
> > +
> > +# Verify that the response didn't come from hv2
> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
> > +AT_CHECK([cat ext1_arp_resp], [0], [])
> > +
> > +OVN_CLEANUP([hv1],[hv2])
> > +AT_CLEANUP
> > +
> >  AT_SETUP([ovn -- ovn-controller restart])
> >  AT_SKIP_IF([test $HAVE_PYTHON = no])
> >  ovn_start
> > --
> > 2.20.1
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou Jan. 17, 2019, 7:32 p.m. UTC | #5
On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> Hi Numan,
>>
>> With v5 the new test case "external logical port" fails.
>> And please see more comments inlined.
>>
>> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>> >
>> > From: Numan Siddique <nusiddiq@redhat.com>
>> >
>> > In the case of OpenStack + OVN, when the VMs are booted on
>> > hypervisors supporting SR-IOV nics, there are no OVS ports
>> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
>> > Router Solicitation requests, the local ovn-controller
>> > cannot reply to these packets. OpenStack Neutron dhcp agent
>> > service needs to be run to serve these requests.
>> >
>> > With the new logical port type - 'external', OVN itself can
>> > handle these requests avoiding the need to deploy any
>> > external services like neutron dhcp agent.
>> >
>> > To make use of this feature, CMS has to
>> >  - create a logical port for such VMs
>> >  - set the type to 'external'
>> >  - set requested-chassis="<chassis-name>" in the options
>> >    column.
>> >  - create a localnet port for the logical switch
>> >  - configure the ovn-bridge-mappings option in the OVS db.
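For illustration, the steps listed above map roughly onto the commands used by the new test case in this patch. This is a minimal sketch, not a recipe: the names (ls1, ls1-lp_ext1, ln-public, hv1, phys, br-phys) are the test's own, and the "run" and "setup_external_port" helpers are hypothetical. By default "run" only prints each command; it executes them only when RUN=1 is set.

```shell
# Sketch only. "run" is a hypothetical helper: it prints the command unless
# RUN=1 is set in the environment, in which case it executes it.
run() {
    if test "${RUN:-0}" = 1; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical wrapper around the commands the test case uses.
# setup_external_port SWITCH PORT CHASSIS ADDRESSES
setup_external_port() {
    local ls=$1 lsp=$2 chassis=$3 addrs=$4
    # Create the logical port and mark it as 'external'.
    run ovn-nbctl lsp-add "$ls" "$lsp"
    run ovn-nbctl lsp-set-addresses "$lsp" "$addrs"
    run ovn-nbctl lsp-set-type "$lsp" external
    # Pin the port to the chassis that should serve its requests.
    run ovn-nbctl lsp-set-options "$lsp" requested-chassis="$chassis"
    # Add a localnet port (named ln-public in the test) to the same switch.
    run ovn-nbctl lsp-add "$ls" ln-public
    run ovn-nbctl lsp-set-addresses ln-public unknown
    run ovn-nbctl lsp-set-type ln-public localnet
    run ovn-nbctl lsp-set-options ln-public network_name=phys
    # On the chassis: map the network name to a physical bridge.
    run ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
}

setup_external_port ls1 ls1-lp_ext1 hv1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
```

In dry-run mode the address argument is printed without its quotes, so the output is meant for reading rather than for pasting back into a shell.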
>> >
>> > When the ovn-controller running on that chassis detects
>> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>> > flows. Since the packet enters the logical switch pipeline
>> > via the localnet port, the inport register (reg14) is set
>> > to the tunnel key of localnet port in the match conditions.
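One way to check the reg14 match by hand, in the spirit of the test case, is sketched below. The helper name "count_flows_for_inport" is hypothetical, and table=20 is simply the table number the test greps for. The sketch converts the tunnel key to hex before matching, on the assumption that ovn-sbctl prints tunnel_key in decimal while ovs-ofctl shows reg14 in hex.

```shell
# Hypothetical helper, not part of the patch: reads "ovs-ofctl dump-flows"
# output on stdin and counts the table=20 controller flows whose inport
# register (reg14) equals the given tunnel key (decimal).
count_flows_for_inport() {
    local key_hex
    # Convert the decimal tunnel key to the hex form used in flow dumps.
    key_hex=$(printf '%x' "$1")
    # The trailing [, ] stops 0x4 from also matching 0x40.
    # grep -c exits non-zero on a zero count; that is not an error here.
    grep 'table=20' | grep controller | \
        grep -c "reg14=0x${key_hex}[, ]" || true
}

# Assumed usage on the chassis that claimed the port, with $ln_public_key
# obtained as in the test case:
#   ovs-ofctl dump-flows br-int | count_flows_for_inport "$ln_public_key"
```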
>> >
>> > In case the chassis goes down for some reason, it is the
>> > responsibility of CMS to change the 'requested-chassis'
>> > option to some other active chassis, so that it can serve
>> > these requests.
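As a sketch of the failover step, a CMS could re-point the port with the same command the test case uses to move ls1-lp_ext1 from hv1 to hv2. The helper name "move_external_port" is hypothetical, and how the CMS notices that a chassis is down is outside the scope of this sketch. The command is only printed unless RUN=1 is set.

```shell
# Hypothetical helper: prints (or, with RUN=1, executes) the command that
# moves an external port to another chassis.
# move_external_port PORT NEW_CHASSIS
move_external_port() {
    local lsp=$1 new_chassis=$2
    # Same command as in the test case; --wait=hv waits for the chassis
    # to catch up with the change before returning.
    set -- ovn-nbctl --wait=hv lsp-set-options "$lsp" \
        requested-chassis="$new_chassis"
    if test "${RUN:-0}" = 1; then
        "$@"
    else
        echo "$*"
    fi
}

move_external_port ls1-lp_ext1 hv2
```

If the port carries other options, it is probably safer to assume that lsp-set-options sets the whole options column, and to pass the other keys along as well.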
>> >
>> > When the VM with the external port sends an ARP request for
>> > the router IPs, only the chassis which has claimed the port
>> > will reply to the ARP requests. The rest of the chassis, on
>> > receiving these packets, drop them in the ingress switch
>> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT, which is just
>> > before S_SWITCH_IN_L2_LKUP.
>> >
>> > This would guarantee that only the chassis which has claimed
>> > the external ports will run the router datapath pipeline.
>> >
>> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> > ---
>> >
>> > v4 -> v5
>> > ------
>> >   * Addressed review comments from Han Zhou.
>> >
>> > v3 -> v4
>> > ------
>> >   * Updated the documentation as per Han Zhou's suggestion.
>> >
>> > v2 -> v3
>> > -------
>> >   * Rebased
>> >
>> >  ovn/controller/binding.c        |  12 +
>> >  ovn/controller/lflow.c          |  41 ++-
>> >  ovn/controller/lflow.h          |   2 +
>> >  ovn/controller/lport.c          |  26 ++
>> >  ovn/controller/lport.h          |   5 +
>> >  ovn/controller/ovn-controller.c |   6 +
>> >  ovn/lib/ovn-util.c              |   1 +
>> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>> >  ovn/northd/ovn-northd.c         |  85 ++++-
>> >  ovn/ovn-architecture.7.xml      |  78 +++++
>> >  ovn/ovn-nb.xml                  |  47 +++
>> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
>> >  12 files changed, 848 insertions(+), 22 deletions(-)
>> >
>> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> > index 021ecddcf..64e605b92 100644
>> > --- a/ovn/controller/binding.c
>> > +++ b/ovn/controller/binding.c
>> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
>> >           * for them. */
>> >          sset_add(local_lports, binding_rec->logical_port);
>> >          our_chassis = false;
>> > +    } else if (!strcmp(binding_rec->type, "external")) {
>> > +        const char *chassis_id = smap_get(&binding_rec->options,
>> > +                                          "requested-chassis");
>> > +        our_chassis = chassis_id && (
>> > +            !strcmp(chassis_id, chassis_rec->name) ||
>> > +            !strcmp(chassis_id, chassis_rec->hostname));
>> > +        if (our_chassis) {
>> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>> > +                               sbrec_port_binding_by_datapath,
>> > +                               sbrec_port_binding_by_name,
>> > +                               binding_rec->datapath, true, local_datapaths);
>> > +        }
>> >      }
>> >
>> >      if (our_chassis
>> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>> > index 8db81927e..98e8ed3b9 100644
>> > --- a/ovn/controller/lflow.c
>> > +++ b/ovn/controller/lflow.c
>> > @@ -52,7 +52,10 @@ lflow_init(void)
>> >  struct lookup_port_aux {
>> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>> >      const struct sbrec_datapath_binding *dp;
>> > +    const struct sbrec_chassis *chassis;
>> >  };
>> >
>> >  struct condition_aux {
>> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >      const struct sbrec_logical_flow *,
>> >      const struct hmap *local_datapaths,
>> >      const struct sbrec_chassis *,
>> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
>> >      const struct sbrec_port_binding *pb
>> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>> >      if (pb && pb->datapath == aux->dp) {
>> > -        *portp = pb->tunnel_key;
>> > -        return true;
>> > +        if (strcmp(pb->type, "external")) {
>> > +            *portp = pb->tunnel_key;
>> > +            return true;
>> > +        }
>> > +        const char *chassis_id = smap_get(&pb->options,
>> > +                                          "requested-chassis");
>> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
>> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
>> > +            const struct sbrec_port_binding *localnet_pb
>> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>> > +                                       aux->sbrec_port_binding_by_type,
>> > +                                       aux->dp->tunnel_key, "localnet");
>> > +            if (localnet_pb) {
>> > +                *portp = localnet_pb->tunnel_key;
>> > +                return true;
>> > +            }
>> > +        }
>> > +        return false;
>> >      }
>> >
>> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
>> > @@ -144,6 +165,8 @@ add_logical_flows(
>> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> >      const struct sbrec_logical_flow_table *logical_flow_table,
>> > @@ -183,6 +206,8 @@ add_logical_flows(
>> >          consider_logical_flow(sbrec_chassis_by_name,
>> >                                sbrec_multicast_group_by_name_datapath,
>> >                                sbrec_port_binding_by_name,
>> > +                              sbrec_port_binding_by_type,
>> > +                              sbrec_datapath_binding_by_key,
>> >                                lflow, local_datapaths,
>> >                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
>> >                                addr_sets, port_groups, active_tunnels,
>> > @@ -200,6 +225,8 @@ consider_logical_flow(
>> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >      const struct sbrec_logical_flow *lflow,
>> >      const struct hmap *local_datapaths,
>> >      const struct sbrec_chassis *chassis,
>> > @@ -292,7 +319,10 @@ consider_logical_flow(
>> >          .sbrec_multicast_group_by_name_datapath
>> >              = sbrec_multicast_group_by_name_datapath,
>> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
>> > -        .dp = lflow->logical_datapath
>> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
>> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
>> > +        .dp = lflow->logical_datapath,
>> > +        .chassis = chassis
>> >      };
>> >      struct condition_aux cond_aux = {
>> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>> > @@ -463,6 +493,8 @@ void
>> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
>> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> >            const struct sbrec_logical_flow_table *logical_flow_table,
>> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >
>> >      add_logical_flows(sbrec_chassis_by_name,
>> >                        sbrec_multicast_group_by_name_datapath,
>> > -                      sbrec_port_binding_by_name, dhcp_options_table,
>> > +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
>> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
>> >                        dhcpv6_options_table, logical_flow_table,
>> >                        local_datapaths, chassis, addr_sets, port_groups,
>> >                        active_tunnels, local_lport_ids, flow_table, group_table,
>> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>> > index d19338140..b2911e0eb 100644
>> > --- a/ovn/controller/lflow.h
>> > +++ b/ovn/controller/lflow.h
>> > @@ -68,6 +68,8 @@ void lflow_init(void);
>> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >                 const struct sbrec_dhcp_options_table *,
>> >                 const struct sbrec_dhcpv6_options_table *,
>> >                 const struct sbrec_logical_flow_table *,
>> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>> > index cc5c5fbb2..9c827d9b0 100644
>> > --- a/ovn/controller/lport.c
>> > +++ b/ovn/controller/lport.c
>> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >      return retval;
>> >  }
>> >
>> > +const struct sbrec_port_binding *
>> > +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +                     uint64_t dp_key, const char *port_type)
>> > +{
>> > +    /* Lookup datapath corresponding to dp_key. */
>> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
>> > +        sbrec_datapath_binding_by_key, dp_key);
>> > +    if (!db) {
>> > +        return NULL;
>> > +    }
>> > +
>> > +    /* Build key for an indexed lookup. */
>> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
>> > +            sbrec_port_binding_by_type);
>> > +    sbrec_port_binding_index_set_datapath(pb, db);
>> > +    sbrec_port_binding_index_set_type(pb, port_type);
>> > +
>> > +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
>> > +            sbrec_port_binding_by_type, pb);
>> > +
>> > +    sbrec_port_binding_index_destroy_row(pb);
>> > +
>> > +    return retval;
>> > +}
>> > +
>> >  const struct sbrec_datapath_binding *
>> >  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >                         uint64_t dp_key)
>> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>> > index 7dcd5bee0..2d49792f6 100644
>> > --- a/ovn/controller/lport.h
>> > +++ b/ovn/controller/lport.h
>> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>> >      uint64_t dp_key, uint64_t port_key);
>> >
>> > +const struct sbrec_port_binding *lport_lookup_by_type(
>> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > +    uint64_t dp_key, const char *port_type);
>> > +
>> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
>> >
>> > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
>> > index 4e9a5865f..5aab9142f 100644
>> > --- a/ovn/controller/ovn-controller.c
>> > +++ b/ovn/controller/ovn-controller.c
>> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>> >       * ports that have a Gateway_Chassis that point's to our own
>> >       * chassis */
>> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
>> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
>> >      if (chassis) {
>> >          /* This should be mostly redundant with the other clauses for port
>> >           * bindings, but it allows us to catch any ports that are assigned to
>> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >                                    &sbrec_port_binding_col_datapath);
>> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> > +                                  &sbrec_port_binding_col_type);
>>
>> This index is used with two columns, datapath and type, so it should be
>> created with both columns using ovsdb_idl_index_create2().
>>
>> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >                                    &sbrec_datapath_binding_col_tunnel_key);
>> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>> >                              sbrec_chassis_by_name,
>> >                              sbrec_multicast_group_by_name_datapath,
>> >                              sbrec_port_binding_by_name,
>> > +                            sbrec_port_binding_by_type,
>> > +                            sbrec_datapath_binding_by_key,
>> >                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>> >                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>> >                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>> > index aa03919bb..a9d4b8736 100644
>> > --- a/ovn/lib/ovn-util.c
>> > +++ b/ovn/lib/ovn-util.c
>> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>> >      "localport",
>> >      "router",
>> >      "vtep",
>> > +    "external",
>> >  };
>> >
>> >  bool
>> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>> > index 392a5efc9..c8883d60d 100644
>> > --- a/ovn/northd/ovn-northd.8.xml
>> > +++ b/ovn/northd/ovn-northd.8.xml
>> > @@ -626,7 +626,8 @@ nd_na_router {
>> >      <p>
>> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
>> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
>> > -      and similarly for DHCPv6 options.
>> > +      and similarly for DHCPv6 options. This table also adds flows for the
>> > +      logical ports of type <code>external</code>.
>> >      </p>
>> >
>> >      <ul>
>> > @@ -827,7 +828,39 @@ output;
>> >        </li>
>> >      </ul>
>> >
>> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>> > +    <h3>Ingress Table 16 External Ports</h3>
>> > +
>> > +    <p>
>> > +      Traffic from the <code>external</code> logical ports enters the ingress
>> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
>> > +      below logical flows to handle the traffic from these ports.
>> > +    </p>
>> > +
>> > +    <ul>
>> > +      <li>
>> > +        <p>
>> > +          A priority-100 flow is added for each <code>external</code> logical
>> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
>> > +          request to the router IP(s) (of the logical switch) which matches
>> > +          on the <code>inport</code> of the <code>external</code> logical port
>> > +          and the valid <code>eth.src</code> address(es) of the
>> > +          <code>external</code> logical port.
>> > +        </p>
>> > +
>> > +        <p>
>> > +          This flow guarantees that the ARP/NS request to the router IP
>> > +          address from the external ports is answered only by the chassis
>> > +          which has claimed these external ports. All the other chassis
>> > +          drop these packets.
>> > +        </p>
>> > +      </li>
>> > +
>> > +      <li>
>> > +        A priority-0 flow that matches all packets to advance to table 17.
>> > +      </li>
>> > +    </ul>
>> > +
>> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>> >
>> >      <p>
>> >        This table implements switching behavior.  It contains these logical
>> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> > index 3fd8a8757..87208c6c1 100644
>> > --- a/ovn/northd/ovn-northd.c
>> > +++ b/ovn/northd/ovn-northd.c
>> > @@ -119,7 +119,8 @@ enum ovn_stage {
>> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
>> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
>> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
>> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
>> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
>> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
>> >                                                                            \
>> >      /* Logical switch egress stages. */                                   \
>> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
>> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
>> >      return !lsp->up || *lsp->up;
>> >  }
>> >
>> > +static bool
>> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>> > +{
>> > +    return !strcmp(nbsp->type, "external");
>> > +}
>> > +
>> >  static bool
>> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>> >                      struct ds *options_action, struct ds *response_action,
>> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >           *  - port type is localport
>> >           */
>> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
>> > -            strcmp(op->nbsp->type, "localport")) {
>> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
>>
>> Sorry that I missed this in the last review. The && condition has a problem:
>> it will cause ARP responder flows to be added for all lports that are not
>> external. I think it should be || here.
>
>
> Agreed. To make it easier to read, I will add a new "if" with a continue below
> this one for external port types.
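
For illustration, here is a minimal standalone sketch of the difference between
the v5 condition and the separate "if" proposed above. The struct and function
names are hypothetical stand-ins for this example only, not the ovn-northd
code, and "skip" here just models the "continue" in build_lswitch_flows().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the fields of
 * nbrec_logical_switch_port that the check reads. */
struct port {
    bool up;
    const char *type;   /* "" for a regular VIF port. */
};

static bool
is_external(const struct port *p)
{
    return !strcmp(p->type, "external");
}

/* Condition as written in v5: the extra "&& is_external" narrows the
 * skip to ports that are down AND external, so down VIF ports are no
 * longer skipped. */
static bool
skip_v5(const struct port *p)
{
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport") && is_external(p);
}

/* Proposed shape: keep the original check untouched and skip external
 * ports in a separate "if". */
static bool
skip_fixed(const struct port *p)
{
    if (!p->up && strcmp(p->type, "router")
        && strcmp(p->type, "localport")) {
        return true;
    }
    return is_external(p);
}

/* Walks through the cases discussed in the review. */
void
skip_demo(void)
{
    struct port down_vif = { false, "" };
    struct port up_ext = { true, "external" };

    assert(!skip_v5(&down_vif));    /* v5: flows added for a down VIF. */
    assert(skip_fixed(&down_vif));  /* Fixed: down VIF is skipped again. */
    assert(!skip_v5(&up_ext));      /* v5: external port not skipped. */
    assert(skip_fixed(&up_ext));    /* Fixed: external port is skipped. */
}
```

Calling skip_demo() exercises both cases. Written as a single expression, the
fixed check is equivalent to "(!up && !router && !localport) || external".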
>
>
>>
>>
>> >              continue;
>> >          }
>> >
>> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >              continue;
>> >          }
>> >
>> > +        bool is_external = lsp_is_external(op->nbsp);
>> > +        if (is_external && !op->od->localnet_port) {
>> > +            /* If it's an external port and there is no localnet port
>> > +             * ignore it. */
>> > +            continue;
>> > +        }
>> > +
>> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
>> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >                      ds_put_format(
>> >                          &match, "inport == %s && eth.src == %s && "
>> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
>> > -                        op->lsp_addrs[i].ea_s);
>> > +                        "udp.src == 68 && udp.dst == 67",
>> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>>
>> No change here?
>
>
> I think it's an unwanted and unrelated change. I will correct it.
>>
>> >
>> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
>> >                                    100, ds_cstr(&match),
>> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
>> >       * next. (priority 0).
>> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
>> > -     * (priority 0).*/
>> > +     * (priority 0).
>> > +     * Ingress table 16 - External port handling, by default goto next.
>> > +     * (priority 0). */
>> >
>> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>> >          if (!od->nbs) {
>> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
>> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
>> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
>> >      }
>> >
>> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
>> > +    HMAP_FOR_EACH (op, key_node, ports) {
>> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>> > +            continue;
>> > +        }
>> > +
>> > +        /* Table 16: External port. Drop ARP requests for router IPs from
>> > +         * external ports on chassis not binding those ports.
>> > +         * This makes the router pipeline run only on the chassis
>> > +         * binding the external ports. */
>> > +
>> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
>> > +                struct ovn_port *rp = op->od->router_ports[j];
>> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
>> > +                         l++) {
>> > +                        ds_clear(&match);
>> > +                        ds_put_cstr(&match, "ip4");
>> > +                        ds_put_format(
>> > +                            &match, "inport == %s && eth.src == %s"
>> > +                            " && !is_chassis_resident(%s)"
>> > +                            " && arp.tpa == %s && arp.op == 1",
>> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>>
>> I believe the inport should match the localnet port's json_key here,
>> since it is coming from a localnet port.
>
>
> Both would work. If you look at the code in lflow.c in this patch, it gets the
> tunnel key of the localnet port if the port_binding type is "external".
>
> That's how even the DHCP requests are handled. ovn-controller will translate
> the logical flows with action "put_dhcp_opts" only on the chassis claiming the
> external ports.

Oh, yes, you are right. Actually I read that part in v4 and it somehow
slipped my mind. Thanks for explaining.
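
To restate the behavior described above as a minimal standalone sketch: the
struct, the function name and the "return 0 when there is no localnet port"
convention are all assumptions made for this example. It is not the lflow.c
code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Port_Binding row. */
struct pb {
    const char *type;     /* "" for a regular VIF port. */
    uint64_t tunnel_key;
};

/* Returns the key that an "inport == <port>" match resolves to.
 * For an external port this is the tunnel key of the datapath's
 * localnet port, because that is where its traffic enters the
 * pipeline. 'localnet' may be NULL when the datapath has no localnet
 * port; this sketch returns 0 in that case (an assumption). */
static uint64_t
inport_key(const struct pb *port, const struct pb *localnet)
{
    if (!strcmp(port->type, "external")) {
        return localnet ? localnet->tunnel_key : 0;
    }
    return port->tunnel_key;
}

/* Walks through the external, regular and no-localnet cases. */
void
inport_key_demo(void)
{
    struct pb ln = { "localnet", 5 };
    struct pb ext = { "external", 3 };
    struct pb vif = { "", 2 };

    assert(inport_key(&ext, &ln) == 5);   /* External: localnet key. */
    assert(inport_key(&vif, &ln) == 2);   /* Regular port: own key. */
    assert(inport_key(&ext, NULL) == 0);  /* No localnet port. */
}
```

This is why matching on the external port's json_key in the logical flow still
produces OpenFlow that matches on the localnet port's key in reg14.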
>
> Thanks
> Numan
>
>
>>
>>
>> > +                            rp->lsp_addrs[k].ipv4_addrs[l].addr_s);
>> > +                        ovn_lflow_add(lflows, op->od,
>> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
>> > +                                      ds_cstr(&match), "drop;");
>> > +                    }
>> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv6_addrs;
>> > +                         l++) {
>> > +                        ds_clear(&match);
>> > +                        ds_put_format(
>> > +                            &match, "inport == %s && eth.src == %s"
>> > +                            " && !is_chassis_resident(%s)"
>> > +                            " && nd_ns && ip6.dst == {%s, %s} && "
>> > +                            "nd.target == %s",
>> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>>
>> same as above.
>>
>> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s,
>> > +                            rp->lsp_addrs[k].ipv6_addrs[l].sn_addr_s,
>> > +                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s);
>> > +                        ovn_lflow_add(lflows, op->od,
>> > +                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
>> > +                                      ds_cstr(&match), "drop;");
>> > +                    }
>> > +                }
>> > +            }
>> > +        }
>> > +    }
>> > +    /* Ingress table 17: Destination lookup, broadcast and multicast handling
>> >       * (priority 100). */
>> >      HMAP_FOR_EACH (op, key_node, ports) {
>> >          if (!op->nbsp) {
>> > @@ -4448,9 +4513,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >                        "outport = \""MC_FLOOD"\"; output;");
>> >      }
>> >
>> > -    /* Ingress table 16: Destination lookup, unicast handling (priority 50), */
>> > +    /* Ingress table 17: Destination lookup, unicast handling (priority 50), */
>> >      HMAP_FOR_EACH (op, key_node, ports) {
>> > -        if (!op->nbsp) {
>> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
>> >              continue;
>> >          }
>> >
>> > @@ -4567,7 +4632,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >          }
>> >      }
>> >
>> > -    /* Ingress table 16: Destination lookup for unknown MACs (priority 0). */
>> > +    /* Ingress table 17: Destination lookup for unknown MACs (priority 0). */
>> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>> >          if (!od->nbs) {
>> >              continue;
>> > @@ -4602,7 +4667,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >       * Priority 150 rules drop packets to disabled logical ports, so that they
>> >       * don't even receive multicast or broadcast packets. */
>> >      HMAP_FOR_EACH (op, key_node, ports) {
>> > -        if (!op->nbsp) {
>> > +        if (!op->nbsp || lsp_is_external(op->nbsp)) {
>> >              continue;
>> >          }
>> >
>> > diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
>> > index 3936e6016..405975b7b 100644
>> > --- a/ovn/ovn-architecture.7.xml
>> > +++ b/ovn/ovn-architecture.7.xml
>> > @@ -1678,6 +1678,84 @@
>> >      </li>
>> >    </ol>
>> >
>> > +  <h2>Native OVN services for external logical ports</h2>
>> > +
>> > +  <p>
>> > +    To support OVN native services (like DHCP/IPv6 RA/DNS lookup) to the
>> > +    cloud resources which are external, OVN supports <code>external</code>
>> > +    logical ports.
>> > +  </p>
>> > +
>> > +  <p>
>> > +    Below are some of the use cases where <code>external</code> ports can be
>> > +    used.
>> > +  </p>
>> > +
>> > +  <ul>
>> > +    <li>
>> > +      VMs connected to SR-IOV NICs - Traffic from these VMs bypasses the
>> > +      kernel stack, so the local <code>ovn-controller</code> does not bind
>> > +      these ports and cannot serve the native services.
>> > +    </li>
>> > +    <li>
>> > +      When CMS supports provisioning baremetal servers.
>> > +    </li>
>> > +  </ul>
>> > +
>> > +  <p>
>> > +    OVN will provide the native services if CMS has done the below
>> > +    configuration in the <dfn>OVN Northbound Database</dfn>.
>> > +  </p>
>> > +
>> > +  <ul>
>> > +    <li>
>> > +      A row is created in <code>Logical_Switch_Port</code>, configuring the
>> > +      <ref column="addresses" table="Logical_Switch_Port" db="OVN_NB"/> column
>> > +      and setting the <ref column="type" table="Logical_Switch_Port"
>> > +      db="OVN_NB"/> to <code>external</code>.
>> > +    </li>
>> > +
>> > +    <li>
>> > +      <ref column="options:requested-chassis" table="Logical_Switch_Port"
>> > +      db="OVN_NB"/> column is configured to a desired chassis.
>> > +    </li>
>> > +
>> > +    <li>
>> > +      The chassis on which this logical port is requested has the
>> > +      <code>ovn-bridge-mappings</code> configured and has proper L2
>> > +      connectivity so that it can receive the DHCP and other related request
>> > +      packets from these external resources.
>> > +    </li>
>> > +
>> > +    <li>
>> > +      The Logical_Switch of this port has a <code>localnet</code> port.
>> > +    </li>
>> > +
>> > +    <li>
>> > +      Native OVN services are enabled by configuring the DHCP and other
>> > +      options like the way it is done for the normal logical ports.
>> > +    </li>
>> > +  </ul>
>> > +
>> > +  <p>
>> > +    OVN doesn't support HA for these <code>external</code> ports. In case
>> > +    the <code>ovn-controller</code> running on the requested chassis goes down,
>> > +    it is the responsibility of the CMS to reschedule these <code>external</code>
>> > +    ports to other active chassis.
>> > +  </p>
>> > +
>> > +  <p>
>> > +    It is recommended to request the same chassis for all the external ports
>> > +    of a logical switch. Otherwise, the physical switch might see MAC flap
>> > +    issue when different chassis provide the native services. For example when
>> > +    supporting native DHCPv4 service, DHCPv4 server mac (configured in
>> > +    <ref column="options:server_mac" table="DHCP_Options" db="OVN_NB"/> column
>> > +    in table <ref table="DHCP_Options"/>)
>> > +    originating from different ports can cause MAC flap issue. The MAC of the
>> > +    logical router IP(s) can also flap if the same chassis is not requested for
>> > +    all the external ports of a logical switch.
>> > +  </p>
>> > +
>> >    <h1>Security</h1>
>> >
>> >    <h2>Role-Based Access Controls for the Soutbound DB</h2>
>> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>> > index 6d6fb055a..fdf9adbfa 100644
>> > --- a/ovn/ovn-nb.xml
>> > +++ b/ovn/ovn-nb.xml
>> > @@ -353,6 +353,53 @@
>> >            <dd>
>> >              A port to a logical switch on a VTEP gateway.
>> >            </dd>
>> > +
>> > +          <dt><code>external</code></dt>
>> > +          <dd>
>> > +            <p>
>> > +              Represents a logical port which is external and does not have
>> > +              an OVS port in the integration bridge.
>> > +              <code>OVN</code> will never receive any traffic from this port or
>> > +              send any traffic to this port. <code>OVN</code> can support
>> > +              native services like DHCPv4/DHCPv6/DNS for this port.
>> > +              If <ref column="options:requested-chassis"/> is defined,
>> > +              <code>ovn-controller</code> running in that chassis will bind
>> > +              this port to provide these native services. It is expected that
>> > +              this port belongs to a bridged logical switch
>> > +              (with a <code>localnet</code> port).
>> > +            </p>
>> > +
>> > +            <p>
>> > +              It is recommended to request the same chassis for all the
>> > +              external ports of a logical switch. Otherwise, the physical
>> > +              switch might see MAC flap issue when different chassis provide
>> > +              the native services. For example when supporting native DHCPv4
>> > +              service, DHCPv4 server mac (configured in
>> > +              <ref column="options:server_mac" table="DHCP_Options"
>> > +              db="OVN_NB"/> column in table <ref table="DHCP_Options"/>)
>> > +              originating from different ports can cause MAC flap issue.
>> > +              The MAC of the logical router IP(s) can also flap if the
>> > +              same chassis is not requested for all the external ports
>> > +              of a logical switch.
>> > +            </p>
>> > +
>> > +            <p>
>> > +              Below are some of the use cases where <code>external</code>
>> > +              ports can be used.
>> > +            </p>
>> > +
>> > +            <ul>
>> > +              <li>
>> > +                VMs connected to SR-IOV NICs - Traffic from these VMs bypasses
>> > +                the kernel stack, so the local <code>ovn-controller</code> does
>> > +                not bind these ports and cannot serve the native services.
>> > +              </li>
>> > +
>> > +              <li>
>> > +                When CMS supports provisioning baremetal servers.
>> > +              </li>
>> > +            </ul>
>> > +          </dd>
>> >          </dl>
>> >        </column>
>> >      </group>
>> > diff --git a/tests/ovn.at b/tests/ovn.at
>> > index 8bada3241..94c774e8b 100644
>> > --- a/tests/ovn.at
>> > +++ b/tests/ovn.at
>> > @@ -9594,9 +9594,9 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | gre
>> >  sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
>> >
>> >  # make sure that flows for handling the outside router port reside on gw1
>> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> >  ]])
>> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> >  ]])
>> >
>> >  # make sure ARP responder flows for outside router port reside on gw1 too
>> > @@ -9686,9 +9686,9 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
>> >  sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
>> >
>> >  # make sure that flows for handling the outside router port reside on gw2 now
>> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> >  ]])
>> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> >  ]])
>> >
>> >  # disconnect GW2 from the network, GW1 should take over
>> > @@ -9700,9 +9700,9 @@ sleep 4
>> >  bfd_dump
>> >
>> >  # make sure that flows for handling the outside router port reside on gw2 now
>> > -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> > +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
>> >  ]])
>> > -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> > +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
>> >  ]])
>> >
>> >  # check that the chassis redirect port has been reclaimed by the gw1 chassis
>> > @@ -11619,6 +11619,524 @@ as hv2 start_daemon ovn-controller
>> >  OVN_CLEANUP([hv1],[hv2])
>> >  AT_CLEANUP
>> >
>> > +AT_SETUP([ovn -- external logical port])
>> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
>> > +ovn_start
>> > +
>> > +net_add n1
>> > +sim_add hv1
>> > +sim_add hv2
>> > +
>> > +ovn-nbctl ls-add ls1
>> > +ovn-nbctl lsp-add ls1 ls1-lp1 \
>> > +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4"
>> > +
>> > +# Add a couple of external logical ports
>> > +ovn-nbctl lsp-add ls1 ls1-lp_ext1 \
>> > +-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
>> > +ovn-nbctl lsp-set-port-security ls1-lp_ext1 \
>> > +"f0:00:00:00:00:03 10.0.0.6 ae70::6"
>> > +ovn-nbctl lsp-set-type ls1-lp_ext1 external
>> > +
>> > +ovn-nbctl lsp-add ls1 ls1-lp_ext2 \
>> > +-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7"
>> > +ovn-nbctl lsp-set-port-security ls1-lp_ext2 \
>> > +"f0:00:00:00:00:04 10.0.0.7 ae70::8"
>> > +ovn-nbctl lsp-set-type ls1-lp_ext2 external
>> > +
>> > +d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
>> > +options="\"server_id\"=\"10.0.0.1\" \"server_mac\"=\"ff:10:00:00:00:01\" \
>> > +\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")"
>> > +
>> > +d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \
>> > +options="\"server_id\"=\"00:00:00:10:00:01\"")"
>> > +
>> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1}
>> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1}
>> > +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1}
>> > +
>> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2}
>> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2}
>> > +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2}
>> > +
>> > +# Create a logical router and connect it to ls1
>> > +ovn-nbctl lr-add lr0
>> > +ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24
>> > +ovn-nbctl lsp-add ls1 ls1-lr0
>> > +ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \
>> > +    options:router-port=lr0-ls1 addresses=router
>> > +
>> > +as hv1
>> > +ovs-vsctl add-br br-phys
>> > +ovn_attach n1 br-phys 192.168.0.1
>> > +ovs-vsctl -- add-port br-phys hv1-ext1 -- \
>> > +    set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \
>> > +    options:rxq_pcap=hv1/ext1-rx.pcap \
>> > +    ofport-request=2
>> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
>> > +
>> > +as hv2
>> > +ovs-vsctl add-br br-phys
>> > +ovn_attach n1 br-phys 192.168.0.2
>> > +ovs-vsctl -- add-port br-phys hv2-ext2 -- \
>> > +    set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \
>> > +    options:rxq_pcap=hv2/ext2-rx.pcap \
>> > +    ofport-request=2
>> > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
>> > +
>> > +ovn-sbctl dump-flows > lflows_n.txt
>> > +
>> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and
>> > +# hv2 as requested-chassis option is not set and no localnet port added to ls1.
>> > +AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
>> > +wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
>> > +])
>> > +
>> > +hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}')
>> > +
>> > +# The port_binding row for ls1-lp_ext1 should have empty chassis
>> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
>> > +grep -v requested | grep chassis | awk '{print $3}')
>> > +
>> > +AT_CHECK([test $chassis == "[[]]"], [0], [])
>> > +
>> > +# Set the requested-chassis option for ls1-lp_ext1
>> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
>> > +
>> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
>> > +# as no localnet port added to ls1 yet.
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
>> > +])
>> > +
>> > +# Add the localnet port to the logical switch ls1
>> > +ovn-nbctl lsp-add ls1 ln-public
>> > +ovn-nbctl lsp-set-addresses ln-public unknown
>> > +ovn-nbctl lsp-set-type ln-public localnet
>> > +ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys
>> > +
>> > +ln_public_key=$(ovn-sbctl list port_binding ln-public | grep  tunnel_key | \
>> > +awk '{print $3}')
>> > +
>> > +# The ls1-lp_ext1 should be bound to hv1
>> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
>> > +grep -v requested | grep chassis | awk '{print $3}')
>> > +AT_CHECK([test $chassis == "$hv1_uuid"], [0], [])
>> > +
>> > +# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
>> > +wc -l], [0], [3
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
>> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
>> > +])
>> > +
>> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
>> > +])
>> > +
>> > +# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and
>> > +# hv2 as requested-chassis option is not set.
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.07" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
>> > +])
>> > +
>> > +as hv1
>> > +ovs-vsctl show
>> > +
>> > +# This shell function sends a DHCP request packet
>> > +# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP ...
>> > +test_dhcp() {
>> > +    local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5
>> > +    shift; shift; shift; shift; shift;
>> > +    if test $use_ip != 0; then
>> > +        src_ip=$1
>> > +        dst_ip=$2
>> > +        shift; shift;
>> > +    else
>> > +        src_ip=`ip_to_hex 0 0 0 0`
>> > +        dst_ip=`ip_to_hex 255 255 255 255`
>> > +    fi
>> > +    local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip}
>> > +    # udp header and dhcp header
>> > +    request=${request}0044004300fc0000
>> > +    request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac}
>> > +    # client hardware padding
>> > +    request=${request}00000000000000000000
>> > +    # server hostname
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    # boot file name
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    request=${request}0000000000000000000000000000000000000000000000000000000000000000
>> > +    # dhcp magic cookie
>> > +    request=${request}63825363
>> > +    # dhcp message type
>> > +    request=${request}3501${dhcp_type}ff
>> > +
>> > +    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
>> > +    # total IP length will be the IP length of the request packet
>> > +    # (which is 272 in our case) + 8 (padding bytes) + (expected_dhcp_opts / 2)
>> > +    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
>> > +    udp_len=`expr $ip_len - 20`
>> > +    ip_len=$(printf "%x" $ip_len)
>> > +    udp_len=$(printf "%x" $udp_len)
>> > +    # $ip_len will be 3 hex digits (e.g. 134), so prepend a '0' to $ip_len.
>> > +    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
>> > +    # udp header and dhcp header.
>> > +    # $udp_len will also be 3 hex digits, so prepend a '0' to $udp_len.
>> > +    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
>> > +    # your ip address
>> > +    reply=${reply}${offer_ip}
>> > +    # next server ip address, relay agent ip address, client mac address
>> > +    reply=${reply}0000000000000000${src_mac}
>> > +    # client hardware padding
>> > +    reply=${reply}00000000000000000000
>> > +    # server hostname
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    # boot file name
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
>> > +    # dhcp magic cookie
>> > +    reply=${reply}63825363
>> > +    # dhcp message type
>> > +    local dhcp_reply_type=02
>> > +    if test $dhcp_type = 03; then
>> > +        dhcp_reply_type=05
>> > +    fi
>> > +    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
>> > +    echo $reply >> ext1_v4.expected
>> > +
>> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
>> > +}
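The length arithmetic in test_dhcp can be checked in isolation. The following is only a sketch: dhcp_reply_lengths is a hypothetical helper that is not part of the patch, and it assumes the same constants as the comments above (a 272-byte request, 8 padding bytes, a 20-byte IPv4 header). It pads both lengths to four hex digits with printf instead of prepending a literal '0'.

```shell
# Hypothetical helper (not in the patch): derive the IPv4 total length and
# the UDP length of the expected DHCPv4 reply from the hex option string.
dhcp_reply_lengths() {
    opts=$1
    # 280 = 272-byte request + 8 padding bytes; two hex digits per option byte.
    ip_len=$(( 280 + ${#opts} / 2 ))
    # The UDP length excludes the 20-byte IPv4 header.
    udp_len=$(( ip_len - 20 ))
    # Zero-pad to four hex digits so that no literal '0' has to be prepended.
    printf '%04x %04x\n' "$ip_len" "$udp_len"
}

dhcp_reply_lengths 330400000e100104ffffff0003040a00000136040a000001
```

For the option string used in this test the result is 0130 011c, so both values fit in three hex digits, which is what the hard-coded '0' prefix in test_dhcp relies on.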
>> > +
>> > +
>> > +trim_zeros() {
>> > +    sed 's/\(00\)\{1,\}$//'
>> > +}
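As a quick illustration of what trim_zeros does to a hex string, here is a sketch that repeats the function verbatim and feeds it two made-up inputs. The sample strings are assumptions chosen for the example and are not taken from the test.

```shell
# Same definition as in the patch: strip one or more trailing "00" pairs.
trim_zeros() {
    sed 's/\(00\)\{1,\}$//'
}

# Trailing zero bytes are removed; zero bytes in the middle are kept.
echo 0a0001000000 | trim_zeros
# A string without trailing zero bytes passes through unchanged.
echo ffff | trim_zeros
```

For even-length hex strings the match always starts on a byte boundary, so a byte such as "10" followed by "00" is trimmed to "10" and not to "1".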
>> > +
>> > +# This shell function sends a DHCPv6 request packet
>> > +# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
>> > +# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
>> > +# packet should be received twice (once from ovn-controller and once
>> > +# from "ovs-ofctl monitor br-int resume").
>> > +test_dhcpv6() {
>> > +    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
>> > +    local req_pkt_in_expected=$6
>> > +    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
>> > +    # dst ip ff02::1:2
>> > +    request=${request}ff020000000000000000000000010002
>> > +    # udp header and dhcpv6 header
>> > +    request=${request}02220223002affff${msg_code}010203
>> > +    # Client identifier
>> > +    request=${request}0001000a00030001${src_mac}
>> > +    # IA-NA (Identity Association for Non Temporary Address)
>> > +    request=${request}0003000c0102030400000e1000001518
>> > +    shift; shift; shift; shift; shift;
>> > +
>> > +    local server_mac=000000100001
>> > +    local server_lla=fe80000000000000020000fffe100001
>> > +    local reply_code=07
>> > +    if test $msg_code = 01; then
>> > +        reply_code=02
>> > +    fi
>> > +    local msg_len=54
>> > +    if test $offer_ip = 1; then
>> > +        msg_len=28
>> > +    fi
>> > +    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
>> > +    reply=${reply}${server_lla}${src_lla}
>> > +
>> > +    # udp header and dhcpv6 header
>> > +    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
>> > +    # Client identifier
>> > +    reply=${reply}0001000a00030001${src_mac}
>> > +    # IA-NA
>> > +    if test $offer_ip != 1; then
>> > +        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
>> > +        reply=${reply}ffffffffffffffff
>> > +    fi
>> > +    # Server identifier
>> > +    reply=${reply}0002000a00030001${server_mac}
>> > +
>> > +    echo $reply | trim_zeros >> ext${inport}_v6.expected
>> > +    # The inport also receives the request packet since it is connected
>> > +    # to the br-phys.
>> > +    #echo $request >> ext${inport}_v6.expected
>> > +
>> > +    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
>> > +}
>> > +
>> > +reset_pcap_file() {
>> > +    local iface=$1
>> > +    local pcap_file=$2
>> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
>> > +options:rxq_pcap=dummy-rx.pcap
>> > +    rm -f ${pcap_file}*.pcap
>> > +    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
>> > +options:rxq_pcap=${pcap_file}-rx.pcap
>> > +}
>> > +
>> > +ip_to_hex() {
>> > +    printf "%02x%02x%02x%02x" "$@"
>> > +}
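For reference, a small sketch of how ip_to_hex turns dotted-quad octets into the hex strings that the packet templates in this test concatenate. The function body is copied from the patch; the two calls are just examples.

```shell
# Same definition as in the patch: print four decimal octets as 8 hex digits.
ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

# 10.0.0.6, the address offered to ls1-lp_ext1.
ip_to_hex 10 0 0 6; echo
# 255.255.255.255, the broadcast destination used when use_ip is 0.
ip_to_hex 255 255 255 255; echo
```

The first call yields 0a000006 and the second ffffffff.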
>> > +
>> > +AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
>> > +as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
>> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
>> > +
>> > +AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
>> > +as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
>> > +--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
>> > +
>> > +# Send DHCPDISCOVER.
>> > +offer_ip=`ip_to_hex 10 0 0 6`
>> > +server_ip=`ip_to_hex 10 0 0 1`
>> > +server_mac=ff1000000001
>> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
>> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
>> > +$expected_dhcp_opts
>> > +
>> > +# NXT_RESUMEs should be 1 in hv1.
>> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
>> > +
>> > +# NXT_RESUMEs should be 0 in hv2.
>> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
>> > +cat ext1_v4.expected | cut -c -48 > expout
>> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
>> > +# Skipping the IPv4 checksum.
>> > +cat ext1_v4.expected | cut -c 53- > expout
>> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
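The cut offsets used to skip the checksum follow from the header layout: the IPv4 header checksum sits 10 bytes into the IPv4 header, which itself follows the 14-byte Ethernet header, so it occupies hex characters 49 to 52. The sketch below shows that arithmetic with a hypothetical helper, strip_field, which is not part of the patch; the sample packet is synthetic.

```shell
# Hypothetical helper (not in the patch).
# strip_field BYTE_OFFSET BYTE_LEN: read hex-encoded packets on stdin and
# print each one with the given field removed. BYTE_OFFSET must be > 0.
strip_field() {
    last_before=$(( $1 * 2 ))            # last hex digit before the field
    first_after=$(( ($1 + $2) * 2 + 1 )) # first hex digit after the field
    cut -c "-${last_before},${first_after}-"
}

# Synthetic packet: 24 zero bytes, a 2-byte "checksum" (beef), then 1234.
pkt=$(printf '%048d' 0)beef1234
# IPv4 header checksum: 14 (Ethernet) + 10 = byte offset 24, 2 bytes long.
echo "$pkt" | strip_field 24 2
```

With offset 24 this selects the same ranges as the two cut calls above (-48 and 53-). The DHCPv6 checks later in the test use -120 and 125-, which corresponds to offset 60: 14 bytes of Ethernet, 40 bytes of IPv6 header, and 6 bytes into the UDP header, where the UDP checksum starts.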
>> > +
>> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
>> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
>> > +reset_pcap_file hv1-ext1 hv1/ext1
>> > +rm -f ext1_v4.expected
>> > +rm -f ext1_v4.packets
>> > +
>> > +# Send DHCPv6 request
>> > +src_mac=f00000000003
>> > +src_lla=fe80000000000000f20000fffe000003
>> > +offer_ip=ae700000000000000000000000000006
>> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
>> > +
>> > +# NXT_RESUMEs should be 2 in hv1.
>> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
>> > +
>> > +# NXT_RESUMEs should be 0 in hv2.
>> > +OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
>> > +sort > ext1_v6.packets
>> > +cat ext1_v6.expected | cut -c -120 > expout
>> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
>> > +# Skipping the UDP checksum
>> > +cat ext1_v6.expected | cut -c 125- > expout
>> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
>> > +
>> > +rm -f ext1_v6.expected
>> > +rm -f ext1_v6.packets
>> > +reset_pcap_file hv1-ext1 hv1/ext1
>> > +
>> > +# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
>> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
>> > +
>> > +hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
>> > +
>> > +# The ls1-lp_ext1 should be bound to hv2
>> > +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
>> > +grep -v requested | grep chassis | awk '{print $3}')
>> > +AT_CHECK([test "$chassis" = "$hv2_uuid"], [0], [])
>> > +
>> > +# There should be OF flows for DHCPv4/v6 for the ls1-lp_ext1 port in hv2
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
>> > +wc -l], [0], [3
>> > +])
>> > +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
>> > +grep reg14=0x$ln_public_key | wc -l], [0], [1
>> > +])
>> > +
>> > +# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep "0a.00.00.06" | wc -l], [0], [0
>> > +])
>> > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
>> > +grep controller | grep tp_src=546 | grep \
>> > +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
>> > +grep reg14=0x$ln_public_key | wc -l], [0], [0
>> > +])
>> > +
>> > +# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
>> > +# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
>> > +# receiving the expected packet.
>> > +offer_ip=`ip_to_hex 10 0 0 6`
>> > +server_ip=`ip_to_hex 10 0 0 1`
>> > +server_mac=ff1000000001
>> > +expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
>> > +test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
>> > +$expected_dhcp_opts
>> > +
>> > +# NXT_RESUMEs should be 2 in hv1.
>> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
>> > +
>> > +# NXT_RESUMEs should be 1 in hv2.
>> > +OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
>> > +cat ext1_v4.expected | cut -c -48 > expout
>> > +AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
>> > +# Skipping the IPv4 checksum.
>> > +cat ext1_v4.expected | cut -c 53- > expout
>> > +AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
>> > +
>> > +# ovs-ofctl also resumes the packets and this causes other ports to receive
>> > +# the DHCP request packet. So reset the pcap files so that it's easier to test.
>> > +reset_pcap_file hv1-ext1 hv1/ext1
>> > +rm -f ext1_v4.expected
>> > +
>> > +# Send DHCPv6 request again
>> > +src_mac=f00000000003
>> > +src_lla=fe80000000000000f20000fffe000003
>> > +offer_ip=ae700000000000000000000000000006
>> > +test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
>> > +
>> > +# NXT_RESUMEs should be 2 in hv1.
>> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
>> > +
>> > +# NXT_RESUMEs should be 2 in hv2.
>> > +OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
>> > +
>> > +as hv1
>> > +ovs-vsctl show
>> > +ovs-ofctl dump-flows br-int
>> > +
>> > +as hv2
>> > +ovs-vsctl show
>> > +ovs-ofctl dump-flows br-int
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
>> > +sort > ext1_v6.packets
>> > +cat ext1_v6.expected | cut -c -120 > expout
>> > +AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
>> > +# Skipping the UDP checksum
>> > +cat ext1_v6.expected | cut -c 125- > expout
>> > +AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
>> > +
>> > +rm -f ext1_v6.expected
>> > +rm -f ext1_v6.packets
>> > +
>> > +as hv1
>> > +ovs-vsctl show
>> > +reset_pcap_file hv1-ext1 hv1/ext1
>> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
>> > +reset_pcap_file br-phys hv1/br-phys
>> > +
>> > +as hv2
>> > +ovs-vsctl show
>> > +reset_pcap_file hv2-ext2 hv2/ext2
>> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
>> > +reset_pcap_file br-phys hv2/br-phys
>> > +
>> > +# From ls1-lp_ext1, send ARP request for the router ip. The ARP
>> > +# response should come from the router pipeline of hv2.
>> > +ext1_mac=f00000000003
>> > +router_mac=a01000000001
>> > +ext1_ip=`ip_to_hex 10 0 0 6`
>> > +router_ip=`ip_to_hex 10 0 0 1`
>> > +arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
>> > +
>> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
>> > +expected_response=${ext1_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
>> > +echo $expected_response > expout
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
>> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
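The ARP request and the expected reply above are plain concatenations of Ethernet and ARP header fields. The sketch below spells out the layout with a hypothetical helper, build_arp, which is not part of the patch; the MAC and IP values are the ones used in this test.

```shell
# Hypothetical helper (not in the patch).
# build_arp OPER SHA SPA THA TPA ETH_DST: print a hex-encoded ARP frame.
build_arp() {
    oper=$1 sha=$2 spa=$3 tha=$4 tpa=$5 eth_dst=$6
    # Ethernet: destination MAC, source MAC, ethertype 0806 (ARP).
    # ARP: htype 0001 (Ethernet), ptype 0800 (IPv4), hlen 06, plen 04,
    # then the opcode (0001 = request, 0002 = reply) and the four addresses.
    printf '%s%s0806000108000604%s%s%s%s%s\n' \
        "$eth_dst" "$sha" "$oper" "$sha" "$spa" "$tha" "$tpa"
}

ext1_mac=f00000000003
router_mac=a01000000001
ext1_ip=0a000006      # 10.0.0.6
router_ip=0a000001    # 10.0.0.1

# Broadcast request from the external port for the router IP.
build_arp 0001 $ext1_mac $ext1_ip 000000000000 $router_ip ffffffffffff
# Unicast reply from the router back to the external port.
build_arp 0002 $router_mac $router_ip $ext1_mac $ext1_ip $ext1_mac
```

The two lines printed should match $arp_request and $expected_response as built in the test.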
>> > +
>> > +# Verify that the response came from hv2
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
>> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
>> > +
>> > +
>> > +# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
>> > +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
>> > +
>> > +as hv1
>> > +ovs-vsctl show
>> > +reset_pcap_file hv1-ext1 hv1/ext1
>> > +reset_pcap_file br-phys_n1 hv1/br-phys_n1
>> > +reset_pcap_file br-phys hv1/br-phys
>> > +
>> > +as hv2
>> > +ovs-vsctl show
>> > +reset_pcap_file hv2-ext2 hv2/ext2
>> > +reset_pcap_file br-phys_n1 hv2/br-phys_n1
>> > +reset_pcap_file br-phys hv2/br-phys
>> > +
>> > +as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
>> > +AT_CHECK([cat ext1_arp_resp], [0], [expout])
>> > +
>> > +# Verify that the response didn't come from hv2
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
>> > +AT_CHECK([cat ext1_arp_resp], [0], [])
>> > +
>> > +OVN_CLEANUP([hv1],[hv2])
>> > +AT_CLEANUP
>> > +
>> >  AT_SETUP([ovn -- ovn-controller restart])
>> >  AT_SKIP_IF([test $HAVE_PYTHON = no])
>> >  ovn_start
>> > --
>> > 2.20.1
>> >
Han Zhou Jan. 17, 2019, 8:41 p.m. UTC | #6
On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>
> On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >
> >
> >
> > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> Hi Numan,
> >>
> >> With v5 the new test case "external logical port" fails.
> >> And please see more comments inlined.
> >>
> >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> >> >
> >> > From: Numan Siddique <nusiddiq@redhat.com>
> >> >
> >> > In the case of OpenStack + OVN, when the VMs are booted on
> >> > hypervisors supporting SR-IOV nics, there are no OVS ports
> >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> >> > Router Solicitation requests, the local ovn-controller
> >> > cannot reply to these packets. OpenStack Neutron dhcp agent
> >> > service needs to be run to serve these requests.
> >> >
> >> > With the new logical port type - 'external', OVN itself can
> >> > handle these requests avoiding the need to deploy any
> >> > external services like neutron dhcp agent.
> >> >
> >> > To make use of this feature, CMS has to
> >> >  - create a logical port for such VMs
> >> >  - set the type to 'external'
> >> >  - set requested-chassis="<chassis-name>" in the options
> >> >    column.
> >> >  - create a localnet port for the logical switch
> >> >  - configure the ovn-bridge-mappings option in the OVS db.
> >> >
> >> > When the ovn-controller running in that 'chassis', detects
> >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> >> > flows. Since the packet enters the logical switch pipeline
> >> > via the localnet port, the inport register (reg14) is set
> >> > to the tunnel key of localnet port in the match conditions.
> >> >
> >> > In case the chassis goes down for some reason, it is the
> >> > responsibility of CMS to change the 'requested-chassis'
> >> > option to some other active chassis, so that it can serve
> >> > these requests.
> >> >
> >> > When the VM with the external port sends an ARP request for
> >> > the router IPs, only the chassis which has claimed the port
> >> > will reply to the ARP requests. The rest of the chassis, on
> >> > receiving these packets, drop them in the ingress switch
> >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT, which is just
> >> > before S_SWITCH_IN_L2_LKUP.
> >> >
> >> > This would guarantee that only the chassis which has claimed
> >> > the external ports will run the router datapath pipeline.
> >> >
> >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> >> > ---
> >> >
> >> > v4 -> v5
> >> > ------
> >> >   * Addressed review comments from Han Zhou.
> >> >
> >> > v3 -> v4
> >> > ------
> >> >   * Updated the documentation as per Han Zhou's suggestion.
> >> >
> >> > v2 -> v3
> >> > -------
> >> >   * Rebased
> >> >
> >> >  ovn/controller/binding.c        |  12 +
> >> >  ovn/controller/lflow.c          |  41 ++-
> >> >  ovn/controller/lflow.h          |   2 +
> >> >  ovn/controller/lport.c          |  26 ++
> >> >  ovn/controller/lport.h          |   5 +
> >> >  ovn/controller/ovn-controller.c |   6 +
> >> >  ovn/lib/ovn-util.c              |   1 +
> >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> >> >  ovn/northd/ovn-northd.c         |  85 ++++-
> >> >  ovn/ovn-architecture.7.xml      |  78 +++++
> >> >  ovn/ovn-nb.xml                  |  47 +++
> >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
> >> >  12 files changed, 848 insertions(+), 22 deletions(-)
> >> >
> >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> >> > index 021ecddcf..64e605b92 100644
> >> > --- a/ovn/controller/binding.c
> >> > +++ b/ovn/controller/binding.c
> >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >> >           * for them. */
> >> >          sset_add(local_lports, binding_rec->logical_port);
> >> >          our_chassis = false;
> >> > +    } else if (!strcmp(binding_rec->type, "external")) {
> >> > +        const char *chassis_id = smap_get(&binding_rec->options,
> >> > +                                          "requested-chassis");
> >> > +        our_chassis = chassis_id && (
> >> > +            !strcmp(chassis_id, chassis_rec->name) ||
> >> > +            !strcmp(chassis_id, chassis_rec->hostname));
> >> > +        if (our_chassis) {
> >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> >> > +                               sbrec_port_binding_by_datapath,
> >> > +                               sbrec_port_binding_by_name,
> >> > +                               binding_rec->datapath, true, local_datapaths);
> >> > +        }
> >> >      }
> >> >
> >> >      if (our_chassis
> >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> >> > index 8db81927e..98e8ed3b9 100644
> >> > --- a/ovn/controller/lflow.c
> >> > +++ b/ovn/controller/lflow.c
> >> > @@ -52,7 +52,10 @@ lflow_init(void)
> >> >  struct lookup_port_aux {
> >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> >> >      const struct sbrec_datapath_binding *dp;
> >> > +    const struct sbrec_chassis *chassis;
> >> >  };
> >> >
> >> >  struct condition_aux {
> >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >      const struct sbrec_logical_flow *,
> >> >      const struct hmap *local_datapaths,
> >> >      const struct sbrec_chassis *,
> >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
> >> >      const struct sbrec_port_binding *pb
> >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
> >> >      if (pb && pb->datapath == aux->dp) {
> >> > -        *portp = pb->tunnel_key;
> >> > -        return true;
> >> > +        if (strcmp(pb->type, "external")) {
> >> > +            *portp = pb->tunnel_key;
> >> > +            return true;
> >> > +        }
> >> > +        const char *chassis_id = smap_get(&pb->options,
> >> > +                                          "requested-chassis");
> >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
> >> > +            const struct sbrec_port_binding *localnet_pb
> >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> >> > +                                       aux->sbrec_port_binding_by_type,
> >> > +                                       aux->dp->tunnel_key, "localnet");
> >> > +            if (localnet_pb) {
> >> > +                *portp = localnet_pb->tunnel_key;
> >> > +                return true;
> >> > +            }
> >> > +        }
> >> > +        return false;
> >> >      }
> >> >
> >> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
> >> > @@ -144,6 +165,8 @@ add_logical_flows(
> >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
> >> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >> >      const struct sbrec_logical_flow_table *logical_flow_table,
> >> > @@ -183,6 +206,8 @@ add_logical_flows(
> >> >          consider_logical_flow(sbrec_chassis_by_name,
> >> >                                sbrec_multicast_group_by_name_datapath,
> >> >                                sbrec_port_binding_by_name,
> >> > +                              sbrec_port_binding_by_type,
> >> > +                              sbrec_datapath_binding_by_key,
> >> >                                lflow, local_datapaths,
> >> >                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
> >> >                                addr_sets, port_groups, active_tunnels,
> >> > @@ -200,6 +225,8 @@ consider_logical_flow(
> >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >      const struct sbrec_logical_flow *lflow,
> >> >      const struct hmap *local_datapaths,
> >> >      const struct sbrec_chassis *chassis,
> >> > @@ -292,7 +319,10 @@ consider_logical_flow(
> >> >          .sbrec_multicast_group_by_name_datapath
> >> >              = sbrec_multicast_group_by_name_datapath,
> >> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
> >> > -        .dp = lflow->logical_datapath
> >> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
> >> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
> >> > +        .dp = lflow->logical_datapath,
> >> > +        .chassis = chassis
> >> >      };
> >> >      struct condition_aux cond_aux = {
> >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> >> > @@ -463,6 +493,8 @@ void
> >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
> >> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> >> >            const struct sbrec_logical_flow_table *logical_flow_table,
> >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >
> >> >      add_logical_flows(sbrec_chassis_by_name,
> >> >                        sbrec_multicast_group_by_name_datapath,
> >> > -                      sbrec_port_binding_by_name, dhcp_options_table,
> >> > +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
> >> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
> >> >                        dhcpv6_options_table, logical_flow_table,
> >> >                        local_datapaths, chassis, addr_sets, port_groups,
> >> >                        active_tunnels, local_lport_ids, flow_table, group_table,
> >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> >> > index d19338140..b2911e0eb 100644
> >> > --- a/ovn/controller/lflow.h
> >> > +++ b/ovn/controller/lflow.h
> >> > @@ -68,6 +68,8 @@ void lflow_init(void);
> >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> >                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> >> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >                 const struct sbrec_dhcp_options_table *,
> >> >                 const struct sbrec_dhcpv6_options_table *,
> >> >                 const struct sbrec_logical_flow_table *,
> >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> >> > index cc5c5fbb2..9c827d9b0 100644
> >> > --- a/ovn/controller/lport.c
> >> > +++ b/ovn/controller/lport.c
> >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >      return retval;
> >> >  }
> >> >
> >> > +const struct sbrec_port_binding *
> >> > +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +                     uint64_t dp_key, const char *port_type)
> >> > +{
> >> > +    /* Lookup datapath corresponding to dp_key. */
> >> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
> >> > +        sbrec_datapath_binding_by_key, dp_key);
> >> > +    if (!db) {
> >> > +        return NULL;
> >> > +    }
> >> > +
> >> > +    /* Build key for an indexed lookup. */
> >> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
> >> > +            sbrec_port_binding_by_type);
> >> > +    sbrec_port_binding_index_set_datapath(pb, db);
> >> > +    sbrec_port_binding_index_set_type(pb, port_type);
> >> > +
> >> > +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
> >> > +            sbrec_port_binding_by_type, pb);
> >> > +
> >> > +    sbrec_port_binding_index_destroy_row(pb);
> >> > +
> >> > +    return retval;
> >> > +}
> >> > +
> >> >  const struct sbrec_datapath_binding *
> >> >  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> >                         uint64_t dp_key)
> >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> >> > index 7dcd5bee0..2d49792f6 100644
> >> > --- a/ovn/controller/lport.h
> >> > +++ b/ovn/controller/lport.h
> >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> >> >      uint64_t dp_key, uint64_t port_key);
> >> >
> >> > +const struct sbrec_port_binding *lport_lookup_by_type(
> >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > +    uint64_t dp_key, const char *port_type);
> >> > +
> >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
> >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
> >> >
> >> > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> >> > index 4e9a5865f..5aab9142f 100644
> >> > --- a/ovn/controller/ovn-controller.c
> >> > +++ b/ovn/controller/ovn-controller.c
> >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> >> >       * ports that have a Gateway_Chassis that point's to our own
> >> >       * chassis */
> >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
> >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
> >> >      if (chassis) {
> >> >          /* This should be mostly redundant with the other clauses for port
> >> >           * bindings, but it allows us to catch any ports that are assigned to
> >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> >                                    &sbrec_port_binding_col_datapath);
> >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> > +                                  &sbrec_port_binding_col_type);
> >>
> >> This index is looked up by two columns, datapath and type, so it
> >> should be created with both columns using ovsdb_idl_index_create2().
> >>
> >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> >                                    &sbrec_datapath_binding_col_tunnel_key);
> >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> >> >                              sbrec_chassis_by_name,
> >> >                              sbrec_multicast_group_by_name_datapath,
> >> >                              sbrec_port_binding_by_name,
> >> > +                            sbrec_port_binding_by_type,
> >> > +                            sbrec_datapath_binding_by_key,
> >> >                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> >> >                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> >> >                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> >> > index aa03919bb..a9d4b8736 100644
> >> > --- a/ovn/lib/ovn-util.c
> >> > +++ b/ovn/lib/ovn-util.c
> >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
> >> >      "localport",
> >> >      "router",
> >> >      "vtep",
> >> > +    "external",
> >> >  };
> >> >
> >> >  bool
> >> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> >> > index 392a5efc9..c8883d60d 100644
> >> > --- a/ovn/northd/ovn-northd.8.xml
> >> > +++ b/ovn/northd/ovn-northd.8.xml
> >> > @@ -626,7 +626,8 @@ nd_na_router {
> >> >      <p>
> >> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
> >> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
> >> > -      and similarly for DHCPv6 options.
> >> > +      and similarly for DHCPv6 options. This table also adds flows for the
> >> > +      logical ports of type <code>external</code>.
> >> >      </p>
> >> >
> >> >      <ul>
> >> > @@ -827,7 +828,39 @@ output;
> >> >        </li>
> >> >      </ul>
> >> >
> >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> >> > +    <h3>Ingress table 16 External ports</h3>
> >> > +
> >> > +    <p>
> >> > +      Traffic from the <code>external</code> logical ports enter the ingress
> >> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
> >> > +      below logical flows to handle the traffic from these ports.
> >> > +    </p>
> >> > +
> >> > +    <ul>
> >> > +      <li>
> >> > +        <p>
> >> > +          A priority-100 flow is added for each <code>external</code> logical
> >> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
> >> > +          request to the router IP(s) (of the logical switch) which matches
> >> > +          on the <code>inport</code> of the <code>external</code> logical port
> >> > +          and the valid <code>eth.src</code> address(es) of the
> >> > +          <code>external</code> logical port.
> >> > +        </p>
> >> > +
> >> > +        <p>
> >> > +          This flow guarantees that the ARP/NS request to the router IP
> >> > +          address from the external ports is responded by only the chassis
> >> > +          which has claimed these external ports. All the other chassis,
> >> > +          drops these packets.
> >> > +        </p>
> >> > +      </li>
> >> > +
> >> > +      <li>
> >> > +        A priority-0 flow that matches all packets to advances to table 17.
> >> > +      </li>
> >> > +    </ul>
> >> > +
> >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> >> >
> >> >      <p>
> >> >        This table implements switching behavior.  It contains these logical
> >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> >> > index 3fd8a8757..87208c6c1 100644
> >> > --- a/ovn/northd/ovn-northd.c
> >> > +++ b/ovn/northd/ovn-northd.c
> >> > @@ -119,7 +119,8 @@ enum ovn_stage {
> >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
> >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
> >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
> >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
> >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
> >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
> >> >                                                                            \
> >> >      /* Logical switch egress stages. */                                   \
> >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
> >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
> >> >      return !lsp->up || *lsp->up;
> >> >  }
> >> >
> >> > +static bool
> >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> >> > +{
> >> > +    return !strcmp(nbsp->type, "external");
> >> > +}
> >> > +
> >> >  static bool
> >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >> >                      struct ds *options_action, struct ds *response_action,
> >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >> >           *  - port type is localport
> >> >           */
> >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> >> > -            strcmp(op->nbsp->type, "localport")) {
> >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
> >>
> >> Sorry that I missed this in the last review. The && condition has a
> >> problem. It will cause ARP responder flows to be added for all lports
> >> that are not external. I think it should be || here.
> >
> >
> > Agree. To make it easier to read, I will add a new "if" with continue - below this one for
> > external port types.
> >
> >
> >>
> >>
> >> >              continue;
> >> >          }
> >> >
> >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >> >              continue;
> >> >          }
> >> >
> >> > +        bool is_external = lsp_is_external(op->nbsp);
> >> > +        if (is_external && !op->od->localnet_port) {
> >> > +            /* If it's an external port and there is no localnet port
> >> > +             * ignore it. */
> >> > +            continue;
> >> > +        }
> >> > +
> >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
> >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >> >                      ds_put_format(
> >> >                          &match, "inport == %s && eth.src == %s && "
> >> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> >> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
> >> > -                        op->lsp_addrs[i].ea_s);
> >> > +                        "udp.src == 68 && udp.dst == 67",
> >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
> >>
> >> No change here?
> >
> >
> > I think it's an unwanted and unrelated change. I will correct it.
> >>
> >> >
> >> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
> >> >                                    100, ds_cstr(&match),
> >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
> >> >       * next. (priority 0).
> >> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
> >> > -     * (priority 0).*/
> >> > +     * (priority 0).
> >> > +     * Ingress table 16 - External port handling, by default goto next.
> >> > +     * (priority 0). */
> >> >
> >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >> >          if (!od->nbs) {
> >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
> >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
> >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
> >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
> >> >      }
> >> >
> >> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
> >> > +    HMAP_FOR_EACH (op, key_node, ports) {
> >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> >> > +           continue;
> >> > +        }
> >> > +
> >> > +        /* Table 16: External port. Drop ARP request for router ips from
> >> > +         * external ports  on chassis not binding those ports.
> >> > +         * This makes the router pipeline to be run only on the chassis
> >> > +         * binding the external ports. */
> >> > +
> >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
> >> > +                struct ovn_port *rp = op->od->router_ports[j];
> >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
> >> > +                         l++) {
> >> > +                        ds_clear(&match);
> >> > +                        ds_put_cstr(&match, "ip4");
> >> > +                        ds_put_format(
> >> > +                            &match, "inport == %s && eth.src == %s"
> >> > +                            " && !is_chassis_resident(%s)"
> >> > +                            " && arp.tpa == %s && arp.op == 1",
> >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
> >>
> >> I believe the inport should match the localnet port's json_key here,
> >> since it is coming from a localnet port.
> >
> >
> > Both would work. If you see the code in lflow.c in this patch - it will get the tunnel
> > key of the localnet port if the port_binding type is "external".
> >
> > That's how even the DHCP requests are handled. ovn-controller will translate
> > the logical flows with action "put_dhcp_opts" only on the chassis claiming the
> > external ports.
>
> Oh, yes, you are right. Actually I read that part in v4 and it somehow
> slipped my mind. Thanks for explaining.

I thought about it a second time, and I'd suggest doing the conversion
here in northd instead of ovn-controller, for two reasons:

1. In ovn-controller there is no extra context, so it just blindly
translates all references to an external logical port into the localnet
port key. This could lead to unexpected behavior. For example, if
someone uses an external logical port in an ACL match condition, the
match condition would then apply to all packets to/from the localnet
port, which is definitely unwanted. (At the same time, it would be
better to document that features like port-security and ACLs should not
be used for external logical ports.)

2. A less important reason is that it is better to do it at an earlier
stage than a later one. northd handles common processing. This part of
the logic is common to all chassis, so it would be better to handle it
explicitly in northd instead of letting every chassis process it. And
the change in northd would likely be simpler than in ovn-controller.
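
To make the difference concrete, here is a rough, self-contained sketch
of the two approaches. It is not code from the patch: struct port,
resolve_inport_blind() and build_external_match() are hypothetical
names, the port table is a plain array rather than the southbound IDL,
and the requested-chassis check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Port_Binding row. */
struct port {
    const char *name;   /* Logical port name. */
    const char *type;   /* "", "localnet", "external", ... */
    unsigned int key;   /* Tunnel key. */
};

/* Controller-side approach (simplified): any reference to an external
 * port resolves to the localnet port's tunnel key, no matter which
 * logical flow the reference came from (DHCP, ARP or an ACL). */
bool
resolve_inport_blind(const struct port *ports, size_t n,
                     const char *name, unsigned int *keyp)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(ports[i].name, name)) {
            continue;
        }
        if (strcmp(ports[i].type, "external")) {
            /* Ordinary port: use its own tunnel key. */
            *keyp = ports[i].key;
            return true;
        }
        /* External port: substitute the first localnet port found. */
        for (size_t j = 0; j < n; j++) {
            if (!strcmp(ports[j].type, "localnet")) {
                *keyp = ports[j].key;
                return true;
            }
        }
        return false;
    }
    return false;
}

/* northd-side approach (simplified): only the flows that are built
 * specifically for external ports name the localnet port, and they
 * also pin eth.src to the external port's MAC address. */
int
build_external_match(char *buf, size_t size,
                     const char *localnet_name, const char *ext_mac)
{
    return snprintf(buf, size, "inport == \"%s\" && eth.src == %s",
                    localnet_name, ext_mac);
}
```

With a table holding a localnet port "ln0" (key 1) and an external port
"sriov1" (key 3), resolve_inport_blind() returns key 1 for "sriov1", so
an ACL written against "sriov1" would end up matching everything that
arrives on "ln0". build_external_match() leaves other references to the
port untouched.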

Thanks,
Han
Numan Siddique Jan. 18, 2019, 6:16 p.m. UTC | #7
On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:

> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
> >
> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
> > >
> > >
> > >
> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
> > >>
> > >> Hi Numan,
> > >>
> > >> With v5 the new test case "external logical port" fails.
> > >> And please see more comments inlined.
> > >>
> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> > >> >
> > >> > From: Numan Siddique <nusiddiq@redhat.com>
> > >> >
> > >> > In the case of OpenStack + OVN, when the VMs are booted on
> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> > >> > Router Solicitation requests, the local ovn-controller
> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
> > >> > service needs to be run to serve these requests.
> > >> >
> > >> > With the new logical port type - 'external', OVN itself can
> > >> > handle these requests avoiding the need to deploy any
> > >> > external services like neutron dhcp agent.
> > >> >
> > >> > To make use of this feature, CMS has to
> > >> >  - create a logical port for such VMs
> > >> >  - set the type to 'external'
> > >> >  - set requested-chassis="<chassis-name>" in the options
> > >> >    column.
> > >> >  - create a localnet port for the logical switch
> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
> > >> >
> > >> > When the ovn-controller running in that 'chassis', detects
> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> > >> > flows. Since the packet enters the logical switch pipeline
> > >> > via the localnet port, the inport register (reg14) is set
> > >> > to the tunnel key of localnet port in the match conditions.
> > >> >
> > >> > In case the chassis goes down for some reason, it is the
> > >> > responsibility of CMS to change the 'requested-chassis'
> > >> > option to some other active chassis, so that it can serve
> > >> > these requests.
> > >> >
> > >> > When the VM with the external port, sends an ARP request for
> > >> > the router ips, only the chassis which has claimed the port,
> > >> > will reply to the ARP requests. Rest of the chassis on
> > >> > receiving these packets drop them in the ingress switch
> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
> > >> > before S_SWITCH_IN_L2_LKUP.
> > >> >
> > >> > This would guarantee that only the chassis which has claimed
> > >> > the external ports will run the router datapath pipeline.
> > >> >
> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> > >> > ---
> > >> >
> > >> > v4 -> v5
> > >> > ------
> > >> >   * Addressed review comments from Han Zhou.
> > >> >
> > >> > v3 -> v4
> > >> > ------
> > >> >   * Updated the documentation as per Han Zhou's suggestion.
> > >> >
> > >> > v2 -> v3
> > >> > -------
> > >> >   * Rebased
> > >> >
> > >> >  ovn/controller/binding.c        |  12 +
> > >> >  ovn/controller/lflow.c          |  41 ++-
> > >> >  ovn/controller/lflow.h          |   2 +
> > >> >  ovn/controller/lport.c          |  26 ++
> > >> >  ovn/controller/lport.h          |   5 +
> > >> >  ovn/controller/ovn-controller.c |   6 +
> > >> >  ovn/lib/ovn-util.c              |   1 +
> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
> > >> >  ovn/ovn-nb.xml                  |  47 +++
> > >> >  tests/ovn.at                    | 530
> +++++++++++++++++++++++++++++++-
> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
> > >> >
> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> > >> > index 021ecddcf..64e605b92 100644
> > >> > --- a/ovn/controller/binding.c
> > >> > +++ b/ovn/controller/binding.c
> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
> > >> >           * for them. */
> > >> >          sset_add(local_lports, binding_rec->logical_port);
> > >> >          our_chassis = false;
> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
> > >> > +                                          "requested-chassis");
> > >> > +        our_chassis = chassis_id && (
> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
> > >> > +        if (our_chassis) {
> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> > >> > +                               sbrec_port_binding_by_datapath,
> > >> > +                               sbrec_port_binding_by_name,
> > >> > +                               binding_rec->datapath, true,
> local_datapaths);
> > >> > +        }
> > >> >      }
> > >> >
> > >> >      if (our_chassis
> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> > >> > index 8db81927e..98e8ed3b9 100644
> > >> > --- a/ovn/controller/lflow.c
> > >> > +++ b/ovn/controller/lflow.c
> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
> > >> >  struct lookup_port_aux {
> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> > >> >      const struct sbrec_datapath_binding *dp;
> > >> > +    const struct sbrec_chassis *chassis;
> > >> >  };
> > >> >
> > >> >  struct condition_aux {
> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > >> >      const struct sbrec_logical_flow *,
> > >> >      const struct hmap *local_datapaths,
> > >> >      const struct sbrec_chassis *,
> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char
> *port_name, unsigned int *portp)
> > >> >      const struct sbrec_port_binding *pb
> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name,
> port_name);
> > >> >      if (pb && pb->datapath == aux->dp) {
> > >> > -        *portp = pb->tunnel_key;
> > >> > -        return true;
> > >> > +        if (strcmp(pb->type, "external")) {
> > >> > +            *portp = pb->tunnel_key;
> > >> > +            return true;
> > >> > +        }
> > >> > +        const char *chassis_id = smap_get(&pb->options,
> > >> > +                                          "requested-chassis");
> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name)
> ||
> > >> > +                           !strcmp(chassis_id,
> aux->chassis->hostname))) {
> > >> > +            const struct sbrec_port_binding *localnet_pb
> > >> > +                =
> lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> > >> > +
>  aux->sbrec_port_binding_by_type,
> > >> > +                                       aux->dp->tunnel_key,
> "localnet");
> > >> > +            if (localnet_pb) {
> > >> > +                *portp = localnet_pb->tunnel_key;
> > >> > +                return true;
> > >> > +            }
> > >> > +        }
> > >> > +        return false;
> > >> >      }
> > >> >
> > >> >      const struct sbrec_multicast_group *mg =
> mcgroup_lookup_by_dp_name(
> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
> > >> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
> > >> >          consider_logical_flow(sbrec_chassis_by_name,
> > >> >
> sbrec_multicast_group_by_name_datapath,
> > >> >                                sbrec_port_binding_by_name,
> > >> > +                              sbrec_port_binding_by_type,
> > >> > +                              sbrec_datapath_binding_by_key,
> > >> >                                lflow, local_datapaths,
> > >> >                                chassis, &dhcp_opts, &dhcpv6_opts,
> &nd_ra_opts,
> > >> >                                addr_sets, port_groups,
> active_tunnels,
> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > >> >      const struct sbrec_logical_flow *lflow,
> > >> >      const struct hmap *local_datapaths,
> > >> >      const struct sbrec_chassis *chassis,
> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
> > >> >          .sbrec_multicast_group_by_name_datapath
> > >> >              = sbrec_multicast_group_by_name_datapath,
> > >> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
> > >> > -        .dp = lflow->logical_datapath
> > >> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
> > >> > +        .sbrec_datapath_binding_by_key =
> sbrec_datapath_binding_by_key,
> > >> > +        .dp = lflow->logical_datapath,
> > >> > +        .chassis = chassis
> > >> >      };
> > >> >      struct condition_aux cond_aux = {
> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> > >> > @@ -463,6 +493,8 @@ void
> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> > >> >            struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > >> >            const struct sbrec_dhcp_options_table
> *dhcp_options_table,
> > >> >            const struct sbrec_dhcpv6_options_table
> *dhcpv6_options_table,
> > >> >            const struct sbrec_logical_flow_table
> *logical_flow_table,
> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> > >> >
> > >> >      add_logical_flows(sbrec_chassis_by_name,
> > >> >                        sbrec_multicast_group_by_name_datapath,
> > >> > -                      sbrec_port_binding_by_name,
> dhcp_options_table,
> > >> > +                      sbrec_port_binding_by_name,
> sbrec_port_binding_by_type,
> > >> > +                      sbrec_datapath_binding_by_key,
> dhcp_options_table,
> > >> >                        dhcpv6_options_table, logical_flow_table,
> > >> >                        local_datapaths, chassis, addr_sets,
> port_groups,
> > >> >                        active_tunnels, local_lport_ids, flow_table,
> group_table,
> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> > >> > index d19338140..b2911e0eb 100644
> > >> > --- a/ovn/controller/lflow.h
> > >> > +++ b/ovn/controller/lflow.h
> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> > >> >                 struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> > >> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > >> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +               struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > >> >                 const struct sbrec_dhcp_options_table *,
> > >> >                 const struct sbrec_dhcpv6_options_table *,
> > >> >                 const struct sbrec_logical_flow_table *,
> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> > >> > index cc5c5fbb2..9c827d9b0 100644
> > >> > --- a/ovn/controller/lport.c
> > >> > +++ b/ovn/controller/lport.c
> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > >> >      return retval;
> > >> >  }
> > >> >
> > >> > +const struct sbrec_port_binding *
> > >> > +lport_lookup_by_type(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > >> > +                     struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> > >> > +                     uint64_t dp_key, const char *port_type)
> > >> > +{
> > >> > +    /* Lookup datapath corresponding to dp_key. */
> > >> > +    const struct sbrec_datapath_binding *db =
> datapath_lookup_by_key(
> > >> > +        sbrec_datapath_binding_by_key, dp_key);
> > >> > +    if (!db) {
> > >> > +        return NULL;
> > >> > +    }
> > >> > +
> > >> > +    /* Build key for an indexed lookup. */
> > >> > +    struct sbrec_port_binding *pb =
> sbrec_port_binding_index_init_row(
> > >> > +            sbrec_port_binding_by_type);
> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
> > >> > +
> > >> > +    const struct sbrec_port_binding *retval =
> sbrec_port_binding_index_find(
> > >> > +            sbrec_port_binding_by_type, pb);
> > >> > +
> > >> > +    sbrec_port_binding_index_destroy_row(pb);
> > >> > +
> > >> > +    return retval;
> > >> > +}
> > >> > +
> > >> >  const struct sbrec_datapath_binding *
> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> > >> >                         uint64_t dp_key)
> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> > >> > index 7dcd5bee0..2d49792f6 100644
> > >> > --- a/ovn/controller/lport.h
> > >> > +++ b/ovn/controller/lport.h
> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding
> *lport_lookup_by_key(
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> > >> >      uint64_t dp_key, uint64_t port_key);
> > >> >
> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> > >> > +    uint64_t dp_key, const char *port_type);
> > >> > +
> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> uint64_t dp_key);
> > >> >
> > >> > diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> > >> > index 4e9a5865f..5aab9142f 100644
> > >> > --- a/ovn/controller/ovn-controller.c
> > >> > +++ b/ovn/controller/ovn-controller.c
> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> > >> >       * ports that have a Gateway_Chassis that point's to our own
> > >> >       * chassis */
> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "chassisredirect");
> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "external");
> > >> >      if (chassis) {
> > >> >          /* This should be mostly redundant with the other clauses
> for port
> > >> >           * bindings, but it allows us to catch any ports that are
> assigned to
> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > >> >
> &sbrec_port_binding_col_datapath);
> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > >> > +                                  &sbrec_port_binding_col_type);
> > >>
> > >> This index is used with two columns: datapath_binding and type, so it
> > >> should be created with both columns using create2.
> > >>
> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > >> >
> &sbrec_datapath_binding_col_tunnel_key);
> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> > >> >                              sbrec_chassis_by_name,
> > >> >                              sbrec_multicast_group_by_name_datapath,
> > >> >                              sbrec_port_binding_by_name,
> > >> > +                            sbrec_port_binding_by_type,
> > >> > +                            sbrec_datapath_binding_by_key,
> > >> >
> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> > >> >
> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> > >> >
> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> > >> > index aa03919bb..a9d4b8736 100644
> > >> > --- a/ovn/lib/ovn-util.c
> > >> > +++ b/ovn/lib/ovn-util.c
> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
> > >> >      "localport",
> > >> >      "router",
> > >> >      "vtep",
> > >> > +    "external",
> > >> >  };
> > >> >
> > >> >  bool
> > >> > diff --git a/ovn/northd/ovn-northd.8.xml
> b/ovn/northd/ovn-northd.8.xml
> > >> > index 392a5efc9..c8883d60d 100644
> > >> > --- a/ovn/northd/ovn-northd.8.xml
> > >> > +++ b/ovn/northd/ovn-northd.8.xml
> > >> > @@ -626,7 +626,8 @@ nd_na_router {
> > >> >      <p>
> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet from
> the
> > >> >        logical ports configured with IPv4 address(es) and DHCPv4
> options,
> > >> > -      and similarly for DHCPv6 options.
> > >> > +      and similarly for DHCPv6 options. This table also adds flows
> for the
> > >> > +      logical ports of type <code>external</code>.
> > >> >      </p>
> > >> >
> > >> >      <ul>
> > >> > @@ -827,7 +828,39 @@ output;
> > >> >        </li>
> > >> >      </ul>
> > >> >
> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> > >> > +    <h3>Ingress table 16 External ports</h3>
> > >> > +
> > >> > +    <p>
> > >> > +      Traffic from the <code>external</code> logical ports enter
> the ingress
> > >> > +      datapath pipeline via the <code>localnet</code> port. This
> table adds the
> > >> > +      below logical flows to handle the traffic from these ports.
> > >> > +    </p>
> > >> > +
> > >> > +    <ul>
> > >> > +      <li>
> > >> > +        <p>
> > >> > +          A priority-100 flow is added for each
> <code>external</code> logical
> > >> > +          port which doesn't reside on a chassis to drop the
> ARP/IPv6 NS
> > >> > +          request to the router IP(s) (of the logical switch)
> which matches
> > >> > +          on the <code>inport</code> of the <code>external</code>
> logical port
> > >> > +          and the valid <code>eth.src</code> address(es) of the
> > >> > +          <code>external</code> logical port.
> > >> > +        </p>
> > >> > +
> > >> > +        <p>
> > >> > +          This flow guarantees that the ARP/NS request to the
> router IP
> > >> > +          address from the external ports is responded by only the
> chassis
> > >> > +          which has claimed these external ports. All the other
> chassis,
> > >> > +          drops these packets.
> > >> > +        </p>
> > >> > +      </li>
> > >> > +
> > >> > +      <li>
> > >> > +        A priority-0 flow that matches all packets to advances to
> table 17.
> > >> > +      </li>
> > >> > +    </ul>
> > >> > +
> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> > >> >
> > >> >      <p>
> > >> >        This table implements switching behavior.  It contains these
> logical
> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > >> > index 3fd8a8757..87208c6c1 100644
> > >> > --- a/ovn/northd/ovn-northd.c
> > >> > +++ b/ovn/northd/ovn-northd.c
> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
> > >> >                                                                            \
> > >> >      /* Logical switch egress stages. */                                   \
> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
> > >> >      return !lsp->up || *lsp->up;
> > >> >  }
> > >> >
> > >> > +static bool
> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> > >> > +{
> > >> > +    return !strcmp(nbsp->type, "external");
> > >> > +}
> > >> > +
> > >> >  static bool
> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> > >> >                      struct ds *options_action, struct ds *response_action,
> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> > >> >           *  - port type is localport
> > >> >           */
> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> > >> > -            strcmp(op->nbsp->type, "localport")) {
> > >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
> > >>
> > >> Sorry that I missed this in last review. The && condition has problem.
> > >> It will cause ARP responder flows added for all lports that are not
> > >> external. I think it should be || here.
> > >
> > >
> > > Agree. To make it easier to read, I will add a new "if" with continue - below this one for
> > > external port types.
> > >
> > >
> > >>
> > >>
> > >> >              continue;
> > >> >          }
> > >> >
> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> > >> >              continue;
> > >> >          }
> > >> >
> > >> > +        bool is_external = lsp_is_external(op->nbsp);
> > >> > +        if (is_external && !op->od->localnet_port) {
> > >> > +            /* If it's an external port and there is no localnet port
> > >> > +             * ignore it. */
> > >> > +            continue;
> > >> > +        }
> > >> > +
> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > >> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> > >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> > >> >                      ds_put_format(
> > >> >                          &match, "inport == %s && eth.src == %s && "
> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> > >> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
> > >> > -                        op->lsp_addrs[i].ea_s);
> > >> > +                        "udp.src == 68 && udp.dst == 67",
> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
> > >>
> > >> No change here?
> > >
> > >
> > > I think it's unwanted and unrelated change. I will correct it.
> > >>
> > >> >
> > >> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
> > >> >                                    100, ds_cstr(&match),
> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> > >> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
> > >> >       * next. (priority 0).
> > >> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
> > >> > -     * (priority 0).*/
> > >> > +     * (priority 0).
> > >> > +     * Ingress table 16 - External port handling, by default goto next.
> > >> > +     * (priority 0). */
> > >> >
> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> > >> >          if (!od->nbs) {
> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
> > >> >      }
> > >> >
> > >> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> > >> > +           continue;
> > >> > +        }
> > >> > +
> > >> > +        /* Table 16: External port. Drop ARP request for router ips from
> > >> > +         * external ports  on chassis not binding those ports.
> > >> > +         * This makes the router pipeline to be run only on the chassis
> > >> > +         * binding the external ports. */
> > >> > +
> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> > >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
> > >> > +                         l++) {
> > >> > +                        ds_clear(&match);
> > >> > +                        ds_put_cstr(&match, "ip4");
> > >> > +                        ds_put_format(
> > >> > +                            &match, "inport == %s && eth.src == %s"
> > >> > +                            " && !is_chassis_resident(%s)"
> > >> > +                            " && arp.tpa == %s && arp.op == 1",
> > >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
> > >>
> > >> I believe the inport should match the localnet port's json_key here,
> > >> since it is coming from a localnet port.
> > >
> > >
> > > Both would work. If you see the code in lflow.c in this patch - it will get the tunnel
> > > key of the localnet port if the port_binding type is "external".
> > >
> > > That's how even the DHCP requests are handled. ovn-controller will translate
> > > the logical flows with action "put_dhcp_opts" only the chassis claiming the
> > > external ports.
> >
> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
> > slipped my mind. Thanks for explaining.
>
> I thought about it a second time, and I'd suggest doing the conversion here
> in northd instead of ovn-controller, for two reasons:
>
> 1. In ovn-controller there is no extra context, so it just blindly
> translates all references to the external logical port into the localnet port
> key. This could lead to unexpected behavior. For example, if someone
> uses an external logical port in an ACL match condition, the match condition
> would then apply to all packets to/from the localnet port, which is
> definitely unwanted. (At the same time, it would be better to document
> that features like port-security and ACLs should not be used for external
> logical ports.)
>
>
That's not how it works in the present patch. Let's say you have two chassis,
hv1 and hv2, an external port sw0-ext1, and a localnet port "ln-public".
If requested-chassis is set to hv1, then all the logical flows with the
match "inport == sw0-ext1" will be converted to OF flows only on hv1, because
this port is bound by hv1 and the function lookup_port_cb() returns true only
on hv1. On hv2, lookup_port_cb() returns false.

If we want to do the conversion in ovn-northd.c, the match condition would
have to be "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ..."
instead of the present one, "inport == sw0-ext1 && ...".

The ACL match condition would not be an issue either, for the reason
mentioned above: the ACL flows will be applied only on the chassis binding
the external port.

The test case added checks that the OF flows are applied only on the bound
chassis.

I think it is better to do it in ovn-controller instead of ovn-northd.
Please let me know if you still have any concerns.

Thanks
Numan


> 2. A less important reason is that it is better to do it at an earlier stage
> than a later one. northd handles common processing. This part of the logic is
> common to all chassis, so it would be better if we explicitly
> handle it in northd instead of letting every chassis process it. And the
> change in northd would likely be simpler than in ovn-controller.
>
> Thanks,
> Han
>
Han Zhou Jan. 18, 2019, 7:01 p.m. UTC | #8
On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >
>> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> > >
>> > >
>> > >
>> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
>> > >>
>> > >> Hi Numan,
>> > >>
>> > >> With v5 the new test case "external logical port" fails.
>> > >> And please see more comments inlined.
>> > >>
>> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>> > >> >
>> > >> > From: Numan Siddique <nusiddiq@redhat.com>
>> > >> >
>> > >> > In the case of OpenStack + OVN, when the VMs are booted on
>> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
>> > >> > for these VMs. When these VMs sends DHCPv4, DHPCv6 or IPv6
>> > >> > Router Solicitation requests, the local ovn-controller
>> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
>> > >> > service needs to be run to serve these requests.
>> > >> >
>> > >> > With the new logical port type - 'external', OVN itself can
>> > >> > handle these requests avoiding the need to deploy any
>> > >> > external services like neutron dhcp agent.
>> > >> >
>> > >> > To make use of this feature, CMS has to
>> > >> >  - create a logical port for such VMs
>> > >> >  - set the type to 'external'
>> > >> >  - set requested-chassis="<chassis-name>" in the options
>> > >> >    column.
>> > >> >  - create a localnet port for the logical switch
>> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
>> > >> >
>> > >> > When the ovn-controller running in that 'chassis', detects
>> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>> > >> > flows. Since the packet enters the logical switch pipeline
>> > >> > via the localnet port, the inport register (reg14) is set
>> > >> > to the tunnel key of localnet port in the match conditions.
>> > >> >
>> > >> > In case the chassis goes down for some reason, it is the
>> > >> > responsibility of CMS to change the 'requested-chassis'
>> > >> > option to some other active chassis, so that it can serve
>> > >> > these requests.
>> > >> >
>> > >> > When the VM with the external port, sends an ARP request for
>> > >> > the router ips, only the chassis which has claimed the port,
>> > >> > will reply to the ARP requests. Rest of the chassis on
>> > >> > receiving these packets drop them in the ingress switch
>> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
>> > >> > before S_SWITCH_IN_L2_LKUP.
>> > >> >
>> > >> > This would guarantee that only the chassis which has claimed
>> > >> > the external ports will run the router datapath pipeline.
>> > >> >
>> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> > >> > ---
>> > >> >
>> > >> > v4 -> v5
>> > >> > ------
>> > >> >   * Addressed review comments from Han Zhou.
>> > >> >
>> > >> > v3 -> v4
>> > >> > ------
>> > >> >   * Updated the documention as per Han Zhou's suggestion.
>> > >> >
>> > >> > v2 -> v3
>> > >> > -------
>> > >> >   * Rebased
>> > >> >
>> > >> >  ovn/controller/binding.c        |  12 +
>> > >> >  ovn/controller/lflow.c          |  41 ++-
>> > >> >  ovn/controller/lflow.h          |   2 +
>> > >> >  ovn/controller/lport.c          |  26 ++
>> > >> >  ovn/controller/lport.h          |   5 +
>> > >> >  ovn/controller/ovn-controller.c |   6 +
>> > >> >  ovn/lib/ovn-util.c              |   1 +
>> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
>> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
>> > >> >  ovn/ovn-nb.xml                  |  47 +++
>> > >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
>> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
>> > >> >
>> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> > >> > index 021ecddcf..64e605b92 100644
>> > >> > --- a/ovn/controller/binding.c
>> > >> > +++ b/ovn/controller/binding.c
>> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
>> > >> >           * for them. */
>> > >> >          sset_add(local_lports, binding_rec->logical_port);
>> > >> >          our_chassis = false;
>> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
>> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
>> > >> > +                                          "requested-chassis");
>> > >> > +        our_chassis = chassis_id && (
>> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
>> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
>> > >> > +        if (our_chassis) {
>> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>> > >> > +                               sbrec_port_binding_by_datapath,
>> > >> > +                               sbrec_port_binding_by_name,
>> > >> > +                               binding_rec->datapath, true, local_datapaths);
>> > >> > +        }
>> > >> >      }
>> > >> >
>> > >> >      if (our_chassis
>> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>> > >> > index 8db81927e..98e8ed3b9 100644
>> > >> > --- a/ovn/controller/lflow.c
>> > >> > +++ b/ovn/controller/lflow.c
>> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
>> > >> >  struct lookup_port_aux {
>> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>> > >> >      const struct sbrec_datapath_binding *dp;
>> > >> > +    const struct sbrec_chassis *chassis;
>> > >> >  };
>> > >> >
>> > >> >  struct condition_aux {
>> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >      const struct sbrec_logical_flow *,
>> > >> >      const struct hmap *local_datapaths,
>> > >> >      const struct sbrec_chassis *,
>> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
>> > >> >      const struct sbrec_port_binding *pb
>> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>> > >> >      if (pb && pb->datapath == aux->dp) {
>> > >> > -        *portp = pb->tunnel_key;
>> > >> > -        return true;
>> > >> > +        if (strcmp(pb->type, "external")) {
>> > >> > +            *portp = pb->tunnel_key;
>> > >> > +            return true;
>> > >> > +        }
>> > >> > +        const char *chassis_id = smap_get(&pb->options,
>> > >> > +                                          "requested-chassis");
>> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
>> > >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
>> > >> > +            const struct sbrec_port_binding *localnet_pb
>> > >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>> > >> > +                                       aux->sbrec_port_binding_by_type,
>> > >> > +                                       aux->dp->tunnel_key, "localnet");
>> > >> > +            if (localnet_pb) {
>> > >> > +                *portp = localnet_pb->tunnel_key;
>> > >> > +                return true;
>> > >> > +            }
>> > >> > +        }
>> > >> > +        return false;
>> > >> >      }
>> > >> >
>> > >> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
>> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
>> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>> > >> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
>> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
>> > >> >          consider_logical_flow(sbrec_chassis_by_name,
>> > >> >                                sbrec_multicast_group_by_name_datapath,
>> > >> >                                sbrec_port_binding_by_name,
>> > >> > +                              sbrec_port_binding_by_type,
>> > >> > +                              sbrec_datapath_binding_by_key,
>> > >> >                                lflow, local_datapaths,
>> > >> >                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
>> > >> >                                addr_sets, port_groups, active_tunnels,
>> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
>> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >      const struct sbrec_logical_flow *lflow,
>> > >> >      const struct hmap *local_datapaths,
>> > >> >      const struct sbrec_chassis *chassis,
>> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
>> > >> >          .sbrec_multicast_group_by_name_datapath
>> > >> >              = sbrec_multicast_group_by_name_datapath,
>> > >> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
>> > >> > -        .dp = lflow->logical_datapath
>> > >> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
>> > >> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
>> > >> > +        .dp = lflow->logical_datapath,
>> > >> > +        .chassis = chassis
>> > >> >      };
>> > >> >      struct condition_aux cond_aux = {
>> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>> > >> > @@ -463,6 +493,8 @@ void
>> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
>> > >> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> > >> >            const struct sbrec_logical_flow_table *logical_flow_table,
>> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >
>> > >> >      add_logical_flows(sbrec_chassis_by_name,
>> > >> >                        sbrec_multicast_group_by_name_datapath,
>> > >> > -                      sbrec_port_binding_by_name, dhcp_options_table,
>> > >> > +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
>> > >> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
>> > >> >                        dhcpv6_options_table, logical_flow_table,
>> > >> >                        local_datapaths, chassis, addr_sets, port_groups,
>> > >> >                        active_tunnels, local_lport_ids, flow_table, group_table,
>> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>> > >> > index d19338140..b2911e0eb 100644
>> > >> > --- a/ovn/controller/lflow.h
>> > >> > +++ b/ovn/controller/lflow.h
>> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
>> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> > >> >                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> > >> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> > >> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >                 const struct sbrec_dhcp_options_table *,
>> > >> >                 const struct sbrec_dhcpv6_options_table *,
>> > >> >                 const struct sbrec_logical_flow_table *,
>> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>> > >> > index cc5c5fbb2..9c827d9b0 100644
>> > >> > --- a/ovn/controller/lport.c
>> > >> > +++ b/ovn/controller/lport.c
>> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >      return retval;
>> > >> >  }
>> > >> >
>> > >> > +const struct sbrec_port_binding *
>> > >> > +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +                     uint64_t dp_key, const char *port_type)
>> > >> > +{
>> > >> > +    /* Lookup datapath corresponding to dp_key. */
>> > >> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
>> > >> > +        sbrec_datapath_binding_by_key, dp_key);
>> > >> > +    if (!db) {
>> > >> > +        return NULL;
>> > >> > +    }
>> > >> > +
>> > >> > +    /* Build key for an indexed lookup. */
>> > >> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
>> > >> > +            sbrec_port_binding_by_type);
>> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
>> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
>> > >> > +
>> > >> > +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
>> > >> > +            sbrec_port_binding_by_type, pb);
>> > >> > +
>> > >> > +    sbrec_port_binding_index_destroy_row(pb);
>> > >> > +
>> > >> > +    return retval;
>> > >> > +}
>> > >> > +
>> > >> >  const struct sbrec_datapath_binding *
>> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> >                         uint64_t dp_key)
>> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>> > >> > index 7dcd5bee0..2d49792f6 100644
>> > >> > --- a/ovn/controller/lport.h
>> > >> > +++ b/ovn/controller/lport.h
>> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>> > >> >      uint64_t dp_key, uint64_t port_key);
>> > >> >
>> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
>> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> > >> > +    uint64_t dp_key, const char *port_type);
>> > >> > +
>> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
>> > >> >
>> > >> > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
>> > >> > index 4e9a5865f..5aab9142f 100644
>> > >> > --- a/ovn/controller/ovn-controller.c
>> > >> > +++ b/ovn/controller/ovn-controller.c
>> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>> > >> >       * ports that have a Gateway_Chassis that point's to our own
>> > >> >       * chassis */
>> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
>> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
>> > >> >      if (chassis) {
>> > >> >          /* This should be mostly redundant with the other clauses for port
>> > >> >           * bindings, but it allows us to catch any ports that are assigned to
>> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> > >> >                                    &sbrec_port_binding_col_datapath);
>> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> > >> > +                                  &sbrec_port_binding_col_type);
>> > >>
>> > >> This index is used with two columns: datapath_binding and type, so it
>> > >> should be created with both columns using create2.
>> > >>
>> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> > >> >                                    &sbrec_datapath_binding_col_tunnel_key);
>> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>> > >> >                              sbrec_chassis_by_name,
>> > >> >                              sbrec_multicast_group_by_name_datapath,
>> > >> >                              sbrec_port_binding_by_name,
>> > >> > +                            sbrec_port_binding_by_type,
>> > >> > +                            sbrec_datapath_binding_by_key,
>> > >> >                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>> > >> >                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>> > >> >                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>> > >> > index aa03919bb..a9d4b8736 100644
>> > >> > --- a/ovn/lib/ovn-util.c
>> > >> > +++ b/ovn/lib/ovn-util.c
>> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>> > >> >      "localport",
>> > >> >      "router",
>> > >> >      "vtep",
>> > >> > +    "external",
>> > >> >  };
>> > >> >
>> > >> >  bool
>> > >> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>> > >> > index 392a5efc9..c8883d60d 100644
>> > >> > --- a/ovn/northd/ovn-northd.8.xml
>> > >> > +++ b/ovn/northd/ovn-northd.8.xml
>> > >> > @@ -626,7 +626,8 @@ nd_na_router {
>> > >> >      <p>
>> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
>> > >> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
>> > >> > -      and similarly for DHCPv6 options.
>> > >> > +      and similarly for DHCPv6 options. This table also adds flows for the
>> > >> > +      logical ports of type <code>external</code>.
>> > >> >      </p>
>> > >> >
>> > >> >      <ul>
>> > >> > @@ -827,7 +828,39 @@ output;
>> > >> >        </li>
>> > >> >      </ul>
>> > >> >
>> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>> > >> > +    <h3>Ingress table 16 External ports</h3>
>> > >> > +
>> > >> > +    <p>
>> > >> > +      Traffic from the <code>external</code> logical ports enter the ingress
>> > >> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
>> > >> > +      below logical flows to handle the traffic from these ports.
>> > >> > +    </p>
>> > >> > +
>> > >> > +    <ul>
>> > >> > +      <li>
>> > >> > +        <p>
>> > >> > +          A priority-100 flow is added for each <code>external</code> logical
>> > >> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
>> > >> > +          request to the router IP(s) (of the logical switch) which matches
>> > >> > +          on the <code>inport</code> of the <code>external</code> logical port
>> > >> > +          and the valid <code>eth.src</code> address(es) of the
>> > >> > +          <code>external</code> logical port.
>> > >> > +        </p>
>> > >> > +
>> > >> > +        <p>
>> > >> > +          This flow guarantees that the ARP/NS request to the router IP
>> > >> > +          address from the external ports is responded by only the chassis
>> > >> > +          which has claimed these external ports. All the other chassis,
>> > >> > +          drops these packets.
>> > >> > +        </p>
>> > >> > +      </li>
>> > >> > +
>> > >> > +      <li>
>> > >> > +        A priority-0 flow that matches all packets to advances to table 17.
>> > >> > +      </li>
>> > >> > +    </ul>
>> > >> > +
>> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>> > >> >
>> > >> >      <p>
>> > >> >        This table implements switching behavior.  It contains these logical
>> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> > >> > index 3fd8a8757..87208c6c1 100644
>> > >> > --- a/ovn/northd/ovn-northd.c
>> > >> > +++ b/ovn/northd/ovn-northd.c
>> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
>> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
>> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
>> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
>> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
>> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
>> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
>> > >> >                                                                            \
>> > >> >      /* Logical switch egress stages. */                                   \
>> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
>> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
>> > >> >      return !lsp->up || *lsp->up;
>> > >> >  }
>> > >> >
>> > >> > +static bool
>> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>> > >> > +{
>> > >> > +    return !strcmp(nbsp->type, "external");
>> > >> > +}
>> > >> > +
>> > >> >  static bool
>> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>> > >> >                      struct ds *options_action, struct ds *response_action,
>> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> > >> >           *  - port type is localport
>> > >> >           */
>> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
>> > >> > -            strcmp(op->nbsp->type, "localport")) {
>> > >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
>> > >>
>> > >> Sorry that I missed this in last review. The && condition has problem.
>> > >> It will cause ARP responder flows added for all lports that are not
>> > >> external. I think it should be || here.
>> > >
>> > >
>> > > Agree. To make it easier to read, I will add a new "if" with continue - below this one for
>> > > external port types.
>> > >
>> > >
>> > >>
>> > >>
>> > >> >              continue;
>> > >> >          }
>> > >> >
>> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> > >> >              continue;
>> > >> >          }
>> > >> >
>> > >> > +        bool is_external = lsp_is_external(op->nbsp);
>> > >> > +        if (is_external && !op->od->localnet_port) {
>> > >> > +            /* If it's an external port and there is no localnet port
>> > >> > +             * ignore it. */
>> > >> > +            continue;
>> > >> > +        }
>> > >> > +
>> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> > >> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>> > >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
>> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> > >> >                      ds_put_format(
>> > >> >                          &match, "inport == %s && eth.src == %s && "
>> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>> > >> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
>> > >> > -                        op->lsp_addrs[i].ea_s);
>> > >> > +                        "udp.src == 68 && udp.dst == 67",
>> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>> > >>
>> > >> No change here?
>> > >
>> > >
>> > > I think it's an unwanted and unrelated change. I will correct it.
>> > >>
>> > >> >
>> > >> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
>> > >> >                                    100, ds_cstr(&match),
>> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> > >> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
>> > >> >       * next. (priority 0).
>> > >> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
>> > >> > -     * (priority 0).*/
>> > >> > +     * (priority 0).
>> > >> > +     * Ingress table 16 - External port handling, by default goto next.
>> > >> > +     * (priority 0). */
>> > >> >
>> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>> > >> >          if (!od->nbs) {
>> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
>> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
>> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
>> > >> >      }
>> > >> >
>> > >> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
>> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
>> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>> > >> > +           continue;
>> > >> > +        }
>> > >> > +
>> > >> > +        /* Table 16: External port. Drop ARP request for router ips from
>> > >> > +         * external ports  on chassis not binding those ports.
>> > >> > +         * This makes the router pipeline to be run only on the chassis
>> > >> > +         * binding the external ports. */
>> > >> > +
>> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
>> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>> > >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
>> > >> > +                         l++) {
>> > >> > +                        ds_clear(&match);
>> > >> > +                        ds_put_cstr(&match, "ip4");
>> > >> > +                        ds_put_format(
>> > >> > +                            &match, "inport == %s && eth.src == %s"
>> > >> > +                            " && !is_chassis_resident(%s)"
>> > >> > +                            " && arp.tpa == %s && arp.op == 1",
>> > >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>> > >>
>> > >> I believe the inport should match the localnet port's json_key here,
>> > >> since it is coming from a localnet port.
>> > >
>> > >
>> > > Both would work. If you see the code in lflow.c in this patch - it will get the tunnel
>> > > key of the localnet port if the port_binding type is "external".
>> > >
>> > > That's also how the DHCP requests are handled. ovn-controller will translate
>> > > the logical flows with action "put_dhcp_opts" only on the chassis claiming
>> > > the external ports.
>> >
>> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
>> > slipped my mind. Thanks for explaining.
>>
>> I thought about it a second time, and I'd suggest doing the conversion here
>> in northd instead of ovn-controller, for two reasons:
>>
>> 1. In ovn-controller there is no extra context, so it just blindly
>> translates all references to the external logical port into the localnet
>> port key. This could lead to unexpected behavior. For example, if someone
>> uses an external logical port in an ACL match condition, the match
>> condition would then apply to all packets to/from the localnet port, which
>> is definitely unwanted. (At the same time, it would be better to document
>> that features like port-security and ACLs should not be used for external
>> logical ports.)
>>
>
> That's not how it works in the present patch. Let's say you have 2 chassis,
> hv1 and hv2, an external port sw0-ext1 and a localnet port "ln-public".
> If the requested-chassis is set to hv1, then all the logical flows with the
> match "inport == sw0-ext1" will be converted to OF flows only on hv1, as this
> port is bound by hv1 and the function 'lookup_port_cb()' would return true
> only on hv1. On hv2, lookup_port_cb() would return false.

Yes, this is well understood.
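
To restate that per-chassis behavior as a toy model: the sketch below is
illustration only. The struct, the port table and every toy_* name are
made up and are not the patch's code; the lookup logic follows the
lookup_port_cb() hunk in lflow.c as I read it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Port_Binding row. */
struct toy_port {
    const char *name;
    const char *type;              /* "" (regular VIF), "localnet", "external" */
    const char *requested_chassis; /* Only meaningful for "external". */
    unsigned int tunnel_key;
};

/* One logical switch with the ports used in the example above. */
static const struct toy_port toy_ports[] = {
    { "sw0-lsp1",  "",         NULL,  1 },
    { "ln-public", "localnet", NULL,  2 },
    { "sw0-ext1",  "external", "hv1", 3 },
};

#define N_TOY_PORTS (sizeof toy_ports / sizeof toy_ports[0])

/* Returns the first port of the given type, or NULL. */
static const struct toy_port *
toy_find_by_type(const char *type)
{
    for (size_t i = 0; i < N_TOY_PORTS; i++) {
        if (!strcmp(toy_ports[i].type, type)) {
            return &toy_ports[i];
        }
    }
    return NULL;
}

/* Resolves 'port_name' as seen by 'chassis'.  A non-external port yields
 * its own key everywhere.  An external port yields the localnet port's key,
 * and only on the chassis named in requested_chassis. */
static bool
toy_lookup_port(const char *chassis, const char *port_name,
                unsigned int *portp)
{
    for (size_t i = 0; i < N_TOY_PORTS; i++) {
        const struct toy_port *p = &toy_ports[i];
        if (strcmp(p->name, port_name)) {
            continue;
        }
        if (strcmp(p->type, "external")) {
            *portp = p->tunnel_key;
            return true;
        }
        if (p->requested_chassis && !strcmp(p->requested_chassis, chassis)) {
            const struct toy_port *ln = toy_find_by_type("localnet");
            if (ln) {
                *portp = ln->tunnel_key;
                return true;
            }
        }
        return false;
    }
    return false;
}
```

In this model, toy_lookup_port("hv1", "sw0-ext1", &key) succeeds and
stores the key of "ln-public", while the same call for "hv2" fails, so
flows that reference sw0-ext1 would be skipped there.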

>
> If we want to do the conversion in ovn-northd.c the match condition would have to
> be - "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ...."
> instead of the present one  - "inport == sw0-ext1 && ...".

Yes, this is what I would suggest (see reason below).
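
For illustration, the northd side could format such a match roughly like
this. It is a sketch in plain C with snprintf() rather than the
ds_put_format() helper that ovn-northd.c uses; format_external_match() is
a hypothetical name, and the port names are passed unquoted here, whereas
op->json_key in the patch already carries its quotes.

```c
#include <stdio.h>

/* Writes a match that is scoped to the localnet inport and to the chassis
 * where the external port resides.  Returns what snprintf() returns: the
 * length the full string needs, which is >= 'size' on truncation. */
static int
format_external_match(char *buf, size_t size, const char *localnet_port,
                      const char *external_port, const char *eth_src)
{
    return snprintf(buf, size,
                    "inport == \"%s\" && is_chassis_resident(\"%s\")"
                    " && eth.src == %s",
                    localnet_port, external_port, eth_src);
}
```

Called with "ln-public", "sw0-ext1" and a source MAC, this produces a
match that names the localnet port explicitly, so nothing else that
refers to sw0-ext1 needs to be rewritten in ovn-controller.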

>
> And the ACL match condition would not be an issue for the reason mentioned
> above, i.e. the ACL flows will be applied only on the chassis binding the
> external port.

Here is the concern. For example, chassis A has a regular port sw0-lsp1
bound. Chassis A is also set as requested-chassis for the external port
sw0-ext1. Now the user adds an ACL: to-port, 1001, 'outport ==
"sw0-ext1"', drop. This would get translated to something like:
to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
the traffic from sw0-lsp1 to the bridged network? Even if it doesn't,
because of some subtle detail of the current implementation, I
would say it is risky and could lead to problems under certain
conditions, because the conversion in ovn-controller widens the
original intent. Doing it in northd, only for specific lflows,
would ensure it has an impact only on the intended use cases.
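
A toy model of that concern, as a sketch only: the key values and both
function names are made up, and it assumes that name resolution on
chassis A behaves as described above (sw0-ext1 resolves to the localnet
port's key). Whether the real pipeline ends up dropping such traffic
depends on details outside this model.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical tunnel keys for the ports in the example. */
enum { KEY_NONE = 0, KEY_LSP1 = 1, KEY_LN_PUBLIC = 2 };

/* Name resolution as seen on chassis A, which binds sw0-ext1: the external
 * port's name is an alias for the localnet port's key. */
static unsigned int
resolve_on_chassis_a(const char *port_name)
{
    if (!strcmp(port_name, "sw0-lsp1")) {
        return KEY_LSP1;
    }
    if (!strcmp(port_name, "ln-public") || !strcmp(port_name, "sw0-ext1")) {
        return KEY_LN_PUBLIC;
    }
    return KEY_NONE;
}

/* Models the ACL: to-port, 'outport == "<acl_outport>"', drop.
 * Returns true if a packet with logical outport 'pkt_outport_key' matches. */
static bool
acl_drops(const char *acl_outport, unsigned int pkt_outport_key)
{
    unsigned int key = resolve_on_chassis_a(acl_outport);
    return key != KEY_NONE && key == pkt_outport_key;
}
```

Here acl_drops("sw0-ext1", KEY_LN_PUBLIC) is true, so in the model a
packet from sw0-lsp1 heading out through ln-public matches an ACL that
was written only for sw0-ext1.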

>
> The test case added checks that the OF flows are applied only on the bound chassis.
>
> I think it is better to do it in ovn-controller instead of ovn-northd. Please let me know
> if you still have any concerns.
>
>
>
>> 2. A less important reason is that it is better to do it at an earlier
>> stage than a later one. northd handles common processing. This part of the
>> logic is common to all chassis, so it would be better to handle it
>> explicitly in northd, instead of letting every chassis process it. And the
>> change in northd would likely be simpler than in ovn-controller.

This is a less critical problem, but I think it is worth considering,
too. With the current logic, although the conversion takes effect
only if "is_chassis_resident()" is true, the code logic and
processing has to happen on every chassis.

>>
>> Thanks,
>> Han
Numan Siddique Jan. 18, 2019, 7:12 p.m. UTC | #9
On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:

> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >
> >
> >
> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>
> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
> >> >
> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
> >> > >>
> >> > >> Hi Numan,
> >> > >>
> >> > >> With v5 the new test case "external logical port" fails.
> >> > >> And please see more comments inlined.
> >> > >>
> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> >> > >> >
> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
> >> > >> >
> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
> >> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> >> > >> > Router Solicitation requests, the local ovn-controller
> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
> >> > >> > service needs to be run to serve these requests.
> >> > >> >
> >> > >> > With the new logical port type - 'external', OVN itself can
> >> > >> > handle these requests avoiding the need to deploy any
> >> > >> > external services like neutron dhcp agent.
> >> > >> >
> >> > >> > To make use of this feature, CMS has to
> >> > >> >  - create a logical port for such VMs
> >> > >> >  - set the type to 'external'
> >> > >> >  - set requested-chassis="<chassis-name>" in the options
> >> > >> >    column.
> >> > >> >  - create a localnet port for the logical switch
> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
> >> > >> >
> >> > >> > When the ovn-controller running in that 'chassis', detects
> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> >> > >> > flows. Since the packet enters the logical switch pipeline
> >> > >> > via the localnet port, the inport register (reg14) is set
> >> > >> > to the tunnel key of localnet port in the match conditions.
> >> > >> >
> >> > >> > In case the chassis goes down for some reason, it is the
> >> > >> > responsibility of CMS to change the 'requested-chassis'
> >> > >> > option to some other active chassis, so that it can serve
> >> > >> > these requests.
> >> > >> >
> >> > >> > When the VM with the external port sends an ARP request for
> >> > >> > the router IPs, only the chassis which has claimed the port
> >> > >> > will reply to the ARP requests. The rest of the chassis, on
> >> > >> > receiving these packets, drop them in the ingress switch
> >> > >> > datapath stage S_SWITCH_IN_EXTERNAL_PORT, which is just
> >> > >> > before S_SWITCH_IN_L2_LKUP.
> >> > >> >
> >> > >> > This would guarantee that only the chassis which has claimed
> >> > >> > the external ports will run the router datapath pipeline.
> >> > >> >
> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> >> > >> > ---
> >> > >> >
> >> > >> > v4 -> v5
> >> > >> > ------
> >> > >> >   * Addressed review comments from Han Zhou.
> >> > >> >
> >> > >> > v3 -> v4
> >> > >> > ------
> >> > >> >   * Updated the documentation as per Han Zhou's suggestion.
> >> > >> >
> >> > >> > v2 -> v3
> >> > >> > -------
> >> > >> >   * Rebased
> >> > >> >
> >> > >> >  ovn/controller/binding.c        |  12 +
> >> > >> >  ovn/controller/lflow.c          |  41 ++-
> >> > >> >  ovn/controller/lflow.h          |   2 +
> >> > >> >  ovn/controller/lport.c          |  26 ++
> >> > >> >  ovn/controller/lport.h          |   5 +
> >> > >> >  ovn/controller/ovn-controller.c |   6 +
> >> > >> >  ovn/lib/ovn-util.c              |   1 +
> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
> >> > >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
> >> > >> >
> >> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> >> > >> > index 021ecddcf..64e605b92 100644
> >> > >> > --- a/ovn/controller/binding.c
> >> > >> > +++ b/ovn/controller/binding.c
> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct
> ovsdb_idl_txn *ovnsb_idl_txn,
> >> > >> >           * for them. */
> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
> >> > >> >          our_chassis = false;
> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
> >> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
> >> > >> > +                                          "requested-chassis");
> >> > >> > +        our_chassis = chassis_id && (
> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
> >> > >> > +        if (our_chassis) {
> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> >> > >> > +                               sbrec_port_binding_by_datapath,
> >> > >> > +                               sbrec_port_binding_by_name,
> >> > >> > +                               binding_rec->datapath, true,
> local_datapaths);
> >> > >> > +        }
> >> > >> >      }
> >> > >> >
> >> > >> >      if (our_chassis
> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> >> > >> > index 8db81927e..98e8ed3b9 100644
> >> > >> > --- a/ovn/controller/lflow.c
> >> > >> > +++ b/ovn/controller/lflow.c
> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
> >> > >> >  struct lookup_port_aux {
> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath;
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> >> > >> >      const struct sbrec_datapath_binding *dp;
> >> > >> > +    const struct sbrec_chassis *chassis;
> >> > >> >  };
> >> > >> >
> >> > >> >  struct condition_aux {
> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > >> >      const struct sbrec_logical_flow *,
> >> > >> >      const struct hmap *local_datapaths,
> >> > >> >      const struct sbrec_chassis *,
> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char
> *port_name, unsigned int *portp)
> >> > >> >      const struct sbrec_port_binding *pb
> >> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name,
> port_name);
> >> > >> >      if (pb && pb->datapath == aux->dp) {
> >> > >> > -        *portp = pb->tunnel_key;
> >> > >> > -        return true;
> >> > >> > +        if (strcmp(pb->type, "external")) {
> >> > >> > +            *portp = pb->tunnel_key;
> >> > >> > +            return true;
> >> > >> > +        }
> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
> >> > >> > +                                          "requested-chassis");
> >> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> >> > >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
> >> > >> > +            const struct sbrec_port_binding *localnet_pb
> >> > >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> >> > >> > +                                       aux->sbrec_port_binding_by_type,
> >> > >> > +                                       aux->dp->tunnel_key, "localnet");
> >> > >> > +            if (localnet_pb) {
> >> > >> > +                *portp = localnet_pb->tunnel_key;
> >> > >> > +                return true;
> >> > >> > +            }
> >> > >> > +        }
> >> > >> > +        return false;
> >> > >> >      }
> >> > >> >
> >> > >> >      const struct sbrec_multicast_group *mg =
> mcgroup_lookup_by_dp_name(
> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
> >> > >> >      const struct sbrec_dhcpv6_options_table
> *dhcpv6_options_table,
> >> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
> >> > >> >
> sbrec_multicast_group_by_name_datapath,
> >> > >> >                                sbrec_port_binding_by_name,
> >> > >> > +                              sbrec_port_binding_by_type,
> >> > >> > +                              sbrec_datapath_binding_by_key,
> >> > >> >                                lflow, local_datapaths,
> >> > >> >                                chassis, &dhcp_opts,
> &dhcpv6_opts, &nd_ra_opts,
> >> > >> >                                addr_sets, port_groups,
> active_tunnels,
> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > >> >      const struct sbrec_logical_flow *lflow,
> >> > >> >      const struct hmap *local_datapaths,
> >> > >> >      const struct sbrec_chassis *chassis,
> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
> >> > >> >          .sbrec_multicast_group_by_name_datapath
> >> > >> >              = sbrec_multicast_group_by_name_datapath,
> >> > >> >          .sbrec_port_binding_by_name =
> sbrec_port_binding_by_name,
> >> > >> > -        .dp = lflow->logical_datapath
> >> > >> > +        .sbrec_port_binding_by_type =
> sbrec_port_binding_by_type,
> >> > >> > +        .sbrec_datapath_binding_by_key =
> sbrec_datapath_binding_by_key,
> >> > >> > +        .dp = lflow->logical_datapath,
> >> > >> > +        .chassis = chassis
> >> > >> >      };
> >> > >> >      struct condition_aux cond_aux = {
> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> >> > >> > @@ -463,6 +493,8 @@ void
> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> > >> >            struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > >> >            const struct sbrec_dhcp_options_table
> *dhcp_options_table,
> >> > >> >            const struct sbrec_dhcpv6_options_table
> *dhcpv6_options_table,
> >> > >> >            const struct sbrec_logical_flow_table
> *logical_flow_table,
> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> >> > >> >
> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
> >> > >> >                        sbrec_multicast_group_by_name_datapath,
> >> > >> > -                      sbrec_port_binding_by_name,
> dhcp_options_table,
> >> > >> > +                      sbrec_port_binding_by_name,
> sbrec_port_binding_by_type,
> >> > >> > +                      sbrec_datapath_binding_by_key,
> dhcp_options_table,
> >> > >> >                        dhcpv6_options_table, logical_flow_table,
> >> > >> >                        local_datapaths, chassis, addr_sets,
> port_groups,
> >> > >> >                        active_tunnels, local_lport_ids,
> flow_table, group_table,
> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> >> > >> > index d19338140..b2911e0eb 100644
> >> > >> > --- a/ovn/controller/lflow.h
> >> > >> > +++ b/ovn/controller/lflow.h
> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
> >> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >> > >> >                 struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >> > >> >                 struct ovsdb_idl_index
> *sbrec_port_binding_by_name,
> >> > >> > +               struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> >> > >> > +               struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >> > >> >                 const struct sbrec_dhcp_options_table *,
> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
> >> > >> >                 const struct sbrec_logical_flow_table *,
> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> >> > >> > index cc5c5fbb2..9c827d9b0 100644
> >> > >> > --- a/ovn/controller/lport.c
> >> > >> > +++ b/ovn/controller/lport.c
> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >> > >> >      return retval;
> >> > >> >  }
> >> > >> >
> >> > >> > +const struct sbrec_port_binding *
> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >> > >> > +                     struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> >> > >> > +                     uint64_t dp_key, const char *port_type)
> >> > >> > +{
> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
> >> > >> > +    const struct sbrec_datapath_binding *db =
> datapath_lookup_by_key(
> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
> >> > >> > +    if (!db) {
> >> > >> > +        return NULL;
> >> > >> > +    }
> >> > >> > +
> >> > >> > +    /* Build key for an indexed lookup. */
> >> > >> > +    struct sbrec_port_binding *pb =
> sbrec_port_binding_index_init_row(
> >> > >> > +            sbrec_port_binding_by_type);
> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
> >> > >> > +
> >> > >> > +    const struct sbrec_port_binding *retval =
> sbrec_port_binding_index_find(
> >> > >> > +            sbrec_port_binding_by_type, pb);
> >> > >> > +
> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
> >> > >> > +
> >> > >> > +    return retval;
> >> > >> > +}
> >> > >> > +
> >> > >> >  const struct sbrec_datapath_binding *
> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >> > >> >                         uint64_t dp_key)
> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> >> > >> > index 7dcd5bee0..2d49792f6 100644
> >> > >> > --- a/ovn/controller/lport.h
> >> > >> > +++ b/ovn/controller/lport.h
> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding
> *lport_lookup_by_key(
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> >> > >> >      uint64_t dp_key, uint64_t port_key);
> >> > >> >
> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >> > >> > +    uint64_t dp_key, const char *port_type);
> >> > >> > +
> >> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> uint64_t dp_key);
> >> > >> >
> >> > >> > diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> >> > >> > index 4e9a5865f..5aab9142f 100644
> >> > >> > --- a/ovn/controller/ovn-controller.c
> >> > >> > +++ b/ovn/controller/ovn-controller.c
> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl
> *ovnsb_idl,
> >> > >> >       * ports that have a Gateway_Chassis that point's to our own
> >> > >> >       * chassis */
> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "chassisredirect");
> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "external");
> >> > >> >      if (chassis) {
> >> > >> >          /* This should be mostly redundant with the other
> clauses for port
> >> > >> >           * bindings, but it allows us to catch any ports that
> are assigned to
> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> > >> >
> &sbrec_port_binding_col_datapath);
> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> > >> > +                                  &sbrec_port_binding_col_type);
> >> > >>
> >> > >> This index is used with two columns: datapath_binding and type, so it
> >> > >> should be created with both columns using create2.
> >> > >>
> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >> > >> >
> &sbrec_datapath_binding_col_tunnel_key);
> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> >> > >> >                              sbrec_chassis_by_name,
> >> > >> >
> sbrec_multicast_group_by_name_datapath,
> >> > >> >                              sbrec_port_binding_by_name,
> >> > >> > +                            sbrec_port_binding_by_type,
> >> > >> > +                            sbrec_datapath_binding_by_key,
> >> > >> >
> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> >> > >> >
> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> >> > >> >
> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> >> > >> > index aa03919bb..a9d4b8736 100644
> >> > >> > --- a/ovn/lib/ovn-util.c
> >> > >> > +++ b/ovn/lib/ovn-util.c
> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
> >> > >> >      "localport",
> >> > >> >      "router",
> >> > >> >      "vtep",
> >> > >> > +    "external",
> >> > >> >  };
> >> > >> >
> >> > >> >  bool
> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml
> b/ovn/northd/ovn-northd.8.xml
> >> > >> > index 392a5efc9..c8883d60d 100644
> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
> >> > >> >      <p>
> >> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet
> from the
> >> > >> >        logical ports configured with IPv4 address(es) and DHCPv4
> options,
> >> > >> > -      and similarly for DHCPv6 options.
> >> > >> > +      and similarly for DHCPv6 options. This table also adds
> flows for the
> >> > >> > +      logical ports of type <code>external</code>.
> >> > >> >      </p>
> >> > >> >
> >> > >> >      <ul>
> >> > >> > @@ -827,7 +828,39 @@ output;
> >> > >> >        </li>
> >> > >> >      </ul>
> >> > >> >
> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> >> > >> > +    <h3>Ingress table 16 External ports</h3>
> >> > >> > +
> >> > >> > +    <p>
> >> > >> > +      Traffic from the <code>external</code> logical ports
> enter the ingress
> >> > >> > +      datapath pipeline via the <code>localnet</code> port.
> This table adds the
> >> > >> > +      below logical flows to handle the traffic from these
> ports.
> >> > >> > +    </p>
> >> > >> > +
> >> > >> > +    <ul>
> >> > >> > +      <li>
> >> > >> > +        <p>
> >> > >> > +          A priority-100 flow is added for each
> <code>external</code> logical
> >> > >> > +          port which doesn't reside on a chassis to drop the
> ARP/IPv6 NS
> >> > >> > +          request to the router IP(s) (of the logical switch)
> which matches
> >> > >> > +          on the <code>inport</code> of the
> <code>external</code> logical port
> >> > >> > +          and the valid <code>eth.src</code> address(es) of the
> >> > >> > +          <code>external</code> logical port.
> >> > >> > +        </p>
> >> > >> > +
> >> > >> > +        <p>
> >> > >> > +          This flow guarantees that the ARP/NS request to the
> router IP
> >> > >> > +          address from the external ports is responded by only
> the chassis
> >> > >> > +          which has claimed these external ports. All the other
> chassis,
> >> > >> > +          drops these packets.
> >> > >> > +        </p>
> >> > >> > +      </li>
> >> > >> > +
> >> > >> > +      <li>
> >> > >> > +        A priority-0 flow that matches all packets to advances
> to table 17.
> >> > >> > +      </li>
> >> > >> > +    </ul>
> >> > >> > +
> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> >> > >> >
> >> > >> >      <p>
> >> > >> >        This table implements switching behavior.  It contains
> these logical
> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> >> > >> > index 3fd8a8757..87208c6c1 100644
> >> > >> > --- a/ovn/northd/ovn-northd.c
> >> > >> > +++ b/ovn/northd/ovn-northd.c
> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
> "ls_in_dhcp_response") \
> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14,
> "ls_in_dns_lookup")    \
> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
> "ls_in_dns_response")  \
> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16,
> "ls_in_l2_lkup")       \
> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
> "ls_in_external_port") \
> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17,
> "ls_in_l2_lkup")       \
> >> > >> >
>           \
> >> > >> >      /* Logical switch egress stages. */
>            \
> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0,
> "ls_out_pre_lb")         \
> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct
> nbrec_logical_switch_port *lsp)
> >> > >> >      return !lsp->up || *lsp->up;
> >> > >> >  }
> >> > >> >
> >> > >> > +static bool
> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> >> > >> > +{
> >> > >> > +    return !strcmp(nbsp->type, "external");
> >> > >> > +}
> >> > >> > +
> >> > >> >  static bool
> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >> > >> >                      struct ds *options_action, struct ds
> *response_action,
> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >> > >> >           *  - port type is localport
> >> > >> >           */
> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
> >> > >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
> >> > >>
> >> > >> Sorry that I missed this in the last review. The && condition has a
> >> > >> problem. It will cause ARP responder flows to be added for all lports
> >> > >> that are not external. I think it should be || here.
> >> > >
> >> > >
> >> > > Agreed. To make it easier to read, I will add a new "if" with a continue
> >> > > below this one, for external port types.
> >> > >
> >> > >
> >> > >>
> >> > >>
> >> > >> >              continue;
> >> > >> >          }
> >> > >> >
> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >> > >> >              continue;
> >> > >> >          }
> >> > >> >
> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
> >> > >> > +        if (is_external && !op->od->localnet_port) {
> >> > >> > +            /* If it's an external port and there is no
> localnet port
> >> > >> > +             * ignore it. */
> >> > >> > +            continue;
> >> > >> > +        }
> >> > >> > +
> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >> > >> >              for (size_t j = 0; j <
> op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >> > >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >> > >> >                      ds_put_format(
> >> > >> >                          &match, "inport == %s && eth.src == %s
> && "
> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
> 255.255.255.255 && "
> >> > >> > -                        "udp.src == 68 && udp.dst == 67",
> op->json_key,
> >> > >> > -                        op->lsp_addrs[i].ea_s);
> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
> >> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
> >> > >>
> >> > >> No change here?
> >> > >
> >> > >
> >> > > I think it's unwanted and unrelated change. I will correct it.
> >> > >>
> >> > >> >
> >> > >> >                      ovn_lflow_add(lflows, op->od,
> S_SWITCH_IN_DHCP_OPTIONS,
> >> > >> >                                    100, ds_cstr(&match),
> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >> > >> >      /* Ingress table 12 and 13: DHCP options and response, by
> default goto
> >> > >> >       * next. (priority 0).
> >> > >> >       * Ingress table 14 and 15: DNS lookup and response, by
> default goto next.
> >> > >> > -     * (priority 0).*/
> >> > >> > +     * (priority 0).
> >> > >> > +     * Ingress table 16 - External port handling, by default
> goto next.
> >> > >> > +     * (priority 0). */
> >> > >> >
> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >> > >> >          if (!od->nbs) {
> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0,
> "1", "next;");
> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0,
> "1", "next;");
> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0,
> "1", "next;");
> >> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0,
> "1", "next;");
> >> > >> >      }
> >> > >> >
> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and
> multicast handling
> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> >> > >> > +           continue;
> >> > >> > +        }
> >> > >> > +
> >> > >> > +        /* Table 16: External port. Drop ARP request for router
> ips from
> >> > >> > +         * external ports  on chassis not binding those ports.
> >> > >> > +         * This makes the router pipeline to be run only on the
> chassis
> >> > >> > +         * binding the external ports. */
> >> > >> > +
> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++)
> {
> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
> >> > >> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv4_addrs;
> >> > >> > +                         l++) {
> >> > >> > +                        ds_clear(&match);
> >> > >> > +                        ds_put_cstr(&match, "ip4");
> >> > >> > +                        ds_put_format(
> >> > >> > +                            &match, "inport == %s && eth.src ==
> %s"
> >> > >> > +                            " && !is_chassis_resident(%s)"
> >> > >> > +                            " && arp.tpa == %s && arp.op == 1",
> >> > >> > +                            op->json_key,
> op->lsp_addrs[i].ea_s, op->json_key,
> >> > >>
> >> > >> I believe the inport should match the localnet port's json_key
> here,
> >> > >> since it is coming from a localnet port.
> >> > >
> >> > >
> >> > > Both would work. If you see the code in lflow.c in this patch, it
> >> > > will get the tunnel key of the localnet port if the port_binding
> >> > > type is "external".
> >> > >
> >> > > That's how even the DHCP requests are handled. ovn-controller will
> >> > > translate the logical flows with action "put_dhcp_opts" only on the
> >> > > chassis claiming the external ports.
> >> >
> >> > Oh, yes you are right. Actually I read that part in v4 and it somehow
> >> > slipped my mind. Thanks for explaining.
> >>
> >> I thought about it a second time, and I'd suggest doing the conversion
> >> here in northd instead of ovn-controller, for two reasons:
> >>
> >> 1. In ovn-controller there is no extra context, so it just blindly
> >> translates all references to the external logical port into the localnet
> >> port key. This could lead to unexpected behavior, for example if someone
> >> uses an external logical port in an ACL match condition. The match
> >> condition would then apply to all packets to/from the localnet port,
> >> which is definitely unwanted. (At the same time, it would be better to
> >> document that features like port-security and ACLs should not be used
> >> for external logical ports.)
> >>
> >
> > That's not how it works in the present patch. Let's say you have 2
> > chassis, hv1 and hv2, an external port sw0-ext1, and a localnet port
> > "ln-public". If requested-chassis is set to hv1, then all the logical
> > flows with the match "inport == sw0-ext1" will be converted to OF flows
> > only on hv1, as this port is bound by hv1 and the function
> > 'lookup_port_cb()' would return true only on hv1. On hv2,
> > lookup_port_cb() would return false.
>
> Yes, this is well understood.
>
> >
> > If we want to do the conversion in ovn-northd.c the match condition
> would have to
> > be - "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ...."
> > instead of the present one  - "inport == sw0-ext1 && ...".
>
> Yes, this is what I would suggest (see reason below).
>
> >
> > And the ACL match condition would not be an issue because of the above
> mentioned
> > reason. i.e the ACL flows will be applied only on the chassis binding
> the external
> > port.
>
> Here is the concern. For example, chassis A has the regular port sw0-lsp1
> bound. Chassis A is also set as requested-chassis for the external port
> sw0-ext1. Now the user adds an ACL: to-port, 1001, 'outport ==
> "sw0-ext1"', drop. This would get translated to something like:
> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
> the traffic from sw0-lsp1 to the bridged network? Even if it doesn't
> have an impact because of some subtle detail of the current
> implementation, I would say it is risky and could lead to problems under
> certain conditions, because the conversion in ovn-controller widens the
> original intent. Doing it in northd, only for specific lflows, would
> ensure it affects only the intended use cases.
>
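
The widening described above can be reproduced with a small stand-alone
model of the v5 lookup_port_cb() logic. This is only a sketch: the struct,
the port table, the tunnel keys, and the helper names are all hypothetical
stand-ins for the southbound Port_Binding rows, not OVN code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical stand-in for a southbound Port_Binding row. */
struct port {
    const char *name;
    const char *type;              /* "" for a regular VIF port. */
    const char *requested_chassis; /* Only meaningful for "external". */
    unsigned int tunnel_key;
};

/* Made-up example data: one regular, one localnet, one external port. */
static const struct port ports[] = {
    { "sw0-lsp1",  "",         NULL,  1 },
    { "ln-public", "localnet", NULL,  2 },
    { "sw0-ext1",  "external", "hv1", 3 },
};
#define N_PORTS (sizeof ports / sizeof ports[0])

static const struct port *
find_port_by_name(const char *name)
{
    for (size_t i = 0; i < N_PORTS; i++) {
        if (!strcmp(ports[i].name, name)) {
            return &ports[i];
        }
    }
    return NULL;
}

static const struct port *
find_port_by_type(const char *type)
{
    for (size_t i = 0; i < N_PORTS; i++) {
        if (!strcmp(ports[i].type, type)) {
            return &ports[i];
        }
    }
    return NULL;
}

/* Follows the shape of the v5 lookup_port_cb() change: on the requested
 * chassis, an external port name resolves to the localnet port's key. */
static bool
lookup_port(const char *chassis, const char *port_name, unsigned int *portp)
{
    const struct port *pb = find_port_by_name(port_name);
    if (!pb) {
        return false;
    }
    if (strcmp(pb->type, "external")) {
        *portp = pb->tunnel_key;
        return true;
    }
    if (pb->requested_chassis && !strcmp(pb->requested_chassis, chassis)) {
        const struct port *ln = find_port_by_type("localnet");
        if (ln) {
            *portp = ln->tunnel_key;
            return true;
        }
    }
    return false;
}
```

On hv1, looking up "sw0-ext1" and "ln-public" yields the same key, so a
match written against the external port cannot be told apart from one
written against the localnet port. On hv2 the external port does not
resolve at all.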


Thanks for the detailed explanation. I agree. It's clear to me now. I will
update accordingly in v6.

Regards
Numan
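
For reference, a minimal sketch of the northd-side match agreed on here,
applied to the ARP drop flow from the patch. It uses plain snprintf()
instead of OVS's ds_put_format() so that it stands alone. The helper name
and its parameters are hypothetical, and the actual v6 code may differ.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper, not part of OVN. Formats the match for dropping an
 * ARP request to a router IP when the external port is not resident on
 * this chassis. The inport is the localnet port; the external port only
 * appears inside is_chassis_resident(). Both *_json_key arguments are
 * assumed to already carry their surrounding double quotes.
 *
 * Returns false if the match did not fit in 'buf'. */
static bool
format_external_arp_drop_match(char *buf, size_t size,
                               const char *localnet_json_key,
                               const char *external_json_key,
                               const char *eth_src, const char *router_ip)
{
    int n = snprintf(buf, size,
                     "inport == %s && eth.src == %s"
                     " && !is_chassis_resident(%s)"
                     " && arp.tpa == %s && arp.op == 1",
                     localnet_json_key, eth_src,
                     external_json_key, router_ip);
    return n >= 0 && (size_t) n < size;
}
```

Called with the localnet key "ln-public" and the external key "sw0-ext1"
(both passed with their quotes), the resulting match names the external
port only in the residency check, so nothing in ovn-controller has to
rewrite port references.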


> >
> > The test case added checks that the OF flows are applied only on the
> bound chassis.
> >
> > I think it is better to do it in ovn-controller instead of ovn-northd.
> Please let me know
> > if you still have any concerns.
> >
> >
> >
> >> 2. A less important reason is that it is better to do it at an earlier
> >> stage than a later one. northd handles common processing. This part of
> >> the logic is common to all chassis, so it would be better to handle it
> >> explicitly in northd instead of letting every chassis process it. And
> >> the change in northd would likely be simpler than in ovn-controller.
>
> This is a less critical problem, but I think it is worth considering,
> too. With the current logic, although the conversion takes effect
> only if "is_chassis_resident()" is true, the code logic and
> processing still has to happen on every chassis.
>
> >>
> >> Thanks,
> >> Han
>
Numan Siddique Jan. 21, 2019, 3:01 p.m. UTC | #10
Hi Han,

I have addressed your comments. But before posting the patch, I wanted to
get an opinion on HA support for these external ports.

The proposed patch doesn't support HA. If the requested chassis goes down
for some reason, the CMS is expected to detect it and change the
requested-chassis option to another suitable chassis.

The OpenStack OVN folks think this would be too much for the CMS to handle
and would complicate the code in networking-ovn, which I agree with.

I am thinking of adding HA support along the lines of the gateway chassis
support, and I want to submit this patch after adding it. I think this
would be better, as we won't keep adding options to OVN (first
requested-chassis for external ports and then, later, HA chassis support).
Thoughts?

Thanks
Numan


On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <nusiddiq@redhat.com> wrote:

>
>
> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com wrote:
>
>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com>
>> wrote:
>> >
>> >
>> >
>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >>
>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >> >
>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com>
>> wrote:
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com>
>> wrote:
>> >> > >>
>> >> > >> Hi Numan,
>> >> > >>
>> >> > >> With v5 the new test case "external logical port" fails.
>> >> > >> And please see more comments inlined.
>> >> > >>
>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>> >> > >> >
>> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
>> >> > >> >
>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
>> >> > >> > for these VMs. When these VMs sends DHCPv4, DHPCv6 or IPv6
>> >> > >> > Router Solicitation requests, the local ovn-controller
>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
>> >> > >> > service needs to be run to serve these requests.
>> >> > >> >
>> >> > >> > With the new logical port type - 'external', OVN itself can
>> >> > >> > handle these requests avoiding the need to deploy any
>> >> > >> > external services like neutron dhcp agent.
>> >> > >> >
>> >> > >> > To make use of this feature, CMS has to
>> >> > >> >  - create a logical port for such VMs
>> >> > >> >  - set the type to 'external'
>> >> > >> >  - set requested-chassis="<chassis-name>" in the options
>> >> > >> >    column.
>> >> > >> >  - create a localnet port for the logical switch
>> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
>> >> > >> >
>> >> > >> > When the ovn-controller running in that 'chassis', detects
>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>> >> > >> > flows. Since the packet enters the logical switch pipeline
>> >> > >> > via the localnet port, the inport register (reg14) is set
>> >> > >> > to the tunnel key of localnet port in the match conditions.
>> >> > >> >
>> >> > >> > In case the chassis goes down for some reason, it is the
>> >> > >> > responsibility of CMS to change the 'requested-chassis'
>> >> > >> > option to some other active chassis, so that it can serve
>> >> > >> > these requests.
>> >> > >> >
>> >> > >> > When the VM with the external port, sends an ARP request for
>> >> > >> > the router ips, only the chassis which has claimed the port,
>> >> > >> > will reply to the ARP requests. Rest of the chassis on
>> >> > >> > receiving these packets drop them in the ingress switch
>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
>> >> > >> > before S_SWITCH_IN_L2_LKUP.
>> >> > >> >
>> >> > >> > This would guarantee that only the chassis which has claimed
>> >> > >> > the external ports will run the router datapath pipeline.
>> >> > >> >
>> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> >> > >> > ---
>> >> > >> >
>> >> > >> > v4 -> v5
>> >> > >> > ------
>> >> > >> >   * Addressed review comments from Han Zhou.
>> >> > >> >
>> >> > >> > v3 -> v4
>> >> > >> > ------
>> >> > >> >   * Updated the documention as per Han Zhou's suggestion.
>> >> > >> >
>> >> > >> > v2 -> v3
>> >> > >> > -------
>> >> > >> >   * Rebased
>> >> > >> >
>> >> > >> >  ovn/controller/binding.c        |  12 +
>> >> > >> >  ovn/controller/lflow.c          |  41 ++-
>> >> > >> >  ovn/controller/lflow.h          |   2 +
>> >> > >> >  ovn/controller/lport.c          |  26 ++
>> >> > >> >  ovn/controller/lport.h          |   5 +
>> >> > >> >  ovn/controller/ovn-controller.c |   6 +
>> >> > >> >  ovn/lib/ovn-util.c              |   1 +
>> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
>> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
>> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
>> >> > >> >  tests/ovn.at                    | 530
>> +++++++++++++++++++++++++++++++-
>> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
>> >> > >> >
>> >> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> >> > >> > index 021ecddcf..64e605b92 100644
>> >> > >> > --- a/ovn/controller/binding.c
>> >> > >> > +++ b/ovn/controller/binding.c
>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct
>> ovsdb_idl_txn *ovnsb_idl_txn,
>> >> > >> >           * for them. */
>> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
>> >> > >> >          our_chassis = false;
>> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
>> >> > >> > +        const char *chassis_id =
>> smap_get(&binding_rec->options,
>> >> > >> > +                                          "requested-chassis");
>> >> > >> > +        our_chassis = chassis_id && (
>> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
>> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
>> >> > >> > +        if (our_chassis) {
>> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>> >> > >> > +                               sbrec_port_binding_by_datapath,
>> >> > >> > +                               sbrec_port_binding_by_name,
>> >> > >> > +                               binding_rec->datapath, true,
>> local_datapaths);
>> >> > >> > +        }
>> >> > >> >      }
>> >> > >> >
>> >> > >> >      if (our_chassis
>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>> >> > >> > index 8db81927e..98e8ed3b9 100644
>> >> > >> > --- a/ovn/controller/lflow.c
>> >> > >> > +++ b/ovn/controller/lflow.c
>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
>> >> > >> >  struct lookup_port_aux {
>> >> > >> >      struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath;
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>> >> > >> >      const struct sbrec_datapath_binding *dp;
>> >> > >> > +    const struct sbrec_chassis *chassis;
>> >> > >> >  };
>> >> > >> >
>> >> > >> >  struct condition_aux {
>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >> > >> >      struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath,
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >> > >> >      const struct sbrec_logical_flow *,
>> >> > >> >      const struct hmap *local_datapaths,
>> >> > >> >      const struct sbrec_chassis *,
>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char
>> *port_name, unsigned int *portp)
>> >> > >> >      const struct sbrec_port_binding *pb
>> >> > >> >          =
>> lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>> >> > >> >      if (pb && pb->datapath == aux->dp) {
>> >> > >> > -        *portp = pb->tunnel_key;
>> >> > >> > -        return true;
>> >> > >> > +        if (strcmp(pb->type, "external")) {
>> >> > >> > +            *portp = pb->tunnel_key;
>> >> > >> > +            return true;
>> >> > >> > +        }
>> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
>> >> > >> > +                                          "requested-chassis");
>> >> > >> > +        if (chassis_id && (!strcmp(chassis_id,
>> aux->chassis->name) ||
>> >> > >> > +                           !strcmp(chassis_id,
>> aux->chassis->hostname))) {
>> >> > >> > +            const struct sbrec_port_binding *localnet_pb
>> >> > >> > +                =
>> lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>> >> > >> > +
>>  aux->sbrec_port_binding_by_type,
>> >> > >> > +                                       aux->dp->tunnel_key,
>> "localnet");
>> >> > >> > +            if (localnet_pb) {
>> >> > >> > +                *portp = localnet_pb->tunnel_key;
>> >> > >> > +                return true;
>> >> > >> > +            }
>> >> > >> > +        }
>> >> > >> > +        return false;
>> >> > >> >      }
>> >> > >> >
>> >> > >> >      const struct sbrec_multicast_group *mg =
>> mcgroup_lookup_by_dp_name(
>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >> > >> >      struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath,
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>> >> > >> >      const struct sbrec_dhcpv6_options_table
>> *dhcpv6_options_table,
>> >> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
>> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
>> >> > >> >
>> sbrec_multicast_group_by_name_datapath,
>> >> > >> >                                sbrec_port_binding_by_name,
>> >> > >> > +                              sbrec_port_binding_by_type,
>> >> > >> > +                              sbrec_datapath_binding_by_key,
>> >> > >> >                                lflow, local_datapaths,
>> >> > >> >                                chassis, &dhcp_opts,
>> &dhcpv6_opts, &nd_ra_opts,
>> >> > >> >                                addr_sets, port_groups,
>> active_tunnels,
>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >> > >> >      struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath,
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >> > >> >      const struct sbrec_logical_flow *lflow,
>> >> > >> >      const struct hmap *local_datapaths,
>> >> > >> >      const struct sbrec_chassis *chassis,
>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
>> >> > >> >          .sbrec_multicast_group_by_name_datapath
>> >> > >> >              = sbrec_multicast_group_by_name_datapath,
>> >> > >> >          .sbrec_port_binding_by_name =
>> sbrec_port_binding_by_name,
>> >> > >> > -        .dp = lflow->logical_datapath
>> >> > >> > +        .sbrec_port_binding_by_type =
>> sbrec_port_binding_by_type,
>> >> > >> > +        .sbrec_datapath_binding_by_key =
>> sbrec_datapath_binding_by_key,
>> >> > >> > +        .dp = lflow->logical_datapath,
>> >> > >> > +        .chassis = chassis
>> >> > >> >      };
>> >> > >> >      struct condition_aux cond_aux = {
>> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>> >> > >> > @@ -463,6 +493,8 @@ void
>> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >> > >> >            struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath,
>> >> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >> > >> > +          struct ovsdb_idl_index
>> *sbrec_datapath_binding_by_key,
>> >> > >> >            const struct sbrec_dhcp_options_table
>> *dhcp_options_table,
>> >> > >> >            const struct sbrec_dhcpv6_options_table
>> *dhcpv6_options_table,
>> >> > >> >            const struct sbrec_logical_flow_table
>> *logical_flow_table,
>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>> >> > >> >
>> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
>> >> > >> >                        sbrec_multicast_group_by_name_datapath,
>> >> > >> > -                      sbrec_port_binding_by_name,
>> dhcp_options_table,
>> >> > >> > +                      sbrec_port_binding_by_name,
>> sbrec_port_binding_by_type,
>> >> > >> > +                      sbrec_datapath_binding_by_key,
>> dhcp_options_table,
>> >> > >> >                        dhcpv6_options_table, logical_flow_table,
>> >> > >> >                        local_datapaths, chassis, addr_sets,
>> port_groups,
>> >> > >> >                        active_tunnels, local_lport_ids,
>> flow_table, group_table,
>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>> >> > >> > index d19338140..b2911e0eb 100644
>> >> > >> > --- a/ovn/controller/lflow.h
>> >> > >> > +++ b/ovn/controller/lflow.h
>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
>> >> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >> > >> >                 struct ovsdb_idl_index
>> *sbrec_multicast_group_by_name_datapath,
>> >> > >> >                 struct ovsdb_idl_index
>> *sbrec_port_binding_by_name,
>> >> > >> > +               struct ovsdb_idl_index
>> *sbrec_port_binding_by_type,
>> >> > >> > +               struct ovsdb_idl_index
>> *sbrec_datapath_binding_by_key,
>> >> > >> >                 const struct sbrec_dhcp_options_table *,
>> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
>> >> > >> >                 const struct sbrec_logical_flow_table *,
>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>> >> > >> > index cc5c5fbb2..9c827d9b0 100644
>> >> > >> > --- a/ovn/controller/lport.c
>> >> > >> > +++ b/ovn/controller/lport.c
>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
>> *sbrec_datapath_binding_by_key,
>> >> > >> >      return retval;
>> >> > >> >  }
>> >> > >> >
>> >> > >> > +const struct sbrec_port_binding *
>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index
>> *sbrec_datapath_binding_by_key,
>> >> > >> > +                     struct ovsdb_idl_index
>> *sbrec_port_binding_by_type,
>> >> > >> > +                     uint64_t dp_key, const char *port_type)
>> >> > >> > +{
>> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
>> >> > >> > +    const struct sbrec_datapath_binding *db =
>> datapath_lookup_by_key(
>> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
>> >> > >> > +    if (!db) {
>> >> > >> > +        return NULL;
>> >> > >> > +    }
>> >> > >> > +
>> >> > >> > +    /* Build key for an indexed lookup. */
>> >> > >> > +    struct sbrec_port_binding *pb =
>> sbrec_port_binding_index_init_row(
>> >> > >> > +            sbrec_port_binding_by_type);
>> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
>> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
>> >> > >> > +
>> >> > >> > +    const struct sbrec_port_binding *retval =
>> sbrec_port_binding_index_find(
>> >> > >> > +            sbrec_port_binding_by_type, pb);
>> >> > >> > +
>> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
>> >> > >> > +
>> >> > >> > +    return retval;
>> >> > >> > +}
>> >> > >> > +
>> >> > >> >  const struct sbrec_datapath_binding *
>> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index
>> *sbrec_datapath_binding_by_key,
>> >> > >> >                         uint64_t dp_key)
>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>> >> > >> > index 7dcd5bee0..2d49792f6 100644
>> >> > >> > --- a/ovn/controller/lport.h
>> >> > >> > +++ b/ovn/controller/lport.h
>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding
>> *lport_lookup_by_key(
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>> >> > >> >      uint64_t dp_key, uint64_t port_key);
>> >> > >> >
>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >> > >> > +    uint64_t dp_key, const char *port_type);
>> >> > >> > +
>> >> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> uint64_t dp_key);
>> >> > >> >
>> >> > >> > diff --git a/ovn/controller/ovn-controller.c
>> b/ovn/controller/ovn-controller.c
>> >> > >> > index 4e9a5865f..5aab9142f 100644
>> >> > >> > --- a/ovn/controller/ovn-controller.c
>> >> > >> > +++ b/ovn/controller/ovn-controller.c
>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl
>> *ovnsb_idl,
>> >> > >> >       * ports that have a Gateway_Chassis that point's to our
>> own
>> >> > >> >       * chassis */
>> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
>> "chassisredirect");
>> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
>> "external");
>> >> > >> >      if (chassis) {
>> >> > >> >          /* This should be mostly redundant with the other
>> clauses for port
>> >> > >> >           * bindings, but it allows us to catch any ports that
>> are assigned to
>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >> > >> >
>> &sbrec_port_binding_col_datapath);
>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >> > >> > +
>> &sbrec_port_binding_col_type);
>> >> > >>
>> >> > >> This index is used with two columns: datapath_binding and type,
>> so it
>> >> > >> should be created with both columns using create2.
>> >> > >>
>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >> > >> >
>> &sbrec_datapath_binding_col_tunnel_key);
>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>> >> > >> >                              sbrec_chassis_by_name,
>> >> > >> >
>> sbrec_multicast_group_by_name_datapath,
>> >> > >> >                              sbrec_port_binding_by_name,
>> >> > >> > +                            sbrec_port_binding_by_type,
>> >> > >> > +                            sbrec_datapath_binding_by_key,
>> >> > >> >
>> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>> >> > >> >
>> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>> >> > >> >
>> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>> >> > >> > index aa03919bb..a9d4b8736 100644
>> >> > >> > --- a/ovn/lib/ovn-util.c
>> >> > >> > +++ b/ovn/lib/ovn-util.c
>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>> >> > >> >      "localport",
>> >> > >> >      "router",
>> >> > >> >      "vtep",
>> >> > >> > +    "external",
>> >> > >> >  };
>> >> > >> >
>> >> > >> >  bool
>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml
>> b/ovn/northd/ovn-northd.8.xml
>> >> > >> > index 392a5efc9..c8883d60d 100644
>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
>> >> > >> >      <p>
>> >> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet
>> from the
>> >> > >> >        logical ports configured with IPv4 address(es) and
>> DHCPv4 options,
>> >> > >> > -      and similarly for DHCPv6 options.
>> >> > >> > +      and similarly for DHCPv6 options. This table also adds
>> flows for the
>> >> > >> > +      logical ports of type <code>external</code>.
>> >> > >> >      </p>
>> >> > >> >
>> >> > >> >      <ul>
>> >> > >> > @@ -827,7 +828,39 @@ output;
>> >> > >> >        </li>
>> >> > >> >      </ul>
>> >> > >> >
>> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>> >> > >> > +    <h3>Ingress table 16 External ports</h3>
>> >> > >> > +
>> >> > >> > +    <p>
>> >> > >> > +      Traffic from the <code>external</code> logical ports
>> enter the ingress
>> >> > >> > +      datapath pipeline via the <code>localnet</code> port.
>> This table adds the
>> >> > >> > +      below logical flows to handle the traffic from these
>> ports.
>> >> > >> > +    </p>
>> >> > >> > +
>> >> > >> > +    <ul>
>> >> > >> > +      <li>
>> >> > >> > +        <p>
>> >> > >> > +          A priority-100 flow is added for each
>> <code>external</code> logical
>> >> > >> > +          port which doesn't reside on a chassis to drop the
>> ARP/IPv6 NS
>> >> > >> > +          request to the router IP(s) (of the logical switch)
>> which matches
>> >> > >> > +          on the <code>inport</code> of the
>> <code>external</code> logical port
>> >> > >> > +          and the valid <code>eth.src</code> address(es) of the
>> >> > >> > +          <code>external</code> logical port.
>> >> > >> > +        </p>
>> >> > >> > +
>> >> > >> > +        <p>
>> >> > >> > +          This flow guarantees that the ARP/NS request to the
>> router IP
>> >> > >> > +          address from the external ports is responded by only
>> the chassis
>> >> > >> > +          which has claimed these external ports. All the
>> other chassis,
>> >> > >> > +          drops these packets.
>> >> > >> > +        </p>
>> >> > >> > +      </li>
>> >> > >> > +
>> >> > >> > +      <li>
>> >> > >> > +        A priority-0 flow that matches all packets to advances
>> to table 17.
>> >> > >> > +      </li>
>> >> > >> > +    </ul>
>> >> > >> > +
>> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>> >> > >> >
>> >> > >> >      <p>
>> >> > >> >        This table implements switching behavior.  It contains
>> these logical
>> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> >> > >> > index 3fd8a8757..87208c6c1 100644
>> >> > >> > --- a/ovn/northd/ovn-northd.c
>> >> > >> > +++ b/ovn/northd/ovn-northd.c
>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
>> "ls_in_dhcp_response") \
>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14,
>> "ls_in_dns_lookup")    \
>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
>> "ls_in_dns_response")  \
>> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16,
>> "ls_in_l2_lkup")       \
>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
>> "ls_in_external_port") \
>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17,
>> "ls_in_l2_lkup")       \
>> >> > >> >
>>             \
>> >> > >> >      /* Logical switch egress stages. */
>>            \
>> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0,
>> "ls_out_pre_lb")         \
>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct
>> nbrec_logical_switch_port *lsp)
>> >> > >> >      return !lsp->up || *lsp->up;
>> >> > >> >  }
>> >> > >> >
>> >> > >> > +static bool
>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>> >> > >> > +{
>> >> > >> > +    return !strcmp(nbsp->type, "external");
>> >> > >> > +}
>> >> > >> > +
>> >> > >> >  static bool
>> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>> >> > >> >                      struct ds *options_action, struct ds
>> *response_action,
>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap
>> *datapaths, struct hmap *ports,
>> >> > >> >           *  - port type is localport
>> >> > >> >           */
>> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type,
>> "router") &&
>> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
>> >> > >> > +            strcmp(op->nbsp->type, "localport") &&
>> lsp_is_external(op->nbsp)) {
>> >> > >>
>> >> > >> Sorry that I missed this in the last review. The && condition has a
>> >> > >> problem: it will cause ARP responder flows to be added for all lports
>> >> > >> that are not external. I think it should be || here.
>> >> > >
>> >> > >
>> >> > > Agree. To make it easier to read, I will add a new "if" with a
>> >> > > "continue" below this one for external port types.
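A hedged illustration of the point above (not code from the patch): the condition guards a "continue" that skips ARP responder flows for a port. With the extra "&&", the skip fires only for ports that are down and external, so down non-external ports are no longer skipped. With "||", external ports are always skipped and the earlier behaviour is kept for everything else. The `struct port` type and the function names are hypothetical stand-ins for `op->nbsp` and the inline check.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the fields of op->nbsp used by the check. */
struct port {
    bool up;
    const char *type;
};

static bool
is_external(const struct port *p)
{
    assert(p && p->type);
    return !strcmp(p->type, "external");
}

/* Pre-patch condition: skip ports that are down, unless they are of
 * type "router" or "localport". */
static bool
skip_original(const struct port *p)
{
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport");
}

/* v5 as posted: the extra "&&" narrows the skip to down external ports. */
static bool
skip_v5(const struct port *p)
{
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport") && is_external(p);
}

/* Reviewer's suggestion: skip external ports in addition to the
 * original cases. */
static bool
skip_suggested(const struct port *p)
{
    return skip_original(p) || is_external(p);
}
```

For a down port of the default type, `skip_original()` and `skip_suggested()` agree while `skip_v5()` does not, which is the regression described in the review comment.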
>> >> > >
>> >> > >
>> >> > >>
>> >> > >>
>> >> > >> >              continue;
>> >> > >> >          }
>> >> > >> >
>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap
>> *datapaths, struct hmap *ports,
>> >> > >> >              continue;
>> >> > >> >          }
>> >> > >> >
>> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
>> >> > >> > +        if (is_external && !op->od->localnet_port) {
>> >> > >> > +            /* If it's an external port and there is no
>> localnet port
>> >> > >> > +             * ignore it. */
>> >> > >> > +            continue;
>> >> > >> > +        }
>> >> > >> > +
>> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >> > >> >              for (size_t j = 0; j <
>> op->lsp_addrs[i].n_ipv4_addrs; j++) {
>> >> > >> >                  struct ds options_action =
>> DS_EMPTY_INITIALIZER;
>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap
>> *datapaths, struct hmap *ports,
>> >> > >> >                      ds_put_format(
>> >> > >> >                          &match, "inport == %s && eth.src == %s
>> && "
>> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
>> 255.255.255.255 && "
>> >> > >> > -                        "udp.src == 68 && udp.dst == 67",
>> op->json_key,
>> >> > >> > -                        op->lsp_addrs[i].ea_s);
>> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
>> >> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>> >> > >>
>> >> > >> No change here?
>> >> > >
>> >> > >
>> >> > > I think it's an unwanted and unrelated change. I will correct it.
>> >> > >>
>> >> > >> >
>> >> > >> >                      ovn_lflow_add(lflows, op->od,
>> S_SWITCH_IN_DHCP_OPTIONS,
>> >> > >> >                                    100, ds_cstr(&match),
>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap
>> *datapaths, struct hmap *ports,
>> >> > >> >      /* Ingress table 12 and 13: DHCP options and response, by
>> default goto
>> >> > >> >       * next. (priority 0).
>> >> > >> >       * Ingress table 14 and 15: DNS lookup and response, by
>> default goto next.
>> >> > >> > -     * (priority 0).*/
>> >> > >> > +     * (priority 0).
>> >> > >> > +     * Ingress table 16 - External port handling, by default
>> goto next.
>> >> > >> > +     * (priority 0). */
>> >> > >> >
>> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>> >> > >> >          if (!od->nbs) {
>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap
>> *datapaths, struct hmap *ports,
>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE,
>> 0, "1", "next;");
>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0,
>> "1", "next;");
>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0,
>> "1", "next;");
>> >> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT,
>> 0, "1", "next;");
>> >> > >> >      }
>> >> > >> >
>> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and
>> multicast handling
>> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
>> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>> >> > >> > +           continue;
>> >> > >> > +        }
>> >> > >> > +
>> >> > >> > +        /* Table 16: External port. Drop ARP request for
>> router ips from
>> >> > >> > +         * external ports  on chassis not binding those ports.
>> >> > >> > +         * This makes the router pipeline to be run only on
>> the chassis
>> >> > >> > +         * binding the external ports. */
>> >> > >> > +
>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports;
>> j++) {
>> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>> >> > >> > +                    for (size_t l = 0; l <
>> rp->lsp_addrs[k].n_ipv4_addrs;
>> >> > >> > +                         l++) {
>> >> > >> > +                        ds_clear(&match);
>> >> > >> > +                        ds_put_cstr(&match, "ip4");
>> >> > >> > +                        ds_put_format(
>> >> > >> > +                            &match, "inport == %s && eth.src
>> == %s"
>> >> > >> > +                            " && !is_chassis_resident(%s)"
>> >> > >> > +                            " && arp.tpa == %s && arp.op == 1",
>> >> > >> > +                            op->json_key,
>> op->lsp_addrs[i].ea_s, op->json_key,
>> >> > >>
>> >> > >> I believe the inport should match the localnet port's json_key
>> >> > >> here, since it is coming from a localnet port.
>> >> > >
>> >> > >
>> >> > > Both would work. If you look at the code in lflow.c in this patch,
>> >> > > it gets the tunnel key of the localnet port if the port_binding
>> >> > > type is "external".
>> >> > >
>> >> > > That's how even the DHCP requests are handled. ovn-controller will
>> >> > > translate the logical flows with the action "put_dhcp_opts" only on
>> >> > > the chassis claiming the external ports.
>> >> >
>> >> > Oh, yes you are right. Actually I read that part in v4 and it somehow
>> >> > slipped my mind. Thanks for explaining.
>> >>
>> >> I thought about it a second time, and I'd suggest doing the conversion
>> >> here in northd instead of ovn-controller, for two reasons:
>> >>
>> >> 1. In ovn-controller there is no extra context, so it just blindly
>> >> translates all references to the external logical port into the localnet
>> >> port key. This could lead to unexpected behavior. For example, if someone
>> >> uses an external logical port in an ACL match condition, the match
>> >> condition would then apply to all packets to/from the localnet port,
>> >> which is definitely unwanted. (At the same time, it would be better to
>> >> document that features like port-security and ACLs should not be used for
>> >> external logical ports.)
>> >>
>> >
>> > That's not how it works in the present patch. Let's say you have 2
>> > chassis, hv1 and hv2, an external port sw0-ext1 and a localnet port
>> > "ln-public". If requested-chassis is set to hv1, then all the logical
>> > flows with the match "inport == sw0-ext1" will be converted to OF flows
>> > only on hv1, as this port is bound by hv1 and the function
>> > 'lookup_port_cb()' would return true only on hv1. On hv2,
>> > lookup_port_cb() would return false.
>>
>> Yes, this is well understood.
>>
>> >
>> > If we want to do the conversion in ovn-northd.c, the match condition
>> > would have to be
>> > "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ...."
>> > instead of the present one, "inport == sw0-ext1 && ...".
>>
>> Yes, this is what I would suggest (see reason below).
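A minimal sketch of how such a northd-side match string could be assembled, assuming the port names are already known. `build_external_match()` is a hypothetical helper written with plain `snprintf()`; it is not the `ds_put_format()` code used in ovn-northd.c, and the quoting of the port names follows the usual logical flow string syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a match that names the localnet port as
 * the inport and restricts the flow to the chassis where the external
 * port resides.  Returns the value returned by snprintf(). */
static int
build_external_match(char *buf, size_t size, const char *localnet_port,
                     const char *external_port)
{
    assert(buf && localnet_port && external_port);
    return snprintf(buf, size,
                    "inport == \"%s\" && is_chassis_resident(\"%s\")",
                    localnet_port, external_port);
}
```

Called with "ln-public" and "sw0-ext1", the helper produces the first two terms of the match quoted above; any further terms (eth.src, arp.tpa and so on) would be appended by the caller.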
>>
>> >
>> > And the ACL match condition would not be an issue for the reason
>> > mentioned above, i.e. the ACL flows will be applied only on the chassis
>> > binding the external port.
>>
>> Here is the concern. For example, chassis A has a regular port sw0-lsp1
>> bound. Chassis A is also set as requested-chassis for external port
>> sw0-ext1. Now the user adds an ACL: to-port, 1001, 'outport ==
>> "sw0-ext1"', drop. This would get translated to something like:
>> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
>> the traffic from sw0-lsp1 to the bridged network? Even if it doesn't
>> have an impact because of some subtle reasons in the current
>> implementation, I would say it is risky and could lead to problems under
>> certain conditions, because the conversion in ovn-controller widens the
>> original intent. Doing it in northd only for specific lflows would
>> ensure it has an impact only for the intended use cases.
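The widening described above can be sketched with a simplified model of the v5 lookup. Everything here is an assumption made for illustration: `struct pb`, the `ports` table and `lookup_port()` are hypothetical and only mimic the behaviour of `lookup_port_cb()` as discussed in this thread (the real code also compares against the chassis hostname and uses IDL indexes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified Port_Binding row; not the real sbrec type. */
struct pb {
    const char *name;
    const char *type;               /* "", "localnet" or "external". */
    const char *requested_chassis;  /* Only meaningful for "external". */
    unsigned int tunnel_key;
};

static const struct pb ports[] = {
    { "sw0-lsp1",  "",         NULL,  1 },
    { "ln-public", "localnet", NULL,  2 },
    { "sw0-ext1",  "external", "hv1", 3 },
};
#define N_PORTS (sizeof ports / sizeof ports[0])

/* Mimics the v5 lookup: an external port resolves to the localnet
 * port's key, and only on its requested chassis. */
static bool
lookup_port(const char *chassis, const char *name, unsigned int *keyp)
{
    assert(chassis && name && keyp);
    for (size_t i = 0; i < N_PORTS; i++) {
        if (strcmp(ports[i].name, name)) {
            continue;
        }
        if (strcmp(ports[i].type, "external")) {
            *keyp = ports[i].tunnel_key;
            return true;
        }
        if (strcmp(ports[i].requested_chassis, chassis)) {
            return false;
        }
        for (size_t j = 0; j < N_PORTS; j++) {
            if (!strcmp(ports[j].type, "localnet")) {
                *keyp = ports[j].tunnel_key;
                return true;
            }
        }
        return false;
    }
    return false;
}
```

On "hv1" both "sw0-ext1" and "ln-public" resolve to the same key in this model, so a flow written against the external port cannot be told apart from one written against the localnet port. That is the loss of intent the review comment is pointing at.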
>>
>
>
> Thanks for the detailed explanation. I agree. It's clear to me now. I will
> update accordingly in v6.
>
> Regards
> Numan
>
>
>> >
>> > The test case added checks that the OF flows are applied only on the
>> > bound chassis.
>> >
>> > I think it is better to do it in ovn-controller instead of ovn-northd.
>> > Please let me know if you still have any concerns.
>> >
>> >
>> >
>> >> 2. A less important reason is that it is better to do it at an earlier
>> >> stage than a later one. northd handles common processing. This part of
>> >> the logic is common to all chassis, so it would be better to handle it
>> >> explicitly in northd instead of letting every chassis process it. And
>> >> the change in northd would likely be simpler than in ovn-controller.
>>
>> This is a less critical problem, but I think it is worth considering,
>> too. With the current logic, although the conversion would take effect
>> only if "is_chassis_resident()" is true, the code logic and
>> processing have to happen on every chassis.
>>
>> >>
>> >> Thanks,
>> >> Han
>>
>
Miguel Angel Ajo Jan. 21, 2019, 3:05 p.m. UTC | #11
On Mon, Jan 21, 2019 at 4:02 PM Numan Siddique <nusiddiq@redhat.com> wrote:

>
> Hi Han,
>
> I have addressed your comments. But before posting the patch I wanted to
> get an opinion on the HA support for these external ports.
>
> The proposed patch doesn't support HA. If the requested chassis goes down
> for some reason, it is expected that the CMS would detect it and change
> the requested-chassis option to another suitable chassis.
>
> The OpenStack OVN folks think this would be too much for the CMS to handle
> and that it would complicate the code in networking-ovn, which I agree with.
>
>
It's not only the complexity. If we implement this in the CMS, then every
CMS using OVN will need to replicate that behaviour.

In my opinion, that's a good reason why it's better to handle HA within OVN
itself.


> I am thinking of adding HA support along the lines of the gateway chassis
> support, and I want to submit this patch after adding it. I think this
> would be better, as we won't add more options to OVN (first
> requested-chassis for external ports and then HA chassis support later).
> Thoughts?
>
> Thanks
> Numan
>
>
> On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <nusiddiq@redhat.com>
> wrote:
>
>>
>>
>> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>>
>>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com>
>>> wrote:
>>> >
>>> >
>>> >
>>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
>>> >>
>>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>>> >> >
>>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <
>>> nusiddiq@redhat.com> wrote:
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com>
>>> wrote:
>>> >> > >>
>>> >> > >> Hi Numan,
>>> >> > >>
>>> >> > >> With v5 the new test case "external logical port" fails.
>>> >> > >> And please see more comments inlined.
>>> >> > >>
>>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>>> >> > >> >
>>> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
>>> >> > >> >
>>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
>>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
>>> >> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
>>> >> > >> > Router Solicitation requests, the local ovn-controller
>>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
>>> >> > >> > service needs to be run to serve these requests.
>>> >> > >> >
>>> >> > >> > With the new logical port type - 'external', OVN itself can
>>> >> > >> > handle these requests avoiding the need to deploy any
>>> >> > >> > external services like neutron dhcp agent.
>>> >> > >> >
>>> >> > >> > To make use of this feature, CMS has to
>>> >> > >> >  - create a logical port for such VMs
>>> >> > >> >  - set the type to 'external'
>>> >> > >> >  - set requested-chassis="<chassis-name>" in the options
>>> >> > >> >    column.
>>> >> > >> >  - create a localnet port for the logical switch
>>> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
>>> >> > >> >
>>> >> > >> > When the ovn-controller running in that 'chassis', detects
>>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>>> >> > >> > flows. Since the packet enters the logical switch pipeline
>>> >> > >> > via the localnet port, the inport register (reg14) is set
>>> >> > >> > to the tunnel key of localnet port in the match conditions.
>>> >> > >> >
>>> >> > >> > In case the chassis goes down for some reason, it is the
>>> >> > >> > responsibility of CMS to change the 'requested-chassis'
>>> >> > >> > option to some other active chassis, so that it can serve
>>> >> > >> > these requests.
>>> >> > >> >
>>> >> > >> > When the VM with the external port, sends an ARP request for
>>> >> > >> > the router ips, only the chassis which has claimed the port,
>>> >> > >> > will reply to the ARP requests. Rest of the chassis on
>>> >> > >> > receiving these packets drop them in the ingress switch
>>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
>>> >> > >> > before S_SWITCH_IN_L2_LKUP.
>>> >> > >> >
>>> >> > >> > This would guarantee that only the chassis which has claimed
>>> >> > >> > the external ports will run the router datapath pipeline.
>>> >> > >> >
>>> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>>> >> > >> > ---
>>> >> > >> >
>>> >> > >> > v4 -> v5
>>> >> > >> > ------
>>> >> > >> >   * Addressed review comments from Han Zhou.
>>> >> > >> >
>>> >> > >> > v3 -> v4
>>> >> > >> > ------
>>> >> > >> >   * Updated the documentation as per Han Zhou's suggestion.
>>> >> > >> >
>>> >> > >> > v2 -> v3
>>> >> > >> > -------
>>> >> > >> >   * Rebased
>>> >> > >> >
>>> >> > >> >  ovn/controller/binding.c        |  12 +
>>> >> > >> >  ovn/controller/lflow.c          |  41 ++-
>>> >> > >> >  ovn/controller/lflow.h          |   2 +
>>> >> > >> >  ovn/controller/lport.c          |  26 ++
>>> >> > >> >  ovn/controller/lport.h          |   5 +
>>> >> > >> >  ovn/controller/ovn-controller.c |   6 +
>>> >> > >> >  ovn/lib/ovn-util.c              |   1 +
>>> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>>> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
>>> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
>>> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
>>> >> > >> >  tests/ovn.at                    | 530
>>> +++++++++++++++++++++++++++++++-
>>> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
>>> >> > >> >
>>> >> > >> > diff --git a/ovn/controller/binding.c
>>> b/ovn/controller/binding.c
>>> >> > >> > index 021ecddcf..64e605b92 100644
>>> >> > >> > --- a/ovn/controller/binding.c
>>> >> > >> > +++ b/ovn/controller/binding.c
>>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct
>>> ovsdb_idl_txn *ovnsb_idl_txn,
>>> >> > >> >           * for them. */
>>> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
>>> >> > >> >          our_chassis = false;
>>> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
>>> >> > >> > +        const char *chassis_id =
>>> smap_get(&binding_rec->options,
>>> >> > >> > +
>>> "requested-chassis");
>>> >> > >> > +        our_chassis = chassis_id && (
>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
>>> >> > >> > +        if (our_chassis) {
>>> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>>> >> > >> > +                               sbrec_port_binding_by_datapath,
>>> >> > >> > +                               sbrec_port_binding_by_name,
>>> >> > >> > +                               binding_rec->datapath, true,
>>> local_datapaths);
>>> >> > >> > +        }
>>> >> > >> >      }
>>> >> > >> >
>>> >> > >> >      if (our_chassis
>>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>>> >> > >> > index 8db81927e..98e8ed3b9 100644
>>> >> > >> > --- a/ovn/controller/lflow.c
>>> >> > >> > +++ b/ovn/controller/lflow.c
>>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
>>> >> > >> >  struct lookup_port_aux {
>>> >> > >> >      struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath;
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>>> >> > >> >      const struct sbrec_datapath_binding *dp;
>>> >> > >> > +    const struct sbrec_chassis *chassis;
>>> >> > >> >  };
>>> >> > >> >
>>> >> > >> >  struct condition_aux {
>>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>> >> > >> >      struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath,
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>> >> > >> >      const struct sbrec_logical_flow *,
>>> >> > >> >      const struct hmap *local_datapaths,
>>> >> > >> >      const struct sbrec_chassis *,
>>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char
>>> *port_name, unsigned int *portp)
>>> >> > >> >      const struct sbrec_port_binding *pb
>>> >> > >> >          =
>>> lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>>> >> > >> >      if (pb && pb->datapath == aux->dp) {
>>> >> > >> > -        *portp = pb->tunnel_key;
>>> >> > >> > -        return true;
>>> >> > >> > +        if (strcmp(pb->type, "external")) {
>>> >> > >> > +            *portp = pb->tunnel_key;
>>> >> > >> > +            return true;
>>> >> > >> > +        }
>>> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
>>> >> > >> > +
>>> "requested-chassis");
>>> >> > >> > +        if (chassis_id && (!strcmp(chassis_id,
>>> aux->chassis->name) ||
>>> >> > >> > +                           !strcmp(chassis_id,
>>> aux->chassis->hostname))) {
>>> >> > >> > +            const struct sbrec_port_binding *localnet_pb
>>> >> > >> > +                =
>>> lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>>> >> > >> > +
>>>  aux->sbrec_port_binding_by_type,
>>> >> > >> > +                                       aux->dp->tunnel_key,
>>> "localnet");
>>> >> > >> > +            if (localnet_pb) {
>>> >> > >> > +                *portp = localnet_pb->tunnel_key;
>>> >> > >> > +                return true;
>>> >> > >> > +            }
>>> >> > >> > +        }
>>> >> > >> > +        return false;
>>> >> > >> >      }
>>> >> > >> >
>>> >> > >> >      const struct sbrec_multicast_group *mg =
>>> mcgroup_lookup_by_dp_name(
>>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>> >> > >> >      struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath,
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>> >> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>>> >> > >> >      const struct sbrec_dhcpv6_options_table
>>> *dhcpv6_options_table,
>>> >> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
>>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
>>> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
>>> >> > >> >
>>> sbrec_multicast_group_by_name_datapath,
>>> >> > >> >                                sbrec_port_binding_by_name,
>>> >> > >> > +                              sbrec_port_binding_by_type,
>>> >> > >> > +                              sbrec_datapath_binding_by_key,
>>> >> > >> >                                lflow, local_datapaths,
>>> >> > >> >                                chassis, &dhcp_opts,
>>> &dhcpv6_opts, &nd_ra_opts,
>>> >> > >> >                                addr_sets, port_groups,
>>> active_tunnels,
>>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>> >> > >> >      struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath,
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>> >> > >> >      const struct sbrec_logical_flow *lflow,
>>> >> > >> >      const struct hmap *local_datapaths,
>>> >> > >> >      const struct sbrec_chassis *chassis,
>>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
>>> >> > >> >          .sbrec_multicast_group_by_name_datapath
>>> >> > >> >              = sbrec_multicast_group_by_name_datapath,
>>> >> > >> >          .sbrec_port_binding_by_name =
>>> sbrec_port_binding_by_name,
>>> >> > >> > -        .dp = lflow->logical_datapath
>>> >> > >> > +        .sbrec_port_binding_by_type =
>>> sbrec_port_binding_by_type,
>>> >> > >> > +        .sbrec_datapath_binding_by_key =
>>> sbrec_datapath_binding_by_key,
>>> >> > >> > +        .dp = lflow->logical_datapath,
>>> >> > >> > +        .chassis = chassis
>>> >> > >> >      };
>>> >> > >> >      struct condition_aux cond_aux = {
>>> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>>> >> > >> > @@ -463,6 +493,8 @@ void
>>> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>>> >> > >> >            struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath,
>>> >> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>> >> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>> >> > >> > +          struct ovsdb_idl_index
>>> *sbrec_datapath_binding_by_key,
>>> >> > >> >            const struct sbrec_dhcp_options_table
>>> *dhcp_options_table,
>>> >> > >> >            const struct sbrec_dhcpv6_options_table
>>> *dhcpv6_options_table,
>>> >> > >> >            const struct sbrec_logical_flow_table
>>> *logical_flow_table,
>>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>> >> > >> >
>>> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
>>> >> > >> >                        sbrec_multicast_group_by_name_datapath,
>>> >> > >> > -                      sbrec_port_binding_by_name,
>>> dhcp_options_table,
>>> >> > >> > +                      sbrec_port_binding_by_name,
>>> sbrec_port_binding_by_type,
>>> >> > >> > +                      sbrec_datapath_binding_by_key,
>>> dhcp_options_table,
>>> >> > >> >                        dhcpv6_options_table,
>>> logical_flow_table,
>>> >> > >> >                        local_datapaths, chassis, addr_sets,
>>> port_groups,
>>> >> > >> >                        active_tunnels, local_lport_ids,
>>> flow_table, group_table,
>>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>>> >> > >> > index d19338140..b2911e0eb 100644
>>> >> > >> > --- a/ovn/controller/lflow.h
>>> >> > >> > +++ b/ovn/controller/lflow.h
>>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
>>> >> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>>> >> > >> >                 struct ovsdb_idl_index
>>> *sbrec_multicast_group_by_name_datapath,
>>> >> > >> >                 struct ovsdb_idl_index
>>> *sbrec_port_binding_by_name,
>>> >> > >> > +               struct ovsdb_idl_index
>>> *sbrec_port_binding_by_type,
>>> >> > >> > +               struct ovsdb_idl_index
>>> *sbrec_datapath_binding_by_key,
>>> >> > >> >                 const struct sbrec_dhcp_options_table *,
>>> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
>>> >> > >> >                 const struct sbrec_logical_flow_table *,
>>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>>> >> > >> > index cc5c5fbb2..9c827d9b0 100644
>>> >> > >> > --- a/ovn/controller/lport.c
>>> >> > >> > +++ b/ovn/controller/lport.c
>>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index
>>> *sbrec_datapath_binding_by_key,
>>> >> > >> >      return retval;
>>> >> > >> >  }
>>> >> > >> >
>>> >> > >> > +const struct sbrec_port_binding *
>>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index
>>> *sbrec_datapath_binding_by_key,
>>> >> > >> > +                     struct ovsdb_idl_index
>>> *sbrec_port_binding_by_type,
>>> >> > >> > +                     uint64_t dp_key, const char *port_type)
>>> >> > >> > +{
>>> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
>>> >> > >> > +    const struct sbrec_datapath_binding *db =
>>> datapath_lookup_by_key(
>>> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
>>> >> > >> > +    if (!db) {
>>> >> > >> > +        return NULL;
>>> >> > >> > +    }
>>> >> > >> > +
>>> >> > >> > +    /* Build key for an indexed lookup. */
>>> >> > >> > +    struct sbrec_port_binding *pb =
>>> sbrec_port_binding_index_init_row(
>>> >> > >> > +            sbrec_port_binding_by_type);
>>> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
>>> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
>>> >> > >> > +
>>> >> > >> > +    const struct sbrec_port_binding *retval =
>>> sbrec_port_binding_index_find(
>>> >> > >> > +            sbrec_port_binding_by_type, pb);
>>> >> > >> > +
>>> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
>>> >> > >> > +
>>> >> > >> > +    return retval;
>>> >> > >> > +}
>>> >> > >> > +
>>> >> > >> >  const struct sbrec_datapath_binding *
>>> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index
>>> *sbrec_datapath_binding_by_key,
>>> >> > >> >                         uint64_t dp_key)
>>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>>> >> > >> > index 7dcd5bee0..2d49792f6 100644
>>> >> > >> > --- a/ovn/controller/lport.h
>>> >> > >> > +++ b/ovn/controller/lport.h
>>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding
>>> *lport_lookup_by_key(
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>>> >> > >> >      uint64_t dp_key, uint64_t port_key);
>>> >> > >> >
>>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>> >> > >> > +    uint64_t dp_key, const char *port_type);
>>> >> > >> > +
>>> >> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>> uint64_t dp_key);
>>> >> > >> >
>>> >> > >> > diff --git a/ovn/controller/ovn-controller.c
>>> b/ovn/controller/ovn-controller.c
>>> >> > >> > index 4e9a5865f..5aab9142f 100644
>>> >> > >> > --- a/ovn/controller/ovn-controller.c
>>> >> > >> > +++ b/ovn/controller/ovn-controller.c
>>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl
>>> *ovnsb_idl,
>>> >> > >> >       * ports that have a Gateway_Chassis that point's to our
>>> own
>>> >> > >> >       * chassis */
>>> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
>>> "chassisredirect");
>>> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
>>> "external");
>>> >> > >> >      if (chassis) {
>>> >> > >> >          /* This should be mostly redundant with the other
>>> clauses for port
>>> >> > >> >           * bindings, but it allows us to catch any ports that
>>> are assigned to
>>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>> >> > >> >
>>> &sbrec_port_binding_col_datapath);
>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>>> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>> >> > >> > +
>>> &sbrec_port_binding_col_type);
>>> >> > >>
>>> >> > >> This index is used with two columns: datapath_binding and type,
>>> so it
>>> >> > >> should be created with both columns using create2.
>>> >> > >>
>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>> >> > >> >
>>> &sbrec_datapath_binding_col_tunnel_key);
>>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>>> >> > >> >                              sbrec_chassis_by_name,
>>> >> > >> >
>>> sbrec_multicast_group_by_name_datapath,
>>> >> > >> >                              sbrec_port_binding_by_name,
>>> >> > >> > +                            sbrec_port_binding_by_type,
>>> >> > >> > +                            sbrec_datapath_binding_by_key,
>>> >> > >> >
>>> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>>> >> > >> >
>>> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>>> >> > >> >
>>> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>>> >> > >> > index aa03919bb..a9d4b8736 100644
>>> >> > >> > --- a/ovn/lib/ovn-util.c
>>> >> > >> > +++ b/ovn/lib/ovn-util.c
>>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>>> >> > >> >      "localport",
>>> >> > >> >      "router",
>>> >> > >> >      "vtep",
>>> >> > >> > +    "external",
>>> >> > >> >  };
>>> >> > >> >
>>> >> > >> >  bool
>>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml
>>> b/ovn/northd/ovn-northd.8.xml
>>> >> > >> > index 392a5efc9..c8883d60d 100644
>>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
>>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
>>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
>>> >> > >> >      <p>
>>> >> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet
>>> from the
>>> >> > >> >        logical ports configured with IPv4 address(es) and
>>> DHCPv4 options,
>>> >> > >> > -      and similarly for DHCPv6 options.
>>> >> > >> > +      and similarly for DHCPv6 options. This table also adds
>>> flows for the
>>> >> > >> > +      logical ports of type <code>external</code>.
>>> >> > >> >      </p>
>>> >> > >> >
>>> >> > >> >      <ul>
>>> >> > >> > @@ -827,7 +828,39 @@ output;
>>> >> > >> >        </li>
>>> >> > >> >      </ul>
>>> >> > >> >
>>> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>>> >> > >> > +    <h3>Ingress table 16 External ports</h3>
>>> >> > >> > +
>>> >> > >> > +    <p>
>>> >> > >> > +      Traffic from the <code>external</code> logical ports
>>> enter the ingress
>>> >> > >> > +      datapath pipeline via the <code>localnet</code> port.
>>> This table adds the
>>> >> > >> > +      below logical flows to handle the traffic from these
>>> ports.
>>> >> > >> > +    </p>
>>> >> > >> > +
>>> >> > >> > +    <ul>
>>> >> > >> > +      <li>
>>> >> > >> > +        <p>
>>> >> > >> > +          A priority-100 flow is added for each
>>> <code>external</code> logical
>>> >> > >> > +          port which doesn't reside on a chassis to drop the
>>> ARP/IPv6 NS
>>> >> > >> > +          request to the router IP(s) (of the logical switch)
>>> which matches
>>> >> > >> > +          on the <code>inport</code> of the
>>> <code>external</code> logical port
>>> >> > >> > +          and the valid <code>eth.src</code> address(es) of
>>> the
>>> >> > >> > +          <code>external</code> logical port.
>>> >> > >> > +        </p>
>>> >> > >> > +
>>> >> > >> > +        <p>
>>> >> > >> > +          This flow guarantees that the ARP/NS request to the
>>> router IP
>>> >> > >> > +          address from the external ports is responded by
>>> only the chassis
>>> >> > >> > +          which has claimed these external ports. All the
>>> other chassis,
>>> >> > >> > +          drops these packets.
>>> >> > >> > +        </p>
>>> >> > >> > +      </li>
>>> >> > >> > +
>>> >> > >> > +      <li>
>>> >> > >> > +        A priority-0 flow that matches all packets to
>>> advances to table 17.
>>> >> > >> > +      </li>
>>> >> > >> > +    </ul>
>>> >> > >> > +
>>> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>>> >> > >> >
>>> >> > >> >      <p>
>>> >> > >> >        This table implements switching behavior.  It contains
>>> these logical
>>> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>> >> > >> > index 3fd8a8757..87208c6c1 100644
>>> >> > >> > --- a/ovn/northd/ovn-northd.c
>>> >> > >> > +++ b/ovn/northd/ovn-northd.c
>>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
>>> "ls_in_dhcp_response") \
>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14,
>>> "ls_in_dns_lookup")    \
>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
>>> "ls_in_dns_response")  \
>>> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16,
>>> "ls_in_l2_lkup")       \
>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
>>> "ls_in_external_port") \
>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17,
>>> "ls_in_l2_lkup")       \
>>> >> > >> >
>>>             \
>>> >> > >> >      /* Logical switch egress stages. */
>>>              \
>>> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0,
>>> "ls_out_pre_lb")         \
>>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct
>>> nbrec_logical_switch_port *lsp)
>>> >> > >> >      return !lsp->up || *lsp->up;
>>> >> > >> >  }
>>> >> > >> >
>>> >> > >> > +static bool
>>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>>> >> > >> > +{
>>> >> > >> > +    return !strcmp(nbsp->type, "external");
>>> >> > >> > +}
>>> >> > >> > +
>>> >> > >> >  static bool
>>> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>>> >> > >> >                      struct ds *options_action, struct ds
>>> *response_action,
>>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap
>>> *datapaths, struct hmap *ports,
>>> >> > >> >           *  - port type is localport
>>> >> > >> >           */
>>> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type,
>>> "router") &&
>>> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
>>> >> > >> > +            strcmp(op->nbsp->type, "localport") &&
>>> lsp_is_external(op->nbsp)) {
>>> >> > >>
>>> >> > >> Sorry that I missed this in the last review. The && condition has a
>>> >> > >> problem. It will cause ARP responder flows to be added for all lports
>>> >> > >> that are not external. I think it should be || here.
>>> >> > >
>>> >> > >
>>> >> > > Agree. To make it easier to read, I will add a new "if" with
>>> continue - below this one for
>>> >> > > external port types.
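
As a standalone illustration of the condition discussed above, the sketch
below compares the v5 check with the || variant suggested in the review. It
is not the ovn-northd code: skip_arp_responder_v5(),
skip_arp_responder_fixed() and type_is_external() are hypothetical helpers
that model a port only by its "up" state and its type string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for lsp_is_external(), working on a bare type
 * string instead of a Logical_Switch_Port row. */
static bool
type_is_external(const char *type)
{
    assert(type != NULL);
    return !strcmp(type, "external");
}

/* v5 condition: a port is skipped only when it is down, is neither a
 * router port nor a localport, AND is external.  Down ports that are not
 * external are therefore no longer skipped. */
static bool
skip_arp_responder_v5(bool up, const char *type)
{
    return !up && strcmp(type, "router") && strcmp(type, "localport")
           && type_is_external(type);
}

/* Suggested condition: keep the original "down" check and additionally
 * skip every external port, whatever its state. */
static bool
skip_arp_responder_fixed(bool up, const char *type)
{
    return (!up && strcmp(type, "router") && strcmp(type, "localport"))
           || type_is_external(type);
}
```

With these helpers, a down VIF port (empty type) is skipped only by the
fixed variant, and an external port that is up is skipped only by the fixed
variant as well. A separate "if (lsp_is_external(...)) continue;" placed
below the original check, as proposed in the reply, should give the same
result as the || form.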
>>> >> > >
>>> >> > >
>>> >> > >>
>>> >> > >>
>>> >> > >> >              continue;
>>> >> > >> >          }
>>> >> > >> >
>>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap
>>> *datapaths, struct hmap *ports,
>>> >> > >> >              continue;
>>> >> > >> >          }
>>> >> > >> >
>>> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
>>> >> > >> > +        if (is_external && !op->od->localnet_port) {
>>> >> > >> > +            /* If it's an external port and there is no
>>> localnet port
>>> >> > >> > +             * ignore it. */
>>> >> > >> > +            continue;
>>> >> > >> > +        }
>>> >> > >> > +
>>> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>> >> > >> >              for (size_t j = 0; j <
>>> op->lsp_addrs[i].n_ipv4_addrs; j++) {
>>> >> > >> >                  struct ds options_action =
>>> DS_EMPTY_INITIALIZER;
>>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap
>>> *datapaths, struct hmap *ports,
>>> >> > >> >                      ds_put_format(
>>> >> > >> >                          &match, "inport == %s && eth.src ==
>>> %s && "
>>> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
>>> 255.255.255.255 && "
>>> >> > >> > -                        "udp.src == 68 && udp.dst == 67",
>>> op->json_key,
>>> >> > >> > -                        op->lsp_addrs[i].ea_s);
>>> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
>>> >> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>>> >> > >>
>>> >> > >> No change here?
>>> >> > >
>>> >> > >
>>> >> > > I think it's an unwanted and unrelated change. I will correct it.
>>> >> > >>
>>> >> > >> >
>>> >> > >> >                      ovn_lflow_add(lflows, op->od,
>>> S_SWITCH_IN_DHCP_OPTIONS,
>>> >> > >> >                                    100, ds_cstr(&match),
>>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap
>>> *datapaths, struct hmap *ports,
>>> >> > >> >      /* Ingress table 12 and 13: DHCP options and response, by
>>> default goto
>>> >> > >> >       * next. (priority 0).
>>> >> > >> >       * Ingress table 14 and 15: DNS lookup and response, by
>>> default goto next.
>>> >> > >> > -     * (priority 0).*/
>>> >> > >> > +     * (priority 0).
>>> >> > >> > +     * Ingress table 16 - External port handling, by default
>>> goto next.
>>> >> > >> > +     * (priority 0). */
>>> >> > >> >
>>> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>>> >> > >> >          if (!od->nbs) {
>>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap
>>> *datapaths, struct hmap *ports,
>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE,
>>> 0, "1", "next;");
>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0,
>>> "1", "next;");
>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE,
>>> 0, "1", "next;");
>>> >> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT,
>>> 0, "1", "next;");
>>> >> > >> >      }
>>> >> > >> >
>>> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and
>>> multicast handling
>>> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
>>> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>>> >> > >> > +           continue;
>>> >> > >> > +        }
>>> >> > >> > +
>>> >> > >> > +        /* Table 16: External port. Drop ARP request for
>>> router ips from
>>> >> > >> > +         * external ports  on chassis not binding those ports.
>>> >> > >> > +         * This makes the router pipeline to be run only on
>>> the chassis
>>> >> > >> > +         * binding the external ports. */
>>> >> > >> > +
>>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports;
>>> j++) {
>>> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>>> >> > >> > +                    for (size_t l = 0; l <
>>> rp->lsp_addrs[k].n_ipv4_addrs;
>>> >> > >> > +                         l++) {
>>> >> > >> > +                        ds_clear(&match);
>>> >> > >> > +                        ds_put_cstr(&match, "ip4");
>>> >> > >> > +                        ds_put_format(
>>> >> > >> > +                            &match, "inport == %s && eth.src
>>> == %s"
>>> >> > >> > +                            " && !is_chassis_resident(%s)"
>>> >> > >> > +                            " && arp.tpa == %s && arp.op ==
>>> 1",
>>> >> > >> > +                            op->json_key,
>>> op->lsp_addrs[i].ea_s, op->json_key,
>>> >> > >>
>>> >> > >> I believe the inport should match the localnet port's json_key
>>> here,
>>> >> > >> since it is coming from a localnet port.
>>> >> > >
>>> >> > >
>>> >> > > Both would work. If you see the code in lflow.c in this patch, it will
>>> >> > > get the tunnel key of the localnet port if the port_binding type is
>>> >> > > "external".
>>> >> > >
>>> >> > > That's how even the DHCP requests are handled. ovn-controller will
>>> >> > > translate the logical flows with action "put_dhcp_opts" only on the
>>> >> > > chassis claiming the external ports.
>>> >> >
>>> >> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
>>> >> > slipped my mind. Thanks for explaining.
>>> >>
>>> >> I thought about it a second time, and I'd suggest doing the conversion
>>> >> here in northd instead of ovn-controller, for two reasons:
>>> >>
>>> >> 1. In ovn-controller there is no extra context, so it just blindly
>>> >> translates all references to an external logical port into the localnet
>>> >> port key. This could lead to unexpected behavior. For example, if someone
>>> >> uses an external logical port in an ACL match condition, the match
>>> >> condition would then apply to all packets to/from the localnet port,
>>> >> which is definitely unwanted. (At the same time, it would be better to
>>> >> document that features like port-security and ACLs should not be used
>>> >> for external logical ports.)
>>> >>
>>> >
>>> > That's not how it works in the present patch. Let's say you have 2
>>> > chassis, hv1 and hv2, an external port sw0-ext1 and a localnet port
>>> > "ln-public". If the requested-chassis is set to hv1, then all the logical
>>> > flows with the match "inport == sw0-ext1" will be converted to OF flows
>>> > only on hv1, as this port is bound by hv1 and the function
>>> > 'lookup_port_cb()' would return true only on hv1. On hv2,
>>> > lookup_port_cb() would return false.
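
The per-chassis behaviour described above can be modelled with a small
standalone sketch. It mirrors the shape of the lookup_port_cb() change in
this patch but is not that code: struct toy_binding and toy_lookup_port()
are hypothetical, and the chassis hostname comparison is left out for
brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a Port_Binding row. */
struct toy_binding {
    const char *type;               /* "" for a VIF, "external", "localnet". */
    const char *requested_chassis;  /* NULL when the option is not set. */
    unsigned int tunnel_key;
};

/* Resolves a port reference on chassis 'this_chassis'.  'localnet' is the
 * localnet port of the same logical switch, or NULL if there is none. */
static bool
toy_lookup_port(const struct toy_binding *pb,
                const struct toy_binding *localnet,
                const char *this_chassis, unsigned int *portp)
{
    assert(pb && this_chassis && portp);
    if (strcmp(pb->type, "external")) {
        /* Regular port: resolves to its own tunnel key on every chassis. */
        *portp = pb->tunnel_key;
        return true;
    }
    if (pb->requested_chassis
        && !strcmp(pb->requested_chassis, this_chassis) && localnet) {
        /* External port on the requested chassis: use the localnet key. */
        *portp = localnet->tunnel_key;
        return true;
    }
    /* External port on any other chassis: the reference does not resolve,
     * so flows that mention it are not installed there. */
    return false;
}
```

For sw0-ext1 with requested-chassis hv1, the lookup succeeds on hv1 and
yields the tunnel key of the localnet port, while on hv2 it fails.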
>>>
>>> Yes, this is well understood.
>>>
>>> >
>>> > If we want to do the conversion in ovn-northd.c, the match condition
>>> > would have to be
>>> > 'inport == "ln-public" && is_chassis_resident(sw0-ext1) && ....'
>>> > instead of the present one, 'inport == sw0-ext1 && ...'.
>>>
>>> Yes, this is what I would suggest (see reason below).
>>>
>>> >
>>> > And the ACL match condition would not be an issue for the reason
>>> > mentioned above, i.e. the ACL flows will be applied only on the chassis
>>> > binding the external port.
>>>
>>> Here is the concern. For example, chassis A has regular port sw0-lsp1
>>> bound. Chassis A is also set as requested-chassis for external port
>>> sw0-ext1. And now the user adds an ACL: to-port, 1001, 'outport ==
>>> "sw0-ext1"', drop. This would get translated to something like:
>>> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
>>> the traffic from sw0-lsp1 to the bridged network? Even if it has no
>>> impact because of some subtle details of the current implementation, I
>>> would say it is risky and could lead to problems under certain
>>> conditions, because the conversion in ovn-controller widens the
>>> original intent. Doing it in northd only for specific lflows, on the
>>> other hand, would ensure it has an impact only on the intended use cases.
>>>
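
A minimal sketch of what building such a match in northd could look like,
using plain snprintf() instead of the "struct ds" helpers so that it stands
alone. build_external_match() and the MAC address used with it are
hypothetical; only the port names and the general match shape come from the
discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds a match on the localnet port that is scoped
 * to the chassis where the external port resides.  Returns the number of
 * characters snprintf() would have written. */
static int
build_external_match(char *buf, size_t len, const char *localnet_port,
                     const char *external_port, const char *eth_src)
{
    assert(buf && len > 0);
    return snprintf(buf, len,
                    "inport == \"%s\" && eth.src == %s"
                    " && is_chassis_resident(\"%s\")",
                    localnet_port, eth_src, external_port);
}
```

Because the localnet port is named explicitly only in the flows that need
it, a user ACL that refers to "sw0-ext1" would not be rewritten to match
"ln-public", which is the widening described above.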
>>
>>
>> Thanks for the detailed explanation. I agree. It's clear to me now. I
>> will update accordingly in v6.
>>
>> Regards
>> Numan
>>
>>
>>> >
>>> > The test case added checks that the OF flows are applied only on the
>>> > bound chassis.
>>> >
>>> > I think it is better to do it in ovn-controller instead of ovn-northd.
>>> > Please let me know if you still have any concerns.
>>> >
>>> >
>>> >
>>> >> 2. A less important reason is that it is better to do it at an earlier
>>> >> stage than a later one. northd handles common processing. This part of
>>> >> the logic is common to all chassis, so it would be better if we
>>> >> explicitly handled it in northd, instead of letting every chassis
>>> >> process it. And the change in northd would likely be simpler than in
>>> >> ovn-controller.
>>>
>>> This is a less critical problem, but I think it is worth consideration,
>>> too. With the current logic, although the conversion would take effect
>>> only if "is_chassis_resident()" is true, the code logic and
>>> processing has to happen on every chassis.
>>>
>>> >>
>>> >> Thanks,
>>> >> Han
>>>
>>
Han Zhou Jan. 24, 2019, 3:44 p.m. UTC | #12
On Mon, Jan 21, 2019 at 7:06 AM Miguel Angel Ajo Pelayo
<majopela@redhat.com> wrote:
>
>
>
> On Mon, Jan 21, 2019 at 4:02 PM Numan Siddique <nusiddiq@redhat.com> wrote:
>>
>>
>> Hi Han,
>>
>> I have addressed your comments. But before posting the patch I wanted to get an opinion
>> on the HA support for these external ports.
>>
>> The proposed patch doesn't support HA. If the requested chassis goes down for some reason
>> it is expected that CMS would detect it and change the requested-chassis option to other
>> suitable chassis.
>>
>> The openstack OVN folks think this would be too much for the CMS to handle and it would
>> complicate the code in networking-ovn which I agree with.
>>
>
> Not only the complexity part. If we implement this from the CMS, then every CMS using ovn
> will need to replicate that behaviour.
>
> That's in my opinion a good reason why it's better to handle HA within OVN itself.
>
>>
>> I am thinking of adding the HA support along the lines of gateway chassis support and I want to
>> submit this patch after adding the HA support. I think this would be better as we won't add
>> more options in OVN (first requested-chassis for external ports and then later HA chassis support).
>> Thoughts?

I thought it would be easier to support outside of OVN, combined with
chassis life-cycle management, but I didn't go deeper into any CMS
implementation. I agree it is better to handle HA in OVN than to
implement it in every CMS. But I am also worried about the
complexity in OVN itself. Could you describe briefly how you would
support it in OVN? For example, how would a chassis failure be detected?
It is different from gateway chassis because the major use case of
external ports is bridged networks (vlan/flat), so I think the BFD
mechanism for tunnel health monitoring may not be a good fit here.

Thanks,
Han
>>
>> Thanks
>> Numan
>>
>>
>> On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>>>
>>>
>>>
>>> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>>>>
>>>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
>>>> >>
>>>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>>>> >> >
>>>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>>>> >> > >
>>>> >> > >
>>>> >> > >
>>>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
>>>> >> > >>
>>>> >> > >> Hi Numan,
>>>> >> > >>
>>>> >> > >> With v5 the new test case "external logical port" fails.
>>>> >> > >> And please see more comments inlined.
>>>> >> > >>
>>>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>>>> >> > >> >
>>>> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
>>>> >> > >> >
>>>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
>>>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
>>>> >> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
>>>> >> > >> > Router Solicitation requests, the local ovn-controller
>>>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
>>>> >> > >> > service needs to be run to serve these requests.
>>>> >> > >> >
>>>> >> > >> > With the new logical port type - 'external', OVN itself can
>>>> >> > >> > handle these requests avoiding the need to deploy any
>>>> >> > >> > external services like neutron dhcp agent.
>>>> >> > >> >
>>>> >> > >> > To make use of this feature, CMS has to
>>>> >> > >> >  - create a logical port for such VMs
>>>> >> > >> >  - set the type to 'external'
>>>> >> > >> >  - set requested-chassis="<chassis-name>" in the options
>>>> >> > >> >    column.
>>>> >> > >> >  - create a localnet port for the logical switch
>>>> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
>>>> >> > >> >
>>>> >> > >> > When the ovn-controller running in that 'chassis', detects
>>>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>>>> >> > >> > flows. Since the packet enters the logical switch pipeline
>>>> >> > >> > via the localnet port, the inport register (reg14) is set
>>>> >> > >> > to the tunnel key of localnet port in the match conditions.
>>>> >> > >> >
>>>> >> > >> > In case the chassis goes down for some reason, it is the
>>>> >> > >> > responsibility of CMS to change the 'requested-chassis'
>>>> >> > >> > option to some other active chassis, so that it can serve
>>>> >> > >> > these requests.
>>>> >> > >> >
>>>> >> > >> > When the VM with the external port, sends an ARP request for
>>>> >> > >> > the router ips, only the chassis which has claimed the port,
>>>> >> > >> > will reply to the ARP requests. Rest of the chassis on
>>>> >> > >> > receiving these packets drop them in the ingress switch
>>>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
>>>> >> > >> > before S_SWITCH_IN_L2_LKUP.
>>>> >> > >> >
>>>> >> > >> > This would guarantee that only the chassis which has claimed
>>>> >> > >> > the external ports will run the router datapath pipeline.
>>>> >> > >> >
>>>> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>>>> >> > >> > ---
>>>> >> > >> >
>>>> >> > >> > v4 -> v5
>>>> >> > >> > ------
>>>> >> > >> >   * Addressed review comments from Han Zhou.
>>>> >> > >> >
>>>> >> > >> > v3 -> v4
>>>> >> > >> > ------
>>>> >> > >> >   * Updated the documentation as per Han Zhou's suggestion.
>>>> >> > >> >
>>>> >> > >> > v2 -> v3
>>>> >> > >> > -------
>>>> >> > >> >   * Rebased
>>>> >> > >> >
>>>> >> > >> >  ovn/controller/binding.c        |  12 +
>>>> >> > >> >  ovn/controller/lflow.c          |  41 ++-
>>>> >> > >> >  ovn/controller/lflow.h          |   2 +
>>>> >> > >> >  ovn/controller/lport.c          |  26 ++
>>>> >> > >> >  ovn/controller/lport.h          |   5 +
>>>> >> > >> >  ovn/controller/ovn-controller.c |   6 +
>>>> >> > >> >  ovn/lib/ovn-util.c              |   1 +
>>>> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>>>> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
>>>> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
>>>> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
>>>> >> > >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
>>>> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
>>>> >> > >> >
>>>> >> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>>>> >> > >> > index 021ecddcf..64e605b92 100644
>>>> >> > >> > --- a/ovn/controller/binding.c
>>>> >> > >> > +++ b/ovn/controller/binding.c
>>>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>>> >> > >> >           * for them. */
>>>> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
>>>> >> > >> >          our_chassis = false;
>>>> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
>>>> >> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
>>>> >> > >> > +                                          "requested-chassis");
>>>> >> > >> > +        our_chassis = chassis_id && (
>>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
>>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
>>>> >> > >> > +        if (our_chassis) {
>>>> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>>>> >> > >> > +                               sbrec_port_binding_by_datapath,
>>>> >> > >> > +                               sbrec_port_binding_by_name,
>>>> >> > >> > +                               binding_rec->datapath, true, local_datapaths);
>>>> >> > >> > +        }
>>>> >> > >> >      }
>>>> >> > >> >
>>>> >> > >> >      if (our_chassis
>>>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>>>> >> > >> > index 8db81927e..98e8ed3b9 100644
>>>> >> > >> > --- a/ovn/controller/lflow.c
>>>> >> > >> > +++ b/ovn/controller/lflow.c
>>>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
>>>> >> > >> >  struct lookup_port_aux {
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>>>> >> > >> >      const struct sbrec_datapath_binding *dp;
>>>> >> > >> > +    const struct sbrec_chassis *chassis;
>>>> >> > >> >  };
>>>> >> > >> >
>>>> >> > >> >  struct condition_aux {
>>>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >      const struct sbrec_logical_flow *,
>>>> >> > >> >      const struct hmap *local_datapaths,
>>>> >> > >> >      const struct sbrec_chassis *,
>>>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
>>>> >> > >> >      const struct sbrec_port_binding *pb
>>>> >> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>>>> >> > >> >      if (pb && pb->datapath == aux->dp) {
>>>> >> > >> > -        *portp = pb->tunnel_key;
>>>> >> > >> > -        return true;
>>>> >> > >> > +        if (strcmp(pb->type, "external")) {
>>>> >> > >> > +            *portp = pb->tunnel_key;
>>>> >> > >> > +            return true;
>>>> >> > >> > +        }
>>>> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
>>>> >> > >> > +                                          "requested-chassis");
>>>> >> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
>>>> >> > >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
>>>> >> > >> > +            const struct sbrec_port_binding *localnet_pb
>>>> >> > >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>>>> >> > >> > +                                       aux->sbrec_port_binding_by_type,
>>>> >> > >> > +                                       aux->dp->tunnel_key, "localnet");
>>>> >> > >> > +            if (localnet_pb) {
>>>> >> > >> > +                *portp = localnet_pb->tunnel_key;
>>>> >> > >> > +                return true;
>>>> >> > >> > +            }
>>>> >> > >> > +        }
>>>> >> > >> > +        return false;
>>>> >> > >> >      }
>>>> >> > >> >
>>>> >> > >> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
>>>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>>>> >> > >> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>>>> >> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
>>>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
>>>> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
>>>> >> > >> >                                sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >                                sbrec_port_binding_by_name,
>>>> >> > >> > +                              sbrec_port_binding_by_type,
>>>> >> > >> > +                              sbrec_datapath_binding_by_key,
>>>> >> > >> >                                lflow, local_datapaths,
>>>> >> > >> >                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
>>>> >> > >> >                                addr_sets, port_groups, active_tunnels,
>>>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >      const struct sbrec_logical_flow *lflow,
>>>> >> > >> >      const struct hmap *local_datapaths,
>>>> >> > >> >      const struct sbrec_chassis *chassis,
>>>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
>>>> >> > >> >          .sbrec_multicast_group_by_name_datapath
>>>> >> > >> >              = sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
>>>> >> > >> > -        .dp = lflow->logical_datapath
>>>> >> > >> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
>>>> >> > >> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
>>>> >> > >> > +        .dp = lflow->logical_datapath,
>>>> >> > >> > +        .chassis = chassis
>>>> >> > >> >      };
>>>> >> > >> >      struct condition_aux cond_aux = {
>>>> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>>>> >> > >> > @@ -463,6 +493,8 @@ void
>>>> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>> >> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
>>>> >> > >> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>>>> >> > >> >            const struct sbrec_logical_flow_table *logical_flow_table,
>>>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >
>>>> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
>>>> >> > >> >                        sbrec_multicast_group_by_name_datapath,
>>>> >> > >> > -                      sbrec_port_binding_by_name, dhcp_options_table,
>>>> >> > >> > +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
>>>> >> > >> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
>>>> >> > >> >                        dhcpv6_options_table, logical_flow_table,
>>>> >> > >> >                        local_datapaths, chassis, addr_sets, port_groups,
>>>> >> > >> >                        active_tunnels, local_lport_ids, flow_table, group_table,
>>>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>>>> >> > >> > index d19338140..b2911e0eb 100644
>>>> >> > >> > --- a/ovn/controller/lflow.h
>>>> >> > >> > +++ b/ovn/controller/lflow.h
>>>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
>>>> >> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>>>> >> > >> >                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>> >> > >> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >                 const struct sbrec_dhcp_options_table *,
>>>> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
>>>> >> > >> >                 const struct sbrec_logical_flow_table *,
>>>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>>>> >> > >> > index cc5c5fbb2..9c827d9b0 100644
>>>> >> > >> > --- a/ovn/controller/lport.c
>>>> >> > >> > +++ b/ovn/controller/lport.c
>>>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >      return retval;
>>>> >> > >> >  }
>>>> >> > >> >
>>>> >> > >> > +const struct sbrec_port_binding *
>>>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +                     uint64_t dp_key, const char *port_type)
>>>> >> > >> > +{
>>>> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
>>>> >> > >> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
>>>> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
>>>> >> > >> > +    if (!db) {
>>>> >> > >> > +        return NULL;
>>>> >> > >> > +    }
>>>> >> > >> > +
>>>> >> > >> > +    /* Build key for an indexed lookup. */
>>>> >> > >> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
>>>> >> > >> > +            sbrec_port_binding_by_type);
>>>> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
>>>> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
>>>> >> > >> > +
>>>> >> > >> > +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
>>>> >> > >> > +            sbrec_port_binding_by_type, pb);
>>>> >> > >> > +
>>>> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
>>>> >> > >> > +
>>>> >> > >> > +    return retval;
>>>> >> > >> > +}
>>>> >> > >> > +
>>>> >> > >> >  const struct sbrec_datapath_binding *
>>>> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> >                         uint64_t dp_key)
>>>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>>>> >> > >> > index 7dcd5bee0..2d49792f6 100644
>>>> >> > >> > --- a/ovn/controller/lport.h
>>>> >> > >> > +++ b/ovn/controller/lport.h
>>>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>>>> >> > >> >      uint64_t dp_key, uint64_t port_key);
>>>> >> > >> >
>>>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>>>> >> > >> > +    uint64_t dp_key, const char *port_type);
>>>> >> > >> > +
>>>> >> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
>>>> >> > >> >
>>>> >> > >> > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
>>>> >> > >> > index 4e9a5865f..5aab9142f 100644
>>>> >> > >> > --- a/ovn/controller/ovn-controller.c
>>>> >> > >> > +++ b/ovn/controller/ovn-controller.c
>>>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>>>> >> > >> >       * ports that have a Gateway_Chassis that point's to our own
>>>> >> > >> >       * chassis */
>>>> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
>>>> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
>>>> >> > >> >      if (chassis) {
>>>> >> > >> >          /* This should be mostly redundant with the other clauses for port
>>>> >> > >> >           * bindings, but it allows us to catch any ports that are assigned to
>>>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>>> >> > >> >                                    &sbrec_port_binding_col_datapath);
>>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>>>> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>>> >> > >> > +                                  &sbrec_port_binding_col_type);
>>>> >> > >>
>>>> >> > >> This index is used with two columns: datapath_binding and type, so it
>>>> >> > >> should be created with both columns using create2.
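The point about a two-column index can be illustrated with a small standalone sketch. It does not use the OVSDB IDL; `pb_row`, `cmp_type`, `cmp_dp_type`, `index_find` and `example_rows` are hypothetical names invented for this illustration. The assumption is that a lookup through an index only compares the columns the index was created with, so an index on the type column alone presumably cannot tell the localnet port of one datapath from that of another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a Port_Binding row; not the real sbrec struct. */
struct pb_row {
    uint64_t dp_key;    /* Tunnel key of the owning datapath. */
    const char *type;   /* Port type, e.g. "localnet". */
    const char *name;   /* Logical port name. */
};

typedef int (*row_cmp)(const struct pb_row *, const struct pb_row *);

/* Compares on the type column only (models a one-column index). */
static int
cmp_type(const struct pb_row *a, const struct pb_row *b)
{
    return strcmp(a->type, b->type);
}

/* Compares on datapath first, then type (models a two-column index). */
static int
cmp_dp_type(const struct pb_row *a, const struct pb_row *b)
{
    if (a->dp_key != b->dp_key) {
        return a->dp_key < b->dp_key ? -1 : 1;
    }
    return strcmp(a->type, b->type);
}

/* Returns the first row that compares equal to 'target', or NULL. */
static const struct pb_row *
index_find(const struct pb_row *rows, size_t n,
           const struct pb_row *target, row_cmp cmp)
{
    assert(rows && target && cmp);
    for (size_t i = 0; i < n; i++) {
        if (!cmp(&rows[i], target)) {
            return &rows[i];
        }
    }
    return NULL;
}

/* Two datapaths, each with its own localnet port. */
static const struct pb_row example_rows[] = {
    { 1, "localnet", "ln-dp1" },
    { 2, "external", "sw0-ext1" },
    { 2, "localnet", "ln-dp2" },
};
```

Looking up the localnet port of datapath 2 with `cmp_type` returns `ln-dp1`, which belongs to datapath 1; with `cmp_dp_type` the same lookup returns `ln-dp2`.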
>>>> >> > >>
>>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>>> >> > >> >                                    &sbrec_datapath_binding_col_tunnel_key);
>>>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>>>> >> > >> >                              sbrec_chassis_by_name,
>>>> >> > >> >                              sbrec_multicast_group_by_name_datapath,
>>>> >> > >> >                              sbrec_port_binding_by_name,
>>>> >> > >> > +                            sbrec_port_binding_by_type,
>>>> >> > >> > +                            sbrec_datapath_binding_by_key,
>>>> >> > >> >                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>>>> >> > >> >                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>>>> >> > >> >                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>>>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>>>> >> > >> > index aa03919bb..a9d4b8736 100644
>>>> >> > >> > --- a/ovn/lib/ovn-util.c
>>>> >> > >> > +++ b/ovn/lib/ovn-util.c
>>>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>>>> >> > >> >      "localport",
>>>> >> > >> >      "router",
>>>> >> > >> >      "vtep",
>>>> >> > >> > +    "external",
>>>> >> > >> >  };
>>>> >> > >> >
>>>> >> > >> >  bool
>>>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>>>> >> > >> > index 392a5efc9..c8883d60d 100644
>>>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
>>>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
>>>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
>>>> >> > >> >      <p>
>>>> >> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
>>>> >> > >> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
>>>> >> > >> > -      and similarly for DHCPv6 options.
>>>> >> > >> > +      and similarly for DHCPv6 options. This table also adds flows for the
>>>> >> > >> > +      logical ports of type <code>external</code>.
>>>> >> > >> >      </p>
>>>> >> > >> >
>>>> >> > >> >      <ul>
>>>> >> > >> > @@ -827,7 +828,39 @@ output;
>>>> >> > >> >        </li>
>>>> >> > >> >      </ul>
>>>> >> > >> >
>>>> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>>>> >> > >> > +    <h3>Ingress table 16 External ports</h3>
>>>> >> > >> > +
>>>> >> > >> > +    <p>
>>>> >> > >> > +      Traffic from the <code>external</code> logical ports enters the ingress
>>>> >> > >> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
>>>> >> > >> > +      logical flows below to handle the traffic from these ports.
>>>> >> > >> > +    </p>
>>>> >> > >> > +
>>>> >> > >> > +    <ul>
>>>> >> > >> > +      <li>
>>>> >> > >> > +        <p>
>>>> >> > >> > +          A priority-100 flow is added for each <code>external</code> logical
>>>> >> > >> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
>>>> >> > >> > +          request to the router IP(s) (of the logical switch) which matches
>>>> >> > >> > +          on the <code>inport</code> of the <code>external</code> logical port
>>>> >> > >> > +          and the valid <code>eth.src</code> address(es) of the
>>>> >> > >> > +          <code>external</code> logical port.
>>>> >> > >> > +        </p>
>>>> >> > >> > +
>>>> >> > >> > +        <p>
>>>> >> > >> > +          This flow guarantees that the ARP/NS request to the router IP
>>>> >> > >> > +          address from the external ports is answered only by the chassis
>>>> >> > >> > +          which has claimed these external ports. All the other chassis
>>>> >> > >> > +          drop these packets.
>>>> >> > >> > +        </p>
>>>> >> > >> > +      </li>
>>>> >> > >> > +
>>>> >> > >> > +      <li>
>>>> >> > >> > +        A priority-0 flow that matches all packets and advances to table 17.
>>>> >> > >> > +      </li>
>>>> >> > >> > +    </ul>
>>>> >> > >> > +
>>>> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>>>> >> > >> >
>>>> >> > >> >      <p>
>>>> >> > >> >        This table implements switching behavior.  It contains these logical
>>>> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>> >> > >> > index 3fd8a8757..87208c6c1 100644
>>>> >> > >> > --- a/ovn/northd/ovn-northd.c
>>>> >> > >> > +++ b/ovn/northd/ovn-northd.c
>>>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
>>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
>>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
>>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
>>>> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
>>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
>>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
>>>> >> > >> >                                                                            \
>>>> >> > >> >      /* Logical switch egress stages. */                                   \
>>>> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
>>>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
>>>> >> > >> >      return !lsp->up || *lsp->up;
>>>> >> > >> >  }
>>>> >> > >> >
>>>> >> > >> > +static bool
>>>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>>>> >> > >> > +{
>>>> >> > >> > +    return !strcmp(nbsp->type, "external");
>>>> >> > >> > +}
>>>> >> > >> > +
>>>> >> > >> >  static bool
>>>> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>>>> >> > >> >                      struct ds *options_action, struct ds *response_action,
>>>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>>> >> > >> >           *  - port type is localport
>>>> >> > >> >           */
>>>> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
>>>> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
>>>> >> > >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
>>>> >> > >>
>>>> >> > >> Sorry that I missed this in the last review. The && condition has a problem.
>>>> >> > >> It will cause ARP responder flows to be added for all lports that are not
>>>> >> > >> external. I think it should be || here.
>>>> >> > >
>>>> >> > >
>>>> >> > > Agree. To make it easier to read, I will add a new "if" with continue - below this one for
>>>> >> > > external port types.
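A standalone sketch of the two conditions discussed above may help. It is not the ovn-northd code: `lsp_sketch` and both function names are hypothetical, and the second function only models the suggested fix (a separate check for external ports), not the final patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the logical switch port fields used here. */
struct lsp_sketch {
    const char *type;   /* "" for a regular VIF port. */
    bool up;
};

/* Condition as posted in v5: every term must hold for the port to be
 * skipped, so only ports that are down AND external are skipped. */
static bool
skip_arp_responder_v5(const struct lsp_sketch *p)
{
    assert(p && p->type);
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport") && !strcmp(p->type, "external");
}

/* Suggested behavior: keep the original "down" check and skip external
 * ports with a separate test. */
static bool
skip_arp_responder_split(const struct lsp_sketch *p)
{
    assert(p && p->type);
    if (!p->up && strcmp(p->type, "router")
        && strcmp(p->type, "localport")) {
        return true;
    }
    return !strcmp(p->type, "external");
}
```

With the v5 condition, a regular port that is down is no longer skipped and an external port that is up is not skipped either; the split form skips both.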
>>>> >> > >
>>>> >> > >
>>>> >> > >>
>>>> >> > >>
>>>> >> > >> >              continue;
>>>> >> > >> >          }
>>>> >> > >> >
>>>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>>> >> > >> >              continue;
>>>> >> > >> >          }
>>>> >> > >> >
>>>> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
>>>> >> > >> > +        if (is_external && !op->od->localnet_port) {
>>>> >> > >> > +            /* If it's an external port and there is no localnet port
>>>> >> > >> > +             * ignore it. */
>>>> >> > >> > +            continue;
>>>> >> > >> > +        }
>>>> >> > >> > +
>>>> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>>> >> > >> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>>>> >> > >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
>>>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>>> >> > >> >                      ds_put_format(
>>>> >> > >> >                          &match, "inport == %s && eth.src == %s && "
>>>> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>>>> >> > >> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
>>>> >> > >> > -                        op->lsp_addrs[i].ea_s);
>>>> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
>>>> >> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>>>> >> > >>
>>>> >> > >> No change here?
>>>> >> > >
>>>> >> > >
>>>> >> > > I think it's an unwanted and unrelated change. I will correct it.
>>>> >> > >>
>>>> >> > >> >
>>>> >> > >> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
>>>> >> > >> >                                    100, ds_cstr(&match),
>>>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>>> >> > >> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
>>>> >> > >> >       * next. (priority 0).
>>>> >> > >> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
>>>> >> > >> > -     * (priority 0).*/
>>>> >> > >> > +     * (priority 0).
>>>> >> > >> > +     * Ingress table 16 - External port handling, by default goto next.
>>>> >> > >> > +     * (priority 0). */
>>>> >> > >> >
>>>> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>> >> > >> >          if (!od->nbs) {
>>>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
>>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
>>>> >> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
>>>> >> > >> >      }
>>>> >> > >> >
>>>> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
>>>> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
>>>> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>>>> >> > >> > +           continue;
>>>> >> > >> > +        }
>>>> >> > >> > +
>>>> >> > >> > +        /* Table 16: External port. Drop ARP request for router ips from
>>>> >> > >> > +         * external ports  on chassis not binding those ports.
>>>> >> > >> > +         * This makes the router pipeline to be run only on the chassis
>>>> >> > >> > +         * binding the external ports. */
>>>> >> > >> > +
>>>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
>>>> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>>>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>>>> >> > >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
>>>> >> > >> > +                         l++) {
>>>> >> > >> > +                        ds_clear(&match);
>>>> >> > >> > +                        ds_put_cstr(&match, "ip4");
>>>> >> > >> > +                        ds_put_format(
>>>> >> > >> > +                            &match, "inport == %s && eth.src == %s"
>>>> >> > >> > +                            " && !is_chassis_resident(%s)"
>>>> >> > >> > +                            " && arp.tpa == %s && arp.op == 1",
>>>> >> > >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>>>> >> > >>
>>>> >> > >> I believe the inport should match the localnet port's json_key here,
>>>> >> > >> since it is coming from a localnet port.
>>>> >> > >
>>>> >> > >
>>>> >> > > Both would work. If you see the code in lflow.c in this patch - it will get the tunnel
>>>> >> > > key of the localnet port if the port_binding type is "external".
>>>> >> > >
>>>> >> > > That's how even the DHCP requests are handled. ovn-controller will translate
>>>> >> > > the logical flows with action "put_dhcp_opts" only on the chassis claiming the
>>>> >> > > external ports.
>>>> >> >
>>>> >> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
>>>> >> > slipped my mind. Thanks for explaining.
>>>> >>
>>>> >> I thought about it a second time, and I'd suggest doing the conversion here
>>>> >> in northd instead of ovn-controller, for two reasons:
>>>> >>
>>>> >> 1. In ovn-controller there is no extra context, so it just blindly
>>>> >> translates all references to the external logical port into the localnet
>>>> >> port key. This could lead to unexpected behavior, for example if someone
>>>> >> uses an external logical port in an ACL match condition. The match condition
>>>> >> would then apply to all packets to/from the localnet port, which is
>>>> >> definitely unwanted. (At the same time, it would be better to document
>>>> >> that features like port-security and ACLs should not be used for external
>>>> >> logical ports.)
>>>> >>
>>>> >
>>>> > That's not how it works in the present patch. Let's say you have 2 chassis,
>>>> > hv1 and hv2, an external port sw0-ext1 and a localnet port "ln-public".
>>>> > If the requested-chassis is set to hv1, then all the logical flows with the
>>>> > match "inport == sw0-ext1" will be converted to OF flows only on hv1, as this port
>>>> > is bound by hv1 and the function 'lookup_port_cb()' would return true only
>>>> > on hv1. On hv2, lookup_port_cb() would return false.
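The per-chassis behavior described above can be modelled in a few lines of standalone C. This is only a sketch of the lookup logic in the v5 patch, under the assumption that matching on the chassis name alone is enough for illustration; `port_sketch`, `sw0_ports`, `find_by_name`, `find_by_type` and `resolve_port` are hypothetical names, and the tunnel keys are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Port_Binding row on logical switch sw0. */
struct port_sketch {
    const char *name;
    const char *type;               /* "" for a regular VIF port. */
    const char *requested_chassis;  /* NULL when the option is unset. */
    unsigned int tunnel_key;
};

static const struct port_sketch sw0_ports[] = {
    { "sw0-lsp1", "", NULL, 1 },
    { "sw0-ext1", "external", "hv1", 2 },
    { "ln-public", "localnet", NULL, 3 },
};
#define N_SW0_PORTS (sizeof sw0_ports / sizeof sw0_ports[0])

static const struct port_sketch *
find_by_name(const char *name)
{
    for (size_t i = 0; i < N_SW0_PORTS; i++) {
        if (!strcmp(sw0_ports[i].name, name)) {
            return &sw0_ports[i];
        }
    }
    return NULL;
}

static const struct port_sketch *
find_by_type(const char *type)
{
    for (size_t i = 0; i < N_SW0_PORTS; i++) {
        if (!strcmp(sw0_ports[i].type, type)) {
            return &sw0_ports[i];
        }
    }
    return NULL;
}

/* Resolves 'port_name' to a port key as seen from 'chassis'.  An external
 * port resolves to the localnet port's key, and only on its requested
 * chassis; every other port resolves to its own key. */
static bool
resolve_port(const char *chassis, const char *port_name, unsigned int *portp)
{
    assert(chassis && port_name && portp);
    const struct port_sketch *pb = find_by_name(port_name);
    if (!pb) {
        return false;
    }
    if (strcmp(pb->type, "external")) {
        *portp = pb->tunnel_key;
        return true;
    }
    if (pb->requested_chassis && !strcmp(pb->requested_chassis, chassis)) {
        const struct port_sketch *ln = find_by_type("localnet");
        if (ln) {
            *portp = ln->tunnel_key;
            return true;
        }
    }
    return false;
}
```

In this model, `resolve_port("hv1", "sw0-ext1", &key)` succeeds with the key of `ln-public`, while the same call for `hv2` fails, so flows that reference `sw0-ext1` would only be installed on hv1. On hv1 the names `sw0-ext1` and `ln-public` resolve to the same key.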
>>>>
>>>> Yes, this is well understood.
>>>>
>>>> >
>>>> > If we want to do the conversion in ovn-northd.c, the match condition would have to
>>>> > be 'inport == "ln-public" && is_chassis_resident("sw0-ext1") && ...'
>>>> > instead of the present one, 'inport == "sw0-ext1" && ...'.
>>>>
>>>> Yes, this is what I would suggest (see reason below).
>>>>
>>>> >
>>>> > And the ACL match condition would not be an issue because of the above-mentioned
>>>> > reason, i.e. the ACL flows will be applied only on the chassis binding the external
>>>> > port.
>>>>
>>>> Here is the concern. For example, chassis A has regular port sw0-lsp1
>>>> bound. Chassis A is also set as requested-chassis for external port
>>>> sw0-ext1. Now the user adds an ACL: to-port, 1001, 'outport ==
>>>> "sw0-ext1"', drop. This would get translated to something like:
>>>> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
>>>> the traffic from sw0-lsp1 to the bridged network? Even if it doesn't
>>>> have an impact because of some subtle detail of the current implementation,
>>>> I would say it is risky and could lead to problems under certain
>>>> conditions, because the conversion in ovn-controller widens the
>>>> original intent. Doing it in northd only for specific lflows
>>>> would ensure it has an impact only on the intended use cases.
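The northd-side form of the match discussed above can be sketched as a plain string builder. This is an illustration only: `build_external_match` is a hypothetical helper, not an ovn-northd function, and the MAC address used in the usage note is made up. It names the localnet port as the inport explicitly and scopes the flow with `is_chassis_resident()` on the external port, so no implicit port-name rewriting is needed in ovn-controller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a match of the form
 *   inport == "<localnet>" && eth.src == <mac>
 *       && is_chassis_resident("<external>")
 * into 'buf' and returns the snprintf() result. */
static int
build_external_match(char *buf, size_t size, const char *localnet_port,
                     const char *external_port, const char *eth_src)
{
    assert(buf && localnet_port && external_port && eth_src);
    return snprintf(buf, size,
                    "inport == \"%s\" && eth.src == %s"
                    " && is_chassis_resident(\"%s\")",
                    localnet_port, eth_src, external_port);
}
```

Calling it with `"ln-public"`, `"sw0-ext1"` and `"50:54:00:00:00:03"` yields `inport == "ln-public" && eth.src == 50:54:00:00:00:03 && is_chassis_resident("sw0-ext1")`. An ACL that names `sw0-ext1` as its outport would then be left untouched rather than widened to the localnet port.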
>>>
>>>
>>>
>>> Thanks for the detailed explanation. I agree. It's clear to me now. I will update accordingly in v6.
>>>
>>> Regards
>>> Numan
>>>
>>>>
>>>> >
>>>> > The test case added checks that the OF flows are applied only on the bound chassis.
>>>> >
>>>> > I think it is better to do it in ovn-controller instead of ovn-northd. Please let me know
>>>> > if you still have any concerns.
>>>> >
>>>> >
>>>> >
>>>> >> 2. A less important reason is that it is better to do it at an earlier stage
>>>> >> than a later one. northd handles common processing. This part of the logic is
>>>> >> common to all chassis, so it would be better to handle it explicitly
>>>> >> in northd, instead of letting every chassis process it. And the
>>>> >> change in northd would likely be simpler than in ovn-controller.
>>>>
>>>> This is a less critical problem, but I think it is worth considering,
>>>> too. With the current logic, although the conversion would take effect
>>>> only if "is_chassis_resident()" is true, the code logic and
>>>> processing have to happen on every chassis.
>>>>
>>>> >>
>>>> >> Thanks,
>>>> >> Han
>
>
>
> --
> Miguel Ángel Ajo
> OSP / Networking DFG, OVN Squad Engineering
Numan Siddique Jan. 24, 2019, 6:56 p.m. UTC | #13
On Thu, Jan 24, 2019 at 9:20 PM Han Zhou <zhouhan@gmail.com> wrote:

> On Mon, Jan 21, 2019 at 7:06 AM Miguel Angel Ajo Pelayo
> <majopela@redhat.com> wrote:
> >
> >
> >
> > On Mon, Jan 21, 2019 at 4:02 PM Numan Siddique <nusiddiq@redhat.com> wrote:
> >>
> >>
> >> Hi Han,
> >>
> >> I have addressed your comments. But before posting the patch I wanted
> >> to get an opinion on the HA support for these external ports.
> >>
> >> The proposed patch doesn't support HA. If the requested chassis goes
> >> down for some reason, it is expected that the CMS would detect it and
> >> change the requested-chassis option to another suitable chassis.
> >>
> >> The OpenStack OVN folks think this would be too much for the CMS to
> >> handle and that it would complicate the code in networking-ovn, which
> >> I agree with.
> >>
> >
> > Not only the complexity part. If we implement this in the CMS, then
> > every CMS using OVN will need to replicate that behaviour.
> >
> > That's, in my opinion, a good reason why it's better to handle HA within
> > OVN itself.
> >
> >>
> >> I am thinking of adding the HA support along the lines of the gateway
> >> chassis support, and I want to submit this patch after adding the HA
> >> support. I think this would be better, as we won't add more options in
> >> OVN (first requested-chassis for external ports and then, later, HA
> >> chassis support). Thoughts?
>
> I thought it would be easier to support outside of OVN, combined with
> chassis life-cycle management, but I didn't go deeper into any CMS
> implementation. I agree it is better to handle HA in OVN than to
> implement it in every CMS. But I am also worried about the
> complexity in OVN itself. Could you briefly describe how you would
> support it in OVN? For example, how would a chassis failure be detected? It
> is different from gateway chassis because the major use case of
> external ports is bridged networks (vlan/flat), so I think the BFD
> mechanism for tunnel health monitoring may not be a good fit here.
>

At present, as you know, ovn-controllers do establish Geneve tunnels
even if there are only logical switches representing bridged networks.
So I feel we can leverage that unless we have a better mechanism
for health monitoring.

Do you have any thoughts/ideas about other possibilities?

I am also trying to make the HA support more generic and a bit simpler
(hopefully :)) so that it can be used either for "external" ports or for the
redirectchassis router ports.


Thanks
Numan

> Thanks,
> Han
> >>
> >> Thanks
> >> Numan
> >>
> >>
> >> On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >>>
> >>>
> >>>
> >>> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>>>
> >>>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >>>> >
> >>>> >
> >>>> >
> >>>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>>> >>
> >>>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>>> >> >
> >>>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
> >>>> >> > >
> >>>> >> > >
> >>>> >> > >
> >>>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
> >>>> >> > >>
> >>>> >> > >> Hi Numan,
> >>>> >> > >>
> >>>> >> > >> With v5 the new test case "external logical port" fails.
> >>>> >> > >> And please see more comments inlined.
> >>>> >> > >>
> >>>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
> >>>> >> > >> >
> >>>> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
> >>>> >> > >> >
> >>>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
> >>>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
> >>>> >> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
> >>>> >> > >> > Router Solicitation requests, the local ovn-controller
> >>>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
> >>>> >> > >> > service needs to be run to serve these requests.
> >>>> >> > >> >
> >>>> >> > >> > With the new logical port type - 'external', OVN itself can
> >>>> >> > >> > handle these requests avoiding the need to deploy any
> >>>> >> > >> > external services like neutron dhcp agent.
> >>>> >> > >> >
> >>>> >> > >> > To make use of this feature, CMS has to
> >>>> >> > >> >  - create a logical port for such VMs
> >>>> >> > >> >  - set the type to 'external'
> >>>> >> > >> >  - set requested-chassis="<chassis-name>" in the options
> >>>> >> > >> >    column.
> >>>> >> > >> >  - create a localnet port for the logical switch
> >>>> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
> >>>> >> > >> >
> >>>> >> > >> > When the ovn-controller running in that 'chassis', detects
> >>>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
> >>>> >> > >> > flows. Since the packet enters the logical switch pipeline
> >>>> >> > >> > via the localnet port, the inport register (reg14) is set
> >>>> >> > >> > to the tunnel key of localnet port in the match conditions.
> >>>> >> > >> >
> >>>> >> > >> > In case the chassis goes down for some reason, it is the
> >>>> >> > >> > responsibility of CMS to change the 'requested-chassis'
> >>>> >> > >> > option to some other active chassis, so that it can serve
> >>>> >> > >> > these requests.
> >>>> >> > >> >
> >>>> >> > >> > When the VM with the external port, sends an ARP request for
> >>>> >> > >> > the router ips, only the chassis which has claimed the port,
> >>>> >> > >> > will reply to the ARP requests. Rest of the chassis on
> >>>> >> > >> > receiving these packets drop them in the ingress switch
> >>>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
> >>>> >> > >> > before S_SWITCH_IN_L2_LKUP.
> >>>> >> > >> >
> >>>> >> > >> > This would guarantee that only the chassis which has claimed
> >>>> >> > >> > the external ports will run the router datapath pipeline.
> >>>> >> > >> >
> >>>> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> >>>> >> > >> > ---
> >>>> >> > >> >
> >>>> >> > >> > v4 -> v5
> >>>> >> > >> > ------
> >>>> >> > >> >   * Addressed review comments from Han Zhou.
> >>>> >> > >> >
> >>>> >> > >> > v3 -> v4
> >>>> >> > >> > ------
> >>>> >> > >> >   * Updated the documentation as per Han Zhou's suggestion.
> >>>> >> > >> >
> >>>> >> > >> > v2 -> v3
> >>>> >> > >> > -------
> >>>> >> > >> >   * Rebased
> >>>> >> > >> >
> >>>> >> > >> >  ovn/controller/binding.c        |  12 +
> >>>> >> > >> >  ovn/controller/lflow.c          |  41 ++-
> >>>> >> > >> >  ovn/controller/lflow.h          |   2 +
> >>>> >> > >> >  ovn/controller/lport.c          |  26 ++
> >>>> >> > >> >  ovn/controller/lport.h          |   5 +
> >>>> >> > >> >  ovn/controller/ovn-controller.c |   6 +
> >>>> >> > >> >  ovn/lib/ovn-util.c              |   1 +
> >>>> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
> >>>> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
> >>>> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
> >>>> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
> >>>> >> > >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
> >>>> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
> >>>> >> > >> >
> >>>> >> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> >>>> >> > >> > index 021ecddcf..64e605b92 100644
> >>>> >> > >> > --- a/ovn/controller/binding.c
> >>>> >> > >> > +++ b/ovn/controller/binding.c
> >>>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>>> >> > >> >           * for them. */
> >>>> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
> >>>> >> > >> >          our_chassis = false;
> >>>> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
> >>>> >> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
> >>>> >> > >> > +                                          "requested-chassis");
> >>>> >> > >> > +        our_chassis = chassis_id && (
> >>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
> >>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
> >>>> >> > >> > +        if (our_chassis) {
> >>>> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
> >>>> >> > >> > +                               sbrec_port_binding_by_datapath,
> >>>> >> > >> > +                               sbrec_port_binding_by_name,
> >>>> >> > >> > +                               binding_rec->datapath, true, local_datapaths);
> >>>> >> > >> > +        }
> >>>> >> > >> >      }
> >>>> >> > >> >
> >>>> >> > >> >      if (our_chassis
> >>>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> >>>> >> > >> > index 8db81927e..98e8ed3b9 100644
> >>>> >> > >> > --- a/ovn/controller/lflow.c
> >>>> >> > >> > +++ b/ovn/controller/lflow.c
> >>>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
> >>>> >> > >> >  struct lookup_port_aux {
> >>>> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath;
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
> >>>> >> > >> >      const struct sbrec_datapath_binding *dp;
> >>>> >> > >> > +    const struct sbrec_chassis *chassis;
> >>>> >> > >> >  };
> >>>> >> > >> >
> >>>> >> > >> >  struct condition_aux {
> >>>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >>>> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >>>> >> > >> >      const struct sbrec_logical_flow *,
> >>>> >> > >> >      const struct hmap *local_datapaths,
> >>>> >> > >> >      const struct sbrec_chassis *,
> >>>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
> >>>> >> > >> >      const struct sbrec_port_binding *pb
> >>>> >> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
> >>>> >> > >> >      if (pb && pb->datapath == aux->dp) {
> >>>> >> > >> > -        *portp = pb->tunnel_key;
> >>>> >> > >> > -        return true;
> >>>> >> > >> > +        if (strcmp(pb->type, "external")) {
> >>>> >> > >> > +            *portp = pb->tunnel_key;
> >>>> >> > >> > +            return true;
> >>>> >> > >> > +        }
> >>>> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
> >>>> >> > >> > +                                          "requested-chassis");
> >>>> >> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
> >>>> >> > >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
> >>>> >> > >> > +            const struct sbrec_port_binding *localnet_pb
> >>>> >> > >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
> >>>> >> > >> > +                                       aux->sbrec_port_binding_by_type,
> >>>> >> > >> > +                                       aux->dp->tunnel_key, "localnet");
> >>>> >> > >> > +            if (localnet_pb) {
> >>>> >> > >> > +                *portp = localnet_pb->tunnel_key;
> >>>> >> > >> > +                return true;
> >>>> >> > >> > +            }
> >>>> >> > >> > +        }
> >>>> >> > >> > +        return false;
> >>>> >> > >> >      }
> >>>> >> > >> >
> >>>> >> > >> >      const struct sbrec_multicast_group *mg =
> mcgroup_lookup_by_dp_name(
> >>>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >>>> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >>>> >> > >> >      const struct sbrec_dhcp_options_table
> *dhcp_options_table,
> >>>> >> > >> >      const struct sbrec_dhcpv6_options_table
> *dhcpv6_options_table,
> >>>> >> > >> >      const struct sbrec_logical_flow_table
> *logical_flow_table,
> >>>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
> >>>> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
> >>>> >> > >> >
> sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >                                sbrec_port_binding_by_name,
> >>>> >> > >> > +                              sbrec_port_binding_by_type,
> >>>> >> > >> > +
> sbrec_datapath_binding_by_key,
> >>>> >> > >> >                                lflow, local_datapaths,
> >>>> >> > >> >                                chassis, &dhcp_opts,
> &dhcpv6_opts, &nd_ra_opts,
> >>>> >> > >> >                                addr_sets, port_groups,
> active_tunnels,
> >>>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
> >>>> >> > >> >      struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >>>> >> > >> >      const struct sbrec_logical_flow *lflow,
> >>>> >> > >> >      const struct hmap *local_datapaths,
> >>>> >> > >> >      const struct sbrec_chassis *chassis,
> >>>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
> >>>> >> > >> >          .sbrec_multicast_group_by_name_datapath
> >>>> >> > >> >              = sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >          .sbrec_port_binding_by_name =
> sbrec_port_binding_by_name,
> >>>> >> > >> > -        .dp = lflow->logical_datapath
> >>>> >> > >> > +        .sbrec_port_binding_by_type =
> sbrec_port_binding_by_type,
> >>>> >> > >> > +        .sbrec_datapath_binding_by_key =
> sbrec_datapath_binding_by_key,
> >>>> >> > >> > +        .dp = lflow->logical_datapath,
> >>>> >> > >> > +        .chassis = chassis
> >>>> >> > >> >      };
> >>>> >> > >> >      struct condition_aux cond_aux = {
> >>>> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
> >>>> >> > >> > @@ -463,6 +493,8 @@ void
> >>>> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
> >>>> >> > >> >            struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >            struct ovsdb_idl_index
> *sbrec_port_binding_by_name,
> >>>> >> > >> > +          struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> >>>> >> > >> > +          struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >>>> >> > >> >            const struct sbrec_dhcp_options_table
> *dhcp_options_table,
> >>>> >> > >> >            const struct sbrec_dhcpv6_options_table
> *dhcpv6_options_table,
> >>>> >> > >> >            const struct sbrec_logical_flow_table
> *logical_flow_table,
> >>>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> >>>> >> > >> >
> >>>> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
> >>>> >> > >> >
> sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> > -                      sbrec_port_binding_by_name,
> dhcp_options_table,
> >>>> >> > >> > +                      sbrec_port_binding_by_name,
> sbrec_port_binding_by_type,
> >>>> >> > >> > +                      sbrec_datapath_binding_by_key,
> dhcp_options_table,
> >>>> >> > >> >                        dhcpv6_options_table,
> logical_flow_table,
> >>>> >> > >> >                        local_datapaths, chassis, addr_sets,
> port_groups,
> >>>> >> > >> >                        active_tunnels, local_lport_ids,
> flow_table, group_table,
> >>>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> >>>> >> > >> > index d19338140..b2911e0eb 100644
> >>>> >> > >> > --- a/ovn/controller/lflow.h
> >>>> >> > >> > +++ b/ovn/controller/lflow.h
> >>>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
> >>>> >> > >> >  void lflow_run(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
> >>>> >> > >> >                 struct ovsdb_idl_index
> *sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >                 struct ovsdb_idl_index
> *sbrec_port_binding_by_name,
> >>>> >> > >> > +               struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> >>>> >> > >> > +               struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >>>> >> > >> >                 const struct sbrec_dhcp_options_table *,
> >>>> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
> >>>> >> > >> >                 const struct sbrec_logical_flow_table *,
> >>>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
> >>>> >> > >> > index cc5c5fbb2..9c827d9b0 100644
> >>>> >> > >> > --- a/ovn/controller/lport.c
> >>>> >> > >> > +++ b/ovn/controller/lport.c
> >>>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct
> ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >>>> >> > >> >      return retval;
> >>>> >> > >> >  }
> >>>> >> > >> >
> >>>> >> > >> > +const struct sbrec_port_binding *
> >>>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >>>> >> > >> > +                     struct ovsdb_idl_index
> *sbrec_port_binding_by_type,
> >>>> >> > >> > +                     uint64_t dp_key, const char
> *port_type)
> >>>> >> > >> > +{
> >>>> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
> >>>> >> > >> > +    const struct sbrec_datapath_binding *db =
> datapath_lookup_by_key(
> >>>> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
> >>>> >> > >> > +    if (!db) {
> >>>> >> > >> > +        return NULL;
> >>>> >> > >> > +    }
> >>>> >> > >> > +
> >>>> >> > >> > +    /* Build key for an indexed lookup. */
> >>>> >> > >> > +    struct sbrec_port_binding *pb =
> sbrec_port_binding_index_init_row(
> >>>> >> > >> > +            sbrec_port_binding_by_type);
> >>>> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
> >>>> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
> >>>> >> > >> > +
> >>>> >> > >> > +    const struct sbrec_port_binding *retval =
> sbrec_port_binding_index_find(
> >>>> >> > >> > +            sbrec_port_binding_by_type, pb);
> >>>> >> > >> > +
> >>>> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
> >>>> >> > >> > +
> >>>> >> > >> > +    return retval;
> >>>> >> > >> > +}
> >>>> >> > >> > +
> >>>> >> > >> >  const struct sbrec_datapath_binding *
> >>>> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index
> *sbrec_datapath_binding_by_key,
> >>>> >> > >> >                         uint64_t dp_key)
> >>>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
> >>>> >> > >> > index 7dcd5bee0..2d49792f6 100644
> >>>> >> > >> > --- a/ovn/controller/lport.h
> >>>> >> > >> > +++ b/ovn/controller/lport.h
> >>>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding
> *lport_lookup_by_key(
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
> >>>> >> > >> >      uint64_t dp_key, uint64_t port_key);
> >>>> >> > >> >
> >>>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
> >>>> >> > >> > +    uint64_t dp_key, const char *port_type);
> >>>> >> > >> > +
> >>>> >> > >> >  const struct sbrec_datapath_binding
> *datapath_lookup_by_key(
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
> uint64_t dp_key);
> >>>> >> > >> >
> >>>> >> > >> > diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> >>>> >> > >> > index 4e9a5865f..5aab9142f 100644
> >>>> >> > >> > --- a/ovn/controller/ovn-controller.c
> >>>> >> > >> > +++ b/ovn/controller/ovn-controller.c
> >>>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl
> *ovnsb_idl,
> >>>> >> > >> >       * ports that have a Gateway_Chassis that point's to
> our own
> >>>> >> > >> >       * chassis */
> >>>> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "chassisredirect");
> >>>> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ,
> "external");
> >>>> >> > >> >      if (chassis) {
> >>>> >> > >> >          /* This should be mostly redundant with the other
> clauses for port
> >>>> >> > >> >           * bindings, but it allows us to catch any ports
> that are assigned to
> >>>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
> >>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >>>> >> > >> >
> &sbrec_port_binding_col_datapath);
> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
> >>>> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >>>> >> > >> > +
> &sbrec_port_binding_col_type);
> >>>> >> > >>
> >>>> >> > >> This index is used with two columns: datapath_binding and type,
> >>>> >> > >> so it should be created with both columns using create2.
> >>>> >> > >>
> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
> >>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> >>>> >> > >> >
> &sbrec_datapath_binding_col_tunnel_key);
> >>>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
> >>>> >> > >> >                              sbrec_chassis_by_name,
> >>>> >> > >> >
> sbrec_multicast_group_by_name_datapath,
> >>>> >> > >> >                              sbrec_port_binding_by_name,
> >>>> >> > >> > +                            sbrec_port_binding_by_type,
> >>>> >> > >> > +                            sbrec_datapath_binding_by_key,
> >>>> >> > >> >
> sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
> >>>> >> > >> >
> sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
> >>>> >> > >> >
> sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
> >>>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> >>>> >> > >> > index aa03919bb..a9d4b8736 100644
> >>>> >> > >> > --- a/ovn/lib/ovn-util.c
> >>>> >> > >> > +++ b/ovn/lib/ovn-util.c
> >>>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] =
> {
> >>>> >> > >> >      "localport",
> >>>> >> > >> >      "router",
> >>>> >> > >> >      "vtep",
> >>>> >> > >> > +    "external",
> >>>> >> > >> >  };
> >>>> >> > >> >
> >>>> >> > >> >  bool
> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml
> b/ovn/northd/ovn-northd.8.xml
> >>>> >> > >> > index 392a5efc9..c8883d60d 100644
> >>>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
> >>>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
> >>>> >> > >> >      <p>
> >>>> >> > >> >        This table adds the DHCPv4 options to a DHCPv4
> packet from the
> >>>> >> > >> >        logical ports configured with IPv4 address(es) and
> DHCPv4 options,
> >>>> >> > >> > -      and similarly for DHCPv6 options.
> >>>> >> > >> > +      and similarly for DHCPv6 options. This table also
> adds flows for the
> >>>> >> > >> > +      logical ports of type <code>external</code>.
> >>>> >> > >> >      </p>
> >>>> >> > >> >
> >>>> >> > >> >      <ul>
> >>>> >> > >> > @@ -827,7 +828,39 @@ output;
> >>>> >> > >> >        </li>
> >>>> >> > >> >      </ul>
> >>>> >> > >> >
> >>>> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
> >>>> >> > >> > +    <h3>Ingress table 16 External ports</h3>
> >>>> >> > >> > +
> >>>> >> > >> > +    <p>
> >>>> >> > >> > +      Traffic from the <code>external</code> logical ports
> enter the ingress
> >>>> >> > >> > +      datapath pipeline via the <code>localnet</code>
> port. This table adds the
> >>>> >> > >> > +      below logical flows to handle the traffic from these
> ports.
> >>>> >> > >> > +    </p>
> >>>> >> > >> > +
> >>>> >> > >> > +    <ul>
> >>>> >> > >> > +      <li>
> >>>> >> > >> > +        <p>
> >>>> >> > >> > +          A priority-100 flow is added for each
> <code>external</code> logical
> >>>> >> > >> > +          port which doesn't reside on a chassis to drop
> the ARP/IPv6 NS
> >>>> >> > >> > +          request to the router IP(s) (of the logical
> switch) which matches
> >>>> >> > >> > +          on the <code>inport</code> of the
> <code>external</code> logical port
> >>>> >> > >> > +          and the valid <code>eth.src</code> address(es)
> of the
> >>>> >> > >> > +          <code>external</code> logical port.
> >>>> >> > >> > +        </p>
> >>>> >> > >> > +
> >>>> >> > >> > +        <p>
> >>>> >> > >> > +          This flow guarantees that the ARP/NS request to
> the router IP
> >>>> >> > >> > +          address from the external ports is responded by
> only the chassis
> >>>> >> > >> > +          which has claimed these external ports. All the
> other chassis,
> >>>> >> > >> > +          drops these packets.
> >>>> >> > >> > +        </p>
> >>>> >> > >> > +      </li>
> >>>> >> > >> > +
> >>>> >> > >> > +      <li>
> >>>> >> > >> > +        A priority-0 flow that matches all packets to
> advances to table 17.
> >>>> >> > >> > +      </li>
> >>>> >> > >> > +    </ul>
> >>>> >> > >> > +
> >>>> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
> >>>> >> > >> >
> >>>> >> > >> >      <p>
> >>>> >> > >> >        This table implements switching behavior.  It
> contains these logical
> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.c
> b/ovn/northd/ovn-northd.c
> >>>> >> > >> > index 3fd8a8757..87208c6c1 100644
> >>>> >> > >> > --- a/ovn/northd/ovn-northd.c
> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.c
> >>>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13,
> "ls_in_dhcp_response") \
> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14,
> "ls_in_dns_lookup")    \
> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15,
> "ls_in_dns_response")  \
> >>>> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16,
> "ls_in_l2_lkup")       \
> >>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16,
> "ls_in_external_port") \
> >>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17,
> "ls_in_l2_lkup")       \
> >>>> >> > >> >
>                 \
> >>>> >> > >> >      /* Logical switch egress stages. */
>                \
> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0,
> "ls_out_pre_lb")         \
> >>>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct
> nbrec_logical_switch_port *lsp)
> >>>> >> > >> >      return !lsp->up || *lsp->up;
> >>>> >> > >> >  }
> >>>> >> > >> >
> >>>> >> > >> > +static bool
> >>>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port
> *nbsp)
> >>>> >> > >> > +{
> >>>> >> > >> > +    return !strcmp(nbsp->type, "external");
> >>>> >> > >> > +}
> >>>> >> > >> > +
> >>>> >> > >> >  static bool
> >>>> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >>>> >> > >> >                      struct ds *options_action, struct ds
> *response_action,
> >>>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >>>> >> > >> >           *  - port type is localport
> >>>> >> > >> >           */
> >>>> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type,
> "router") &&
> >>>> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
> >>>> >> > >> > +            strcmp(op->nbsp->type, "localport") &&
> lsp_is_external(op->nbsp)) {
> >>>> >> > >>
> >>>> >> > >> Sorry that I missed this in the last review. The && condition has a
> >>>> >> > >> problem: it will cause ARP responder flows to be added for all lports
> >>>> >> > >> that are not external. I think it should be || here.
> >>>> >> > >
> >>>> >> > >
> >>>> >> > > Agree. To make it easier to read, I will add a new "if" with
> >>>> >> > > "continue" below this one for external port types.
> >>>> >> > >
> >>>> >> > >
> >>>> >> > >>
> >>>> >> > >>
> >>>> >> > >> >              continue;
> >>>> >> > >> >          }
> >>>> >> > >> >
> >>>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >>>> >> > >> >              continue;
> >>>> >> > >> >          }
> >>>> >> > >> >
> >>>> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
> >>>> >> > >> > +        if (is_external && !op->od->localnet_port) {
> >>>> >> > >> > +            /* If it's an external port and there is no
> localnet port
> >>>> >> > >> > +             * ignore it. */
> >>>> >> > >> > +            continue;
> >>>> >> > >> > +        }
> >>>> >> > >> > +
> >>>> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >>>> >> > >> >              for (size_t j = 0; j <
> op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >>>> >> > >> >                  struct ds options_action =
> DS_EMPTY_INITIALIZER;
> >>>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >>>> >> > >> >                      ds_put_format(
> >>>> >> > >> >                          &match, "inport == %s && eth.src
> == %s && "
> >>>> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst ==
> 255.255.255.255 && "
> >>>> >> > >> > -                        "udp.src == 68 && udp.dst == 67",
> op->json_key,
> >>>> >> > >> > -                        op->lsp_addrs[i].ea_s);
> >>>> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
> >>>> >> > >> > +                        op->json_key,
> op->lsp_addrs[i].ea_s);
> >>>> >> > >>
> >>>> >> > >> No change here?
> >>>> >> > >
> >>>> >> > >
> >>>> >> > > I think it's an unwanted and unrelated change. I will correct it.
> >>>> >> > >>
> >>>> >> > >> >
> >>>> >> > >> >                      ovn_lflow_add(lflows, op->od,
> S_SWITCH_IN_DHCP_OPTIONS,
> >>>> >> > >> >                                    100, ds_cstr(&match),
> >>>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >>>> >> > >> >      /* Ingress table 12 and 13: DHCP options and response,
> by default goto
> >>>> >> > >> >       * next. (priority 0).
> >>>> >> > >> >       * Ingress table 14 and 15: DNS lookup and response,
> by default goto next.
> >>>> >> > >> > -     * (priority 0).*/
> >>>> >> > >> > +     * (priority 0).
> >>>> >> > >> > +     * Ingress table 16 - External port handling, by
> default goto next.
> >>>> >> > >> > +     * (priority 0). */
> >>>> >> > >> >
> >>>> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
> >>>> >> > >> >          if (!od->nbs) {
> >>>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap
> *datapaths, struct hmap *ports,
> >>>> >> > >> >          ovn_lflow_add(lflows, od,
> S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
> >>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP,
> 0, "1", "next;");
> >>>> >> > >> >          ovn_lflow_add(lflows, od,
> S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
> >>>> >> > >> > +        ovn_lflow_add(lflows, od,
> S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
> >>>> >> > >> >      }
> >>>> >> > >> >
> >>>> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and
> multicast handling
> >>>> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
> >>>> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
> >>>> >> > >> > +           continue;
> >>>> >> > >> > +        }
> >>>> >> > >> > +
> >>>> >> > >> > +        /* Table 16: External port. Drop ARP request for
> router ips from
> >>>> >> > >> > +         * external ports  on chassis not binding those
> ports.
> >>>> >> > >> > +         * This makes the router pipeline to be run only
> on the chassis
> >>>> >> > >> > +         * binding the external ports. */
> >>>> >> > >> > +
> >>>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >>>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports;
> j++) {
> >>>> >> > >> > +                struct ovn_port *rp =
> op->od->router_ports[j];
> >>>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs;
> k++) {
> >>>> >> > >> > +                    for (size_t l = 0; l <
> rp->lsp_addrs[k].n_ipv4_addrs;
> >>>> >> > >> > +                         l++) {
> >>>> >> > >> > +                        ds_clear(&match);
> >>>> >> > >> > +                        ds_put_cstr(&match, "ip4");
> >>>> >> > >> > +                        ds_put_format(
> >>>> >> > >> > +                            &match, "inport == %s &&
> eth.src == %s"
> >>>> >> > >> > +                            " && !is_chassis_resident(%s)"
> >>>> >> > >> > +                            " && arp.tpa == %s && arp.op
> == 1",
> >>>> >> > >> > +                            op->json_key,
> op->lsp_addrs[i].ea_s, op->json_key,
> >>>> >> > >>
> >>>> >> > >> I believe the inport should match the localnet port's json_key
> >>>> >> > >> here, since it is coming from a localnet port.
> >>>> >> > >
> >>>> >> > >
> >>>> >> > > Both would work. If you look at the code in lflow.c in this patch,
> >>>> >> > > it gets the tunnel key of the localnet port if the port_binding
> >>>> >> > > type is "external".
> >>>> >> > >
> >>>> >> > > That's how even the DHCP requests are handled. ovn-controller will
> >>>> >> > > translate the logical flows with action "put_dhcp_opts" only on the
> >>>> >> > > chassis claiming the external ports.
> >>>> >> >
> >>>> >> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
> >>>> >> > slipped my mind. Thanks for explaining.
> >>>> >>
> >>>> >> I thought about it a second time, and I'd suggest doing the conversion
> >>>> >> here in northd instead of ovn-controller, for two reasons:
> >>>> >>
> >>>> >> 1. In ovn-controller there is no extra context, so it just blindly
> >>>> >> translates all references to an external logical port into the localnet
> >>>> >> port key. This could lead to unexpected behavior, for example if someone
> >>>> >> uses an external logical port in an ACL match condition. The match
> >>>> >> condition would then apply to all packets to/from the localnet port,
> >>>> >> which is definitely unwanted. (At the same time, it would be better to
> >>>> >> document that features like port-security and ACLs should not be used
> >>>> >> for external logical ports.)
> >>>> >>
> >>>> >
> >>>> > That's not how it works in the present patch. Let's say you have two
> >>>> > chassis, hv1 and hv2, an external port sw0-ext1 and a localnet port
> >>>> > "ln-public". If the requested-chassis is set to hv1, then all the
> >>>> > logical flows with the match "inport == sw0-ext1" will be converted to
> >>>> > OF flows only on hv1, as this port is bound by hv1 and the function
> >>>> > 'lookup_port_cb()' would return true only on hv1. On hv2,
> >>>> > lookup_port_cb() would return false.
> >>>>
> >>>> Yes, this is well understood.
> >>>>
> >>>> >
> >>>> > If we want to do the conversion in ovn-northd.c, the match condition
> >>>> > would have to be 'inport == "ln-public" && is_chassis_resident(sw0-ext1)
> >>>> > && ...' instead of the present one, "inport == sw0-ext1 && ...".
> >>>>
> >>>> Yes, this is what I would suggest (see reason below).
> >>>>
> >>>> >
> >>>> > And the ACL match condition would not be an issue because of the
> >>>> > above-mentioned reason, i.e. the ACL flows will be applied only on the
> >>>> > chassis binding the external port.
> >>>>
> >>>> Here is the concern. For example, chassis A has regular port sw0-lsp1
> >>>> bound. Chassis A is also set as requested-chassis for external port
> >>>> sw0-ext1. And now the user adds an ACL: to-port, 1001, 'outport ==
> >>>> "sw0-ext1"', drop. This would get translated to something like:
> >>>> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
> >>>> the traffic from sw0-lsp1 to the bridged network? Even if it doesn't
> >>>> have an impact because of some subtle detail of the current implementation,
> >>>> I would say it is risky and could lead to problems under certain
> >>>> conditions, because the conversion in ovn-controller widens the
> >>>> original intent. Whereas doing it in northd only for specific lflows
> >>>> would ensure it has impact only for intended use cases.
> >>>
> >>>
> >>>
> >>> Thanks for the detailed explanation. I agree. It's clear to me now.
> >>> I will update accordingly in v6.
> >>>
> >>> Regards
> >>> Numan
> >>>
> >>>>
> >>>> >
> >>>> > The test case added checks that the OF flows are applied only on the
> >>>> > bound chassis.
> >>>> >
> >>>> > I think it is better to do it in ovn-controller instead of ovn-northd.
> >>>> > Please let me know if you still have any concerns.
> >>>> >
> >>>> >
> >>>> >
> >>>> >> 2. A less important reason is that it is better to do it at an earlier
> >>>> >> stage than a later one. northd handles common processing. This part of
> >>>> >> the logic is common to all chassis, so it would be better to handle it
> >>>> >> explicitly in northd instead of letting every chassis process it. And
> >>>> >> the change in northd would likely be simpler than in ovn-controller.
> >>>>
> >>>> This is a less critical problem, but I think it is worth considering,
> >>>> too. With the current logic, although the conversion takes effect
> >>>> only if "is_chassis_resident()" is true, the code logic and
> >>>> processing have to happen on every chassis.
> >>>>
> >>>> >>
> >>>> >> Thanks,
> >>>> >> Han
> >
> >
> >
> > --
> > Miguel Ángel Ajo
> > OSP / Networking DFG, OVN Squad Engineering
>
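The `&&` versus `||` point raised in the review of `build_lswitch_flows()` quoted above can be illustrated with a small standalone sketch. This is not the OVN code: `struct lsp`, `skip_v5()` and `skip_fixed()` are hypothetical stand-ins, and the `up` field is simplified to a plain boolean instead of the optional column that `lsp_is_up()` reads. It only contrasts the condition as posted in v5 with the separate "if ... continue" that was agreed on.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the two Logical_Switch_Port fields used here. */
struct lsp {
    const char *type;   /* "", "router", "localport", "external", ... */
    bool up;            /* Simplified: the real column is optional. */
};

static bool
lsp_is_external(const struct lsp *p)
{
    return !strcmp(p->type, "external");
}

/* Condition as posted in v5: the extra "&& lsp_is_external()" means a port
 * that is down is skipped only when it is also external. */
static bool
skip_v5(const struct lsp *p)
{
    return !p->up && strcmp(p->type, "router")
           && strcmp(p->type, "localport") && lsp_is_external(p);
}

/* Behaviour agreed on in the review: keep the original check unchanged and
 * skip external ports in a separate check. */
static bool
skip_fixed(const struct lsp *p)
{
    if (!p->up && strcmp(p->type, "router")
        && strcmp(p->type, "localport")) {
        return true;
    }
    return lsp_is_external(p);
}
```

With these definitions, a regular port that is down is no longer skipped by `skip_v5()`, which is the regression the review points out, while `skip_fixed()` skips both that port and every external port.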
Han Zhou Jan. 24, 2019, 7:51 p.m. UTC | #14
On Thu, Jan 24, 2019 at 10:56 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>
>
>
> On Thu, Jan 24, 2019 at 9:20 PM Han Zhou <zhouhan@gmail.com> wrote:
>>
>> On Mon, Jan 21, 2019 at 7:06 AM Miguel Angel Ajo Pelayo
>> <majopela@redhat.com> wrote:
>> >
>> >
>> >
>> > On Mon, Jan 21, 2019 at 4:02 PM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >>
>> >>
>> >> Hi Han,
>> >>
>> >> I have addressed your comments. But before posting the patch I wanted to get an opinion
>> >> on the HA support for these external ports.
>> >>
>> >> The proposed patch doesn't support HA. If the requested chassis goes down for some reason
>> >> it is expected that CMS would detect it and change the requested-chassis option to other
>> >> suitable chassis.
>> >>
>> >> The openstack OVN folks think this would be too much for the CMS to handle and it would
>> >> complicate the code in networking-ovn which I agree with.
>> >>
>> >
>> > Not only the complexity part. If we implement this from the CMS, then every CMS using ovn
>> > will need to replicate that behaviour.
>> >
>> > That's in my opinion a good reason why it's better to handle HA within OVN itself.
>> >
>> >>
>> >> I am thinking to add the HA support on the lines of gateway chassis support and I want to
>> >> submit this patch after adding the HA support. I think this would be better as we won't add
>> >> more options in OVN (first requested-chassis for external ports and then later HA chassis support).
>> >> Thoughts?
>>
>> I thought it would be easier to support outside of OVN combining with
>> chassis life-cycle management, but I didn't go deeper in any CMS
>> implementation. I agree it is better to handle HA in OVN than
>> implementing it in every CMS. But I am also worried about the
>> complexity in OVN itself. Could you describe briefly how you would
>> support it in OVN? For example, how to detect if a chassis failed? It
>> is different from gateway chassis because the major use case of
>> external port is for bridged networks (vlan/flat), so I think the BFD
>> mechanism for tunnel health monitoring may not be a good fit here.
>
>
> At present, as you know, ovn-controllers do establish Geneve tunnels
> even if there are only logical switches representing bridged networks.
> So I feel we can leverage that unless we have a better mechanism
> for health monitoring.
>
> Do you have any thoughts/ideas of any other possibilities ?
>
> I am also trying to make the HA support more generic and a bit simpler (hopefully :))
> so that it can be used either for "external" ports or for the redirectchassis router ports.
>
One example of why BFD monitoring may not be a good fit for external port
HA: in a purely bridged environment there is a potential optimization
to not create the tunnel mesh at all, and this BFD dependency makes it
harder. However, it does not seem to be a strong blocker, since the
optimization can selectively create tunnels.

If we have to implement HA in OVN, I don't have any better idea than
BFD for now. It may be good in practice. We'd better hear opinions from
more people. It may be worth abstracting the data plane monitoring
mechanism so that new mechanisms can be implemented in the future, but
that doesn't need to be in this version of the code if BFD is the only
mechanism for now anyway.

Outside of OVN, I believe all chassis management systems should have their
own ways of monitoring. Would it be sufficient just to provide an
interface from OVN (and even CMS) so that the failures detected by
external systems can be used to trigger the failover?

In addition, the current priority-based gateway chassis HA mechanism
has a split-brain problem upon network partitioning. This may be an
independent topic ;)
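To make the split-brain remark concrete, here is a minimal sketch of priority-based selection under a partition. It is a simplified model under stated assumptions, not OVN's implementation: `struct gw_chassis` and `pick_active()` are hypothetical names, and the `alive` array stands for one observer's own liveness view (for example derived from BFD sessions).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a prioritized gateway chassis. */
struct gw_chassis {
    const char *name;
    int priority;       /* Higher value wins. */
};

/* Returns the index of the highest-priority chassis that the observer
 * believes to be alive, or -1 if it believes none is.  'alive[i]' is the
 * observer's local view, not global truth. */
static int
pick_active(const struct gw_chassis *gws, const bool *alive, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (alive[i] && (best < 0 || gws[i].priority > gws[best].priority)) {
            best = (int) i;
        }
    }
    return best;
}
```

Take hv1 with priority 20 and hv2 with priority 10. When both see each other, both pick hv1. If the network partitions so that each chassis only sees itself, hv1 picks hv1 and hv2 picks hv2, so both act as active at the same time.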

>
> Thanks
> Numan
>
>> Thanks,
>> Han
>> >>
>> >> Thanks
>> >> Numan
>> >>
>> >>
>> >> On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <zhouhan@gmail.com wrote:
>> >>>>
>> >>>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >>>> >>
>> >>>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >>>> >> >
>> >>>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique <nusiddiq@redhat.com> wrote:
>> >>>> >> > >
>> >>>> >> > >
>> >>>> >> > >
>> >>>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <zhouhan@gmail.com> wrote:
>> >>>> >> > >>
>> >>>> >> > >> Hi Numan,
>> >>>> >> > >>
>> >>>> >> > >> With v5 the new test case "external logical port" fails.
>> >>>> >> > >> And please see more comments inlined.
>> >>>> >> > >>
>> >>>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <nusiddiq@redhat.com> wrote:
>> >>>> >> > >> >
>> >>>> >> > >> > From: Numan Siddique <nusiddiq@redhat.com>
>> >>>> >> > >> >
>> >>>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on
>> >>>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports
>> >>>> >> > >> > for these VMs. When these VMs send DHCPv4, DHCPv6 or IPv6
>> >>>> >> > >> > Router Solicitation requests, the local ovn-controller
>> >>>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent
>> >>>> >> > >> > service needs to be run to serve these requests.
>> >>>> >> > >> >
>> >>>> >> > >> > With the new logical port type - 'external', OVN itself can
>> >>>> >> > >> > handle these requests avoiding the need to deploy any
>> >>>> >> > >> > external services like neutron dhcp agent.
>> >>>> >> > >> >
>> >>>> >> > >> > To make use of this feature, CMS has to
>> >>>> >> > >> >  - create a logical port for such VMs
>> >>>> >> > >> >  - set the type to 'external'
>> >>>> >> > >> >  - set requested-chassis="<chassis-name>" in the options
>> >>>> >> > >> >    column.
>> >>>> >> > >> >  - create a localnet port for the logical switch
>> >>>> >> > >> >  - configure the ovn-bridge-mappings option in the OVS db.
>> >>>> >> > >> >
>> >>>> >> > >> > When the ovn-controller running in that 'chassis', detects
>> >>>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF
>> >>>> >> > >> > flows. Since the packet enters the logical switch pipeline
>> >>>> >> > >> > via the localnet port, the inport register (reg14) is set
>> >>>> >> > >> > to the tunnel key of localnet port in the match conditions.
>> >>>> >> > >> >
>> >>>> >> > >> > In case the chassis goes down for some reason, it is the
>> >>>> >> > >> > responsibility of CMS to change the 'requested-chassis'
>> >>>> >> > >> > option to some other active chassis, so that it can serve
>> >>>> >> > >> > these requests.
>> >>>> >> > >> >
>> >>>> >> > >> > When the VM with the external port, sends an ARP request for
>> >>>> >> > >> > the router ips, only the chassis which has claimed the port,
>> >>>> >> > >> > will reply to the ARP requests. Rest of the chassis on
>> >>>> >> > >> > receiving these packets drop them in the ingress switch
>> >>>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just
>> >>>> >> > >> > before S_SWITCH_IN_L2_LKUP.
>> >>>> >> > >> >
>> >>>> >> > >> > This would guarantee that only the chassis which has claimed
>> >>>> >> > >> > the external ports will run the router datapath pipeline.
>> >>>> >> > >> >
>> >>>> >> > >> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> >>>> >> > >> > ---
>> >>>> >> > >> >
>> >>>> >> > >> > v4 -> v5
>> >>>> >> > >> > ------
>> >>>> >> > >> >   * Addressed review comments from Han Zhou.
>> >>>> >> > >> >
>> >>>> >> > >> > v3 -> v4
>> >>>> >> > >> > ------
>> >>>> >> > >> >   * Updated the documentation as per Han Zhou's suggestion.
>> >>>> >> > >> >
>> >>>> >> > >> > v2 -> v3
>> >>>> >> > >> > -------
>> >>>> >> > >> >   * Rebased
>> >>>> >> > >> >
>> >>>> >> > >> >  ovn/controller/binding.c        |  12 +
>> >>>> >> > >> >  ovn/controller/lflow.c          |  41 ++-
>> >>>> >> > >> >  ovn/controller/lflow.h          |   2 +
>> >>>> >> > >> >  ovn/controller/lport.c          |  26 ++
>> >>>> >> > >> >  ovn/controller/lport.h          |   5 +
>> >>>> >> > >> >  ovn/controller/ovn-controller.c |   6 +
>> >>>> >> > >> >  ovn/lib/ovn-util.c              |   1 +
>> >>>> >> > >> >  ovn/northd/ovn-northd.8.xml     |  37 ++-
>> >>>> >> > >> >  ovn/northd/ovn-northd.c         |  85 ++++-
>> >>>> >> > >> >  ovn/ovn-architecture.7.xml      |  78 +++++
>> >>>> >> > >> >  ovn/ovn-nb.xml                  |  47 +++
>> >>>> >> > >> >  tests/ovn.at                    | 530 +++++++++++++++++++++++++++++++-
>> >>>> >> > >> >  12 files changed, 848 insertions(+), 22 deletions(-)
>> >>>> >> > >> >
>> >>>> >> > >> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> >>>> >> > >> > index 021ecddcf..64e605b92 100644
>> >>>> >> > >> > --- a/ovn/controller/binding.c
>> >>>> >> > >> > +++ b/ovn/controller/binding.c
>> >>>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
>> >>>> >> > >> >           * for them. */
>> >>>> >> > >> >          sset_add(local_lports, binding_rec->logical_port);
>> >>>> >> > >> >          our_chassis = false;
>> >>>> >> > >> > +    } else if (!strcmp(binding_rec->type, "external")) {
>> >>>> >> > >> > +        const char *chassis_id = smap_get(&binding_rec->options,
>> >>>> >> > >> > +                                          "requested-chassis");
>> >>>> >> > >> > +        our_chassis = chassis_id && (
>> >>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->name) ||
>> >>>> >> > >> > +            !strcmp(chassis_id, chassis_rec->hostname));
>> >>>> >> > >> > +        if (our_chassis) {
>> >>>> >> > >> > +            add_local_datapath(sbrec_datapath_binding_by_key,
>> >>>> >> > >> > +                               sbrec_port_binding_by_datapath,
>> >>>> >> > >> > +                               sbrec_port_binding_by_name,
>> >>>> >> > >> > +                               binding_rec->datapath, true, local_datapaths);
>> >>>> >> > >> > +        }
>> >>>> >> > >> >      }
>> >>>> >> > >> >
>> >>>> >> > >> >      if (our_chassis
>> >>>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
>> >>>> >> > >> > index 8db81927e..98e8ed3b9 100644
>> >>>> >> > >> > --- a/ovn/controller/lflow.c
>> >>>> >> > >> > +++ b/ovn/controller/lflow.c
>> >>>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void)
>> >>>> >> > >> >  struct lookup_port_aux {
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name;
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type;
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
>> >>>> >> > >> >      const struct sbrec_datapath_binding *dp;
>> >>>> >> > >> > +    const struct sbrec_chassis *chassis;
>> >>>> >> > >> >  };
>> >>>> >> > >> >
>> >>>> >> > >> >  struct condition_aux {
>> >>>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow(
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >      const struct sbrec_logical_flow *,
>> >>>> >> > >> >      const struct hmap *local_datapaths,
>> >>>> >> > >> >      const struct sbrec_chassis *,
>> >>>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
>> >>>> >> > >> >      const struct sbrec_port_binding *pb
>> >>>> >> > >> >          = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
>> >>>> >> > >> >      if (pb && pb->datapath == aux->dp) {
>> >>>> >> > >> > -        *portp = pb->tunnel_key;
>> >>>> >> > >> > -        return true;
>> >>>> >> > >> > +        if (strcmp(pb->type, "external")) {
>> >>>> >> > >> > +            *portp = pb->tunnel_key;
>> >>>> >> > >> > +            return true;
>> >>>> >> > >> > +        }
>> >>>> >> > >> > +        const char *chassis_id = smap_get(&pb->options,
>> >>>> >> > >> > +                                          "requested-chassis");
>> >>>> >> > >> > +        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
>> >>>> >> > >> > +                           !strcmp(chassis_id, aux->chassis->hostname))) {
>> >>>> >> > >> > +            const struct sbrec_port_binding *localnet_pb
>> >>>> >> > >> > +                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
>> >>>> >> > >> > +                                       aux->sbrec_port_binding_by_type,
>> >>>> >> > >> > +                                       aux->dp->tunnel_key, "localnet");
>> >>>> >> > >> > +            if (localnet_pb) {
>> >>>> >> > >> > +                *portp = localnet_pb->tunnel_key;
>> >>>> >> > >> > +                return true;
>> >>>> >> > >> > +            }
>> >>>> >> > >> > +        }
>> >>>> >> > >> > +        return false;
>> >>>> >> > >> >      }
>> >>>> >> > >> >
>> >>>> >> > >> >      const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
>> >>>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows(
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >      const struct sbrec_dhcp_options_table *dhcp_options_table,
>> >>>> >> > >> >      const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> >>>> >> > >> >      const struct sbrec_logical_flow_table *logical_flow_table,
>> >>>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows(
>> >>>> >> > >> >          consider_logical_flow(sbrec_chassis_by_name,
>> >>>> >> > >> >                                sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >                                sbrec_port_binding_by_name,
>> >>>> >> > >> > +                              sbrec_port_binding_by_type,
>> >>>> >> > >> > +                              sbrec_datapath_binding_by_key,
>> >>>> >> > >> >                                lflow, local_datapaths,
>> >>>> >> > >> >                                chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
>> >>>> >> > >> >                                addr_sets, port_groups, active_tunnels,
>> >>>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow(
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >      const struct sbrec_logical_flow *lflow,
>> >>>> >> > >> >      const struct hmap *local_datapaths,
>> >>>> >> > >> >      const struct sbrec_chassis *chassis,
>> >>>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow(
>> >>>> >> > >> >          .sbrec_multicast_group_by_name_datapath
>> >>>> >> > >> >              = sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >          .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
>> >>>> >> > >> > -        .dp = lflow->logical_datapath
>> >>>> >> > >> > +        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
>> >>>> >> > >> > +        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
>> >>>> >> > >> > +        .dp = lflow->logical_datapath,
>> >>>> >> > >> > +        .chassis = chassis
>> >>>> >> > >> >      };
>> >>>> >> > >> >      struct condition_aux cond_aux = {
>> >>>> >> > >> >          .sbrec_chassis_by_name = sbrec_chassis_by_name,
>> >>>> >> > >> > @@ -463,6 +493,8 @@ void
>> >>>> >> > >> >  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >            struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >            struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >>>> >> > >> > +          struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >            const struct sbrec_dhcp_options_table *dhcp_options_table,
>> >>>> >> > >> >            const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
>> >>>> >> > >> >            const struct sbrec_logical_flow_table *logical_flow_table,
>> >>>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >
>> >>>> >> > >> >      add_logical_flows(sbrec_chassis_by_name,
>> >>>> >> > >> >                        sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> > -                      sbrec_port_binding_by_name, dhcp_options_table,
>> >>>> >> > >> > +                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
>> >>>> >> > >> > +                      sbrec_datapath_binding_by_key, dhcp_options_table,
>> >>>> >> > >> >                        dhcpv6_options_table, logical_flow_table,
>> >>>> >> > >> >                        local_datapaths, chassis, addr_sets, port_groups,
>> >>>> >> > >> >                        active_tunnels, local_lport_ids, flow_table, group_table,
>> >>>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
>> >>>> >> > >> > index d19338140..b2911e0eb 100644
>> >>>> >> > >> > --- a/ovn/controller/lflow.h
>> >>>> >> > >> > +++ b/ovn/controller/lflow.h
>> >>>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void);
>> >>>> >> > >> >  void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
>> >>>> >> > >> >                 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >                 struct ovsdb_idl_index *sbrec_port_binding_by_name,
>> >>>> >> > >> > +               struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >                 const struct sbrec_dhcp_options_table *,
>> >>>> >> > >> >                 const struct sbrec_dhcpv6_options_table *,
>> >>>> >> > >> >                 const struct sbrec_logical_flow_table *,
>> >>>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
>> >>>> >> > >> > index cc5c5fbb2..9c827d9b0 100644
>> >>>> >> > >> > --- a/ovn/controller/lport.c
>> >>>> >> > >> > +++ b/ovn/controller/lport.c
>> >>>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >      return retval;
>> >>>> >> > >> >  }
>> >>>> >> > >> >
>> >>>> >> > >> > +const struct sbrec_port_binding *
>> >>>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> > +                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +                     uint64_t dp_key, const char *port_type)
>> >>>> >> > >> > +{
>> >>>> >> > >> > +    /* Lookup datapath corresponding to dp_key. */
>> >>>> >> > >> > +    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
>> >>>> >> > >> > +        sbrec_datapath_binding_by_key, dp_key);
>> >>>> >> > >> > +    if (!db) {
>> >>>> >> > >> > +        return NULL;
>> >>>> >> > >> > +    }
>> >>>> >> > >> > +
>> >>>> >> > >> > +    /* Build key for an indexed lookup. */
>> >>>> >> > >> > +    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
>> >>>> >> > >> > +            sbrec_port_binding_by_type);
>> >>>> >> > >> > +    sbrec_port_binding_index_set_datapath(pb, db);
>> >>>> >> > >> > +    sbrec_port_binding_index_set_type(pb, port_type);
>> >>>> >> > >> > +
>> >>>> >> > >> > +    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
>> >>>> >> > >> > +            sbrec_port_binding_by_type, pb);
>> >>>> >> > >> > +
>> >>>> >> > >> > +    sbrec_port_binding_index_destroy_row(pb);
>> >>>> >> > >> > +
>> >>>> >> > >> > +    return retval;
>> >>>> >> > >> > +}
>> >>>> >> > >> > +
>> >>>> >> > >> >  const struct sbrec_datapath_binding *
>> >>>> >> > >> >  datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> >                         uint64_t dp_key)
>> >>>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
>> >>>> >> > >> > index 7dcd5bee0..2d49792f6 100644
>> >>>> >> > >> > --- a/ovn/controller/lport.h
>> >>>> >> > >> > +++ b/ovn/controller/lport.h
>> >>>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key(
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_key,
>> >>>> >> > >> >      uint64_t dp_key, uint64_t port_key);
>> >>>> >> > >> >
>> >>>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type(
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type,
>> >>>> >> > >> > +    uint64_t dp_key, const char *port_type);
>> >>>> >> > >> > +
>> >>>> >> > >> >  const struct sbrec_datapath_binding *datapath_lookup_by_key(
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
>> >>>> >> > >> >
>> >>>> >> > >> > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
>> >>>> >> > >> > index 4e9a5865f..5aab9142f 100644
>> >>>> >> > >> > --- a/ovn/controller/ovn-controller.c
>> >>>> >> > >> > +++ b/ovn/controller/ovn-controller.c
>> >>>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>> >>>> >> > >> >       * ports that have a Gateway_Chassis that point's to our own
>> >>>> >> > >> >       * chassis */
>> >>>> >> > >> >      sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
>> >>>> >> > >> > +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
>> >>>> >> > >> >      if (chassis) {
>> >>>> >> > >> >          /* This should be mostly redundant with the other clauses for port
>> >>>> >> > >> >           * bindings, but it allows us to catch any ports that are assigned to
>> >>>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[])
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_port_binding_by_datapath
>> >>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >>>> >> > >> >                                    &sbrec_port_binding_col_datapath);
>> >>>> >> > >> > +    struct ovsdb_idl_index *sbrec_port_binding_by_type
>> >>>> >> > >> > +        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >>>> >> > >> > +                                  &sbrec_port_binding_col_type);
>> >>>> >> > >>
>> >>>> >> > >> This index is used with two columns: datapath_binding and type, so it
>> >>>> >> > >> should be created with both columns using create2.
>> >>>> >> > >>
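To illustrate the indexing comment above, here is a minimal, self-contained
sketch. It does not use the real OVSDB IDL API; the struct, the sample rows and
both lookup helpers are hypothetical stand-ins. It only shows that a lookup
keyed on 'type' alone ignores the datapath, while a lookup keyed on
(datapath, type) does not. Which of several matching rows a real one-column
index would return is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Port_Binding row. */
struct toy_port_binding {
    int datapath_key;   /* Stand-in for the datapath reference. */
    const char *type;   /* "localnet", "external", "" (VIF), ... */
    const char *name;
};

/* Two logical switches, each with its own localnet port. */
static const struct toy_port_binding toy_rows[] = {
    { 1, "localnet", "ln-a" },
    { 2, "localnet", "ln-b" },
};

/* Compares only 'type', as a one-column index would. */
static const struct toy_port_binding *
find_by_type(const struct toy_port_binding *rows, size_t n, const char *type)
{
    assert(type != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(rows[i].type, type)) {
            return &rows[i];    /* The datapath is never looked at. */
        }
    }
    return NULL;
}

/* Compares both 'datapath' and 'type', as a two-column index would. */
static const struct toy_port_binding *
find_by_datapath_type(const struct toy_port_binding *rows, size_t n,
                      int datapath_key, const char *type)
{
    assert(type != NULL);
    for (size_t i = 0; i < n; i++) {
        if (rows[i].datapath_key == datapath_key
            && !strcmp(rows[i].type, type)) {
            return &rows[i];
        }
    }
    return NULL;
}
```

With these sample rows, asking for the localnet port of datapath 2 through
find_by_type() yields "ln-a", which belongs to datapath 1, whereas
find_by_datapath_type() yields "ln-b".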
>> >>>> >> > >> >      struct ovsdb_idl_index *sbrec_datapath_binding_by_key
>> >>>> >> > >> >          = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>> >>>> >> > >> >                                    &sbrec_datapath_binding_col_tunnel_key);
>> >>>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[])
>> >>>> >> > >> >                              sbrec_chassis_by_name,
>> >>>> >> > >> >                              sbrec_multicast_group_by_name_datapath,
>> >>>> >> > >> >                              sbrec_port_binding_by_name,
>> >>>> >> > >> > +                            sbrec_port_binding_by_type,
>> >>>> >> > >> > +                            sbrec_datapath_binding_by_key,
>> >>>> >> > >> >                              sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
>> >>>> >> > >> >                              sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
>> >>>> >> > >> >                              sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
>> >>>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
>> >>>> >> > >> > index aa03919bb..a9d4b8736 100644
>> >>>> >> > >> > --- a/ovn/lib/ovn-util.c
>> >>>> >> > >> > +++ b/ovn/lib/ovn-util.c
>> >>>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = {
>> >>>> >> > >> >      "localport",
>> >>>> >> > >> >      "router",
>> >>>> >> > >> >      "vtep",
>> >>>> >> > >> > +    "external",
>> >>>> >> > >> >  };
>> >>>> >> > >> >
>> >>>> >> > >> >  bool
>> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>> >>>> >> > >> > index 392a5efc9..c8883d60d 100644
>> >>>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml
>> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml
>> >>>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router {
>> >>>> >> > >> >      <p>
>> >>>> >> > >> >        This table adds the DHCPv4 options to a DHCPv4 packet from the
>> >>>> >> > >> >        logical ports configured with IPv4 address(es) and DHCPv4 options,
>> >>>> >> > >> > -      and similarly for DHCPv6 options.
>> >>>> >> > >> > +      and similarly for DHCPv6 options. This table also adds flows for the
>> >>>> >> > >> > +      logical ports of type <code>external</code>.
>> >>>> >> > >> >      </p>
>> >>>> >> > >> >
>> >>>> >> > >> >      <ul>
>> >>>> >> > >> > @@ -827,7 +828,39 @@ output;
>> >>>> >> > >> >        </li>
>> >>>> >> > >> >      </ul>
>> >>>> >> > >> >
>> >>>> >> > >> > -    <h3>Ingress Table 16 Destination Lookup</h3>
>> >>>> >> > >> > +    <h3>Ingress table 16 External ports</h3>
>> >>>> >> > >> > +
>> >>>> >> > >> > +    <p>
>> >>>> >> > >> > +      Traffic from the <code>external</code> logical ports enter the ingress
>> >>>> >> > >> > +      datapath pipeline via the <code>localnet</code> port. This table adds the
>> >>>> >> > >> > +      below logical flows to handle the traffic from these ports.
>> >>>> >> > >> > +    </p>
>> >>>> >> > >> > +
>> >>>> >> > >> > +    <ul>
>> >>>> >> > >> > +      <li>
>> >>>> >> > >> > +        <p>
>> >>>> >> > >> > +          A priority-100 flow is added for each <code>external</code> logical
>> >>>> >> > >> > +          port which doesn't reside on a chassis to drop the ARP/IPv6 NS
>> >>>> >> > >> > +          request to the router IP(s) (of the logical switch) which matches
>> >>>> >> > >> > +          on the <code>inport</code> of the <code>external</code> logical port
>> >>>> >> > >> > +          and the valid <code>eth.src</code> address(es) of the
>> >>>> >> > >> > +          <code>external</code> logical port.
>> >>>> >> > >> > +        </p>
>> >>>> >> > >> > +
>> >>>> >> > >> > +        <p>
>> >>>> >> > >> > +          This flow guarantees that the ARP/NS request to the router IP
>> >>>> >> > >> > +          address from the external ports is responded by only the chassis
>> >>>> >> > >> > +          which has claimed these external ports. All the other chassis,
>> >>>> >> > >> > +          drops these packets.
>> >>>> >> > >> > +        </p>
>> >>>> >> > >> > +      </li>
>> >>>> >> > >> > +
>> >>>> >> > >> > +      <li>
>> >>>> >> > >> > +        A priority-0 flow that matches all packets to advances to table 17.
>> >>>> >> > >> > +      </li>
>> >>>> >> > >> > +    </ul>
>> >>>> >> > >> > +
>> >>>> >> > >> > +    <h3>Ingress Table 17 Destination Lookup</h3>
>> >>>> >> > >> >
>> >>>> >> > >> >      <p>
>> >>>> >> > >> >        This table implements switching behavior.  It contains these logical
>> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> >>>> >> > >> > index 3fd8a8757..87208c6c1 100644
>> >>>> >> > >> > --- a/ovn/northd/ovn-northd.c
>> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.c
>> >>>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage {
>> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
>> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
>> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
>> >>>> >> > >> > -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
>> >>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
>> >>>> >> > >> > +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
>> >>>> >> > >> >                                                                            \
>> >>>> >> > >> >      /* Logical switch egress stages. */                                   \
>> >>>> >> > >> >      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
>> >>>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp)
>> >>>> >> > >> >      return !lsp->up || *lsp->up;
>> >>>> >> > >> >  }
>> >>>> >> > >> >
>> >>>> >> > >> > +static bool
>> >>>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
>> >>>> >> > >> > +{
>> >>>> >> > >> > +    return !strcmp(nbsp->type, "external");
>> >>>> >> > >> > +}
>> >>>> >> > >> > +
>> >>>> >> > >> >  static bool
>> >>>> >> > >> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
>> >>>> >> > >> >                      struct ds *options_action, struct ds *response_action,
>> >>>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >>>> >> > >> >           *  - port type is localport
>> >>>> >> > >> >           */
>> >>>> >> > >> >          if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
>> >>>> >> > >> > -            strcmp(op->nbsp->type, "localport")) {
>> >>>> >> > >> > +            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
>> >>>> >> > >>
>> >>>> >> > >> Sorry that I missed this in the last review. The && condition has a
>> >>>> >> > >> problem: it causes ARP responder flows to be added for all lports
>> >>>> >> > >> that are not external. I think it should be || here.
>> >>>> >> > >
>> >>>> >> > >
>> >>>> >> > > Agreed. To make it easier to read, I will add a new "if" with a continue, below
>> >>>> >> > > this one, for external port types.
>> >>>> >> > >
>> >>>> >> > >
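The difference between the posted condition and the proposed fix can be
sketched as follows. This is a standalone illustration, not the northd code:
the function names are hypothetical, "skip" stands for the `continue` in
build_lswitch_flows(), and the second function assumes the intent described in
the review (skip ports that are down, and separately skip external ports).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool
type_is(const char *type, const char *wanted)
{
    assert(type != NULL && wanted != NULL);
    return !strcmp(type, wanted);
}

/* Condition as posted in v5: every clause is joined with &&, so a port is
 * skipped only if it is down AND of type "external". */
static bool
skip_arp_responder_v5(bool up, const char *type)
{
    return !up && !type_is(type, "router") && !type_is(type, "localport")
           && type_is(type, "external");
}

/* Sketch of the agreed fix: keep the original "port is down" check and add
 * a separate check that skips external ports. */
static bool
skip_arp_responder_fixed(bool up, const char *type)
{
    if (!up && !type_is(type, "router") && !type_is(type, "localport")) {
        return true;
    }
    if (type_is(type, "external")) {
        return true;
    }
    return false;
}
```

For a regular VIF (type "") that is down, the v5 condition does not skip the
port, so ARP responder flows would be added for it; the second form skips it.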
>> >>>> >> > >>
>> >>>> >> > >>
>> >>>> >> > >> >              continue;
>> >>>> >> > >> >          }
>> >>>> >> > >> >
>> >>>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >>>> >> > >> >              continue;
>> >>>> >> > >> >          }
>> >>>> >> > >> >
>> >>>> >> > >> > +        bool is_external = lsp_is_external(op->nbsp);
>> >>>> >> > >> > +        if (is_external && !op->od->localnet_port) {
>> >>>> >> > >> > +            /* If it's an external port and there is no localnet port
>> >>>> >> > >> > +             * ignore it. */
>> >>>> >> > >> > +            continue;
>> >>>> >> > >> > +        }
>> >>>> >> > >> > +
>> >>>> >> > >> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >>>> >> > >> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>> >>>> >> > >> >                  struct ds options_action = DS_EMPTY_INITIALIZER;
>> >>>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >>>> >> > >> >                      ds_put_format(
>> >>>> >> > >> >                          &match, "inport == %s && eth.src == %s && "
>> >>>> >> > >> >                          "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>> >>>> >> > >> > -                        "udp.src == 68 && udp.dst == 67", op->json_key,
>> >>>> >> > >> > -                        op->lsp_addrs[i].ea_s);
>> >>>> >> > >> > +                        "udp.src == 68 && udp.dst == 67",
>> >>>> >> > >> > +                        op->json_key, op->lsp_addrs[i].ea_s);
>> >>>> >> > >>
>> >>>> >> > >> No change here?
>> >>>> >> > >
>> >>>> >> > >
>> >>>> >> > > I think it's an unwanted and unrelated change. I will correct it.
>> >>>> >> > >>
>> >>>> >> > >> >
>> >>>> >> > >> >                      ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
>> >>>> >> > >> >                                    100, ds_cstr(&match),
>> >>>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >>>> >> > >> >      /* Ingress table 12 and 13: DHCP options and response, by default goto
>> >>>> >> > >> >       * next. (priority 0).
>> >>>> >> > >> >       * Ingress table 14 and 15: DNS lookup and response, by default goto next.
>> >>>> >> > >> > -     * (priority 0).*/
>> >>>> >> > >> > +     * (priority 0).
>> >>>> >> > >> > +     * Ingress table 16 - External port handling, by default goto next.
>> >>>> >> > >> > +     * (priority 0). */
>> >>>> >> > >> >
>> >>>> >> > >> >      HMAP_FOR_EACH (od, key_node, datapaths) {
>> >>>> >> > >> >          if (!od->nbs) {
>> >>>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>> >>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>> >>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
>> >>>> >> > >> >          ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
>> >>>> >> > >> > +        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
>> >>>> >> > >> >      }
>> >>>> >> > >> >
>> >>>> >> > >> > -    /* Ingress table 16: Destination lookup, broadcast and multicast handling
>> >>>> >> > >> > +    HMAP_FOR_EACH (op, key_node, ports) {
>> >>>> >> > >> > +        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
>> >>>> >> > >> > +           continue;
>> >>>> >> > >> > +        }
>> >>>> >> > >> > +
>> >>>> >> > >> > +        /* Table 16: External port. Drop ARP request for router ips from
>> >>>> >> > >> > +         * external ports  on chassis not binding those ports.
>> >>>> >> > >> > +         * This makes the router pipeline to be run only on the chassis
>> >>>> >> > >> > +         * binding the external ports. */
>> >>>> >> > >> > +
>> >>>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >>>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
>> >>>> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>> >>>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>> >>>> >> > >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
>> >>>> >> > >> > +                         l++) {
>> >>>> >> > >> > +                        ds_clear(&match);
>> >>>> >> > >> > +                        ds_put_cstr(&match, "ip4");
>> >>>> >> > >> > +                        ds_put_format(
>> >>>> >> > >> > +                            &match, "inport == %s && eth.src == %s"
>> >>>> >> > >> > +                            " && !is_chassis_resident(%s)"
>> >>>> >> > >> > +                            " && arp.tpa == %s && arp.op == 1",
>> >>>> >> > >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>> >>>> >> > >>
>> >>>> >> > >> I believe the inport should match the localnet port's json_key here,
>> >>>> >> > >> since it is coming from a localnet port.
>> >>>> >> > >
>> >>>> >> > >
>> >>>> >> > > Both would work. If you see the code in lflow.c in this patch - it will get the tunnel
>> >>>> >> > > key of the localnet port if the port_binding type is "external".
>> >>>> >> > >
>> >>>> >> > > That's how even the DHCP requests are handled. ovn-controller will translate
>> >>>> >> > > the logical flows with action "put_dhcp_opts" only on the chassis claiming the
>> >>>> >> > > external ports.
>> >>>> >> >
>> >>>> >> > Oh, yes, you are right. Actually I read that part in v4 and it somehow
>> >>>> >> > slipped my mind. Thanks for explaining.
>> >>>> >>
>> >>>> >> I thought about it a second time, and I'd suggest doing the conversion here
>> >>>> >> in northd instead of ovn-controller, for two reasons:
>> >>>> >>
>> >>>> >> 1. In ovn-controller there is no extra context, so it just blindly
>> >>>> >> translates all references to the external logical port into the localnet
>> >>>> >> port key. This could lead to unexpected behavior, for example if someone
>> >>>> >> uses an external logical port in an ACL match condition. The match condition
>> >>>> >> would then apply to all packets to/from the localnet port, which is
>> >>>> >> definitely unwanted. (At the same time, it would be better to document
>> >>>> >> that features like port-security and ACLs should not be used for external
>> >>>> >> logical ports.)
>> >>>> >>
>> >>>> >
>> >>>> > That's not how it works in the present patch. Let's say you have 2 chassis,
>> >>>> > hv1 and hv2, an external port sw0-ext1 and a localnet port "ln-public".
>> >>>> > If the requested-chassis is set to hv1, then all the logical flows with the
>> >>>> > match "inport == sw0-ext1" will be converted to OF flows only on hv1, as this port
>> >>>> > is bound by hv1 and the function 'lookup_port_cb()' would return true only
>> >>>> > on hv1. On hv2, lookup_port_cb() would return false.
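The per-chassis behaviour described here can be sketched in isolation. The
following is a simplified model of the lookup_port_cb() change in this patch,
not the function itself: the struct and function names are hypothetical, the
localnet port is passed in directly instead of being found through an index,
and only a single chassis name is compared (the patch also accepts the
chassis hostname).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a Port_Binding row. */
struct toy_port {
    const char *type;               /* "external", "localnet", "" ... */
    const char *requested_chassis;  /* NULL if the option is unset. */
    unsigned int tunnel_key;
};

/* Resolves the key to use for 'port' on the chassis named 'local_chassis'.
 * 'localnet' is the switch's localnet port, or NULL if there is none. */
static bool
toy_lookup_port(const struct toy_port *port, const struct toy_port *localnet,
                const char *local_chassis, unsigned int *portp)
{
    assert(port != NULL && local_chassis != NULL && portp != NULL);

    if (strcmp(port->type, "external")) {
        /* Non-external ports resolve to their own tunnel key. */
        *portp = port->tunnel_key;
        return true;
    }

    /* External ports resolve to the localnet port's key, and only on the
     * chassis named in requested-chassis. */
    if (port->requested_chassis
        && !strcmp(port->requested_chassis, local_chassis)
        && localnet) {
        *portp = localnet->tunnel_key;
        return true;
    }
    return false;
}
```

With sw0-ext1 requested on hv1, the lookup succeeds on hv1 and yields the
localnet port's key; on hv2 it fails, so no OF flow is generated there.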
>> >>>>
>> >>>> Yes, this is well understood.
>> >>>>
>> >>>> >
>> >>>> > If we want to do the conversion in ovn-northd.c, the match condition would have to
>> >>>> > be "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ..."
>> >>>> > instead of the present one, "inport == sw0-ext1 && ...".
>> >>>>
>> >>>> Yes, this is what I would suggest (see reason below).
>> >>>>
>> >>>> >
>> >>>> > And the ACL match condition would not be an issue for the reason mentioned
>> >>>> > above, i.e. the ACL flows will be applied only on the chassis binding the external
>> >>>> > port.
>> >>>>
>> >>>> Here is the concern. For example, chassis A has a regular port sw0-lsp1
>> >>>> bound. Chassis A is also set as the requested-chassis for external port
>> >>>> sw0-ext1. Now the user adds an ACL: to-port, 1001, 'outport ==
>> >>>> "sw0-ext1"', drop. This would get translated to something like:
>> >>>> to-port, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
>> >>>> the traffic from sw0-lsp1 to the bridged network? Even if it has no
>> >>>> impact because of some subtle detail of the current implementation, I
>> >>>> would say it is risky and could lead to problems under certain
>> >>>> conditions, because the conversion in ovn-controller widens the
>> >>>> original intent. Doing it in northd only for specific lflows, by contrast,
>> >>>> would ensure it has an impact only on the intended use cases.
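As a rough illustration of the northd-side alternative discussed above, the
match could name the localnet port explicitly and guard it with
is_chassis_resident() on the external port. The helper below is a hypothetical
standalone sketch using snprintf(); it is not ovn-northd code (which builds
matches with 'struct ds'), and the exact match wording in a later revision may
differ. The port names come from the example in this thread; the MAC address
is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds a match that refers to the localnet port directly and is limited
 * to the chassis where 'external_port' is resident.
 * Returns 0 on success, -1 if 'buf' is too small. */
static int
build_external_match(char *buf, size_t size, const char *localnet_port,
                     const char *external_port, const char *eth_src)
{
    assert(buf != NULL && size > 0);
    assert(strlen(localnet_port) > 0 && strlen(external_port) > 0);

    int n = snprintf(buf, size,
                     "inport == \"%s\" && is_chassis_resident(\"%s\")"
                     " && eth.src == %s",
                     localnet_port, external_port, eth_src);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

Called with "ln-public", "sw0-ext1" and "50:54:00:00:00:01", this produces
inport == "ln-public" && is_chassis_resident("sw0-ext1") && eth.src == 50:54:00:00:00:01.
Because only the flows that northd builds this way mention the localnet port,
a user ACL on "sw0-ext1" is not silently rewritten to match the localnet port.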
>> >>>
>> >>>
>> >>>
>> >>> Thanks for the detailed explanation. I agree. It's clear to me now. I will update accordingly in v6.
>> >>>
>> >>> Regards
>> >>> Numan
>> >>>
>> >>>>
>> >>>> >
>> >>>> > The test case added checks that the OF flows are applied only on the bound chassis.
>> >>>> >
>> >>>> > I think it is better to do it in ovn-controller instead of ovn-northd. Please let me know
>> >>>> > if you still have any concerns.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >> 2. A less important reason is that it is better to do it at an earlier stage
>> >>>> >> than a later one. northd handles common processing. This part of the logic is
>> >>>> >> common to all chassis, so it would be better if we explicitly
>> >>>> >> handled it in northd, instead of letting every chassis process it. And the
>> >>>> >> change in northd would likely be simpler than in ovn-controller.
>> >>>>
>> >>>> This is a less critical problem, but I think it is worth considering,
>> >>>> too. With the current logic, although the conversion takes effect
>> >>>> only if "is_chassis_resident()" is true, the code logic and
>> >>>> processing have to happen on every chassis.
>> >>>>
>> >>>> >>
>> >>>> >> Thanks,
>> >>>> >> Han
>> >
>> >
>> >
>> > --
>> > Miguel Ángel Ajo
>> > OSP / Networking DFG, OVN Squad Engineering

Patch

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index 021ecddcf..64e605b92 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -471,6 +471,18 @@  consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
          * for them. */
         sset_add(local_lports, binding_rec->logical_port);
         our_chassis = false;
+    } else if (!strcmp(binding_rec->type, "external")) {
+        const char *chassis_id = smap_get(&binding_rec->options,
+                                          "requested-chassis");
+        our_chassis = chassis_id && (
+            !strcmp(chassis_id, chassis_rec->name) ||
+            !strcmp(chassis_id, chassis_rec->hostname));
+        if (our_chassis) {
+            add_local_datapath(sbrec_datapath_binding_by_key,
+                               sbrec_port_binding_by_datapath,
+                               sbrec_port_binding_by_name,
+                               binding_rec->datapath, true, local_datapaths);
+        }
     }
 
     if (our_chassis
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 8db81927e..98e8ed3b9 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -52,7 +52,10 @@  lflow_init(void)
 struct lookup_port_aux {
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
     struct ovsdb_idl_index *sbrec_port_binding_by_name;
+    struct ovsdb_idl_index *sbrec_port_binding_by_type;
+    struct ovsdb_idl_index *sbrec_datapath_binding_by_key;
     const struct sbrec_datapath_binding *dp;
+    const struct sbrec_chassis *chassis;
 };
 
 struct condition_aux {
@@ -66,6 +69,8 @@  static void consider_logical_flow(
     struct ovsdb_idl_index *sbrec_chassis_by_name,
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
     struct ovsdb_idl_index *sbrec_port_binding_by_name,
+    struct ovsdb_idl_index *sbrec_port_binding_by_type,
+    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
     const struct sbrec_logical_flow *,
     const struct hmap *local_datapaths,
     const struct sbrec_chassis *,
@@ -89,8 +94,24 @@  lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
     const struct sbrec_port_binding *pb
         = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name);
     if (pb && pb->datapath == aux->dp) {
-        *portp = pb->tunnel_key;
-        return true;
+        if (strcmp(pb->type, "external")) {
+            *portp = pb->tunnel_key;
+            return true;
+        }
+        const char *chassis_id = smap_get(&pb->options,
+                                          "requested-chassis");
+        if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) ||
+                           !strcmp(chassis_id, aux->chassis->hostname))) {
+            const struct sbrec_port_binding *localnet_pb
+                = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key,
+                                       aux->sbrec_port_binding_by_type,
+                                       aux->dp->tunnel_key, "localnet");
+            if (localnet_pb) {
+                *portp = localnet_pb->tunnel_key;
+                return true;
+            }
+        }
+        return false;
     }
 
     const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name(
@@ -144,6 +165,8 @@  add_logical_flows(
     struct ovsdb_idl_index *sbrec_chassis_by_name,
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
     struct ovsdb_idl_index *sbrec_port_binding_by_name,
+    struct ovsdb_idl_index *sbrec_port_binding_by_type,
+    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
     const struct sbrec_dhcp_options_table *dhcp_options_table,
     const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
     const struct sbrec_logical_flow_table *logical_flow_table,
@@ -183,6 +206,8 @@  add_logical_flows(
         consider_logical_flow(sbrec_chassis_by_name,
                               sbrec_multicast_group_by_name_datapath,
                               sbrec_port_binding_by_name,
+                              sbrec_port_binding_by_type,
+                              sbrec_datapath_binding_by_key,
                               lflow, local_datapaths,
                               chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts,
                               addr_sets, port_groups, active_tunnels,
@@ -200,6 +225,8 @@  consider_logical_flow(
     struct ovsdb_idl_index *sbrec_chassis_by_name,
     struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
     struct ovsdb_idl_index *sbrec_port_binding_by_name,
+    struct ovsdb_idl_index *sbrec_port_binding_by_type,
+    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
     const struct sbrec_logical_flow *lflow,
     const struct hmap *local_datapaths,
     const struct sbrec_chassis *chassis,
@@ -292,7 +319,10 @@  consider_logical_flow(
         .sbrec_multicast_group_by_name_datapath
             = sbrec_multicast_group_by_name_datapath,
         .sbrec_port_binding_by_name = sbrec_port_binding_by_name,
-        .dp = lflow->logical_datapath
+        .sbrec_port_binding_by_type = sbrec_port_binding_by_type,
+        .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key,
+        .dp = lflow->logical_datapath,
+        .chassis = chassis
     };
     struct condition_aux cond_aux = {
         .sbrec_chassis_by_name = sbrec_chassis_by_name,
@@ -463,6 +493,8 @@  void
 lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
           struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
           struct ovsdb_idl_index *sbrec_port_binding_by_name,
+          struct ovsdb_idl_index *sbrec_port_binding_by_type,
+          struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
           const struct sbrec_dhcp_options_table *dhcp_options_table,
           const struct sbrec_dhcpv6_options_table *dhcpv6_options_table,
           const struct sbrec_logical_flow_table *logical_flow_table,
@@ -481,7 +513,8 @@  lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
 
     add_logical_flows(sbrec_chassis_by_name,
                       sbrec_multicast_group_by_name_datapath,
-                      sbrec_port_binding_by_name, dhcp_options_table,
+                      sbrec_port_binding_by_name, sbrec_port_binding_by_type,
+                      sbrec_datapath_binding_by_key, dhcp_options_table,
                       dhcpv6_options_table, logical_flow_table,
                       local_datapaths, chassis, addr_sets, port_groups,
                       active_tunnels, local_lport_ids, flow_table, group_table,
diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index d19338140..b2911e0eb 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -68,6 +68,8 @@  void lflow_init(void);
 void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name,
                struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
                struct ovsdb_idl_index *sbrec_port_binding_by_name,
+               struct ovsdb_idl_index *sbrec_port_binding_by_type,
+               struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
                const struct sbrec_dhcp_options_table *,
                const struct sbrec_dhcpv6_options_table *,
                const struct sbrec_logical_flow_table *,
diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
index cc5c5fbb2..9c827d9b0 100644
--- a/ovn/controller/lport.c
+++ b/ovn/controller/lport.c
@@ -64,6 +64,32 @@  lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
     return retval;
 }
 
+const struct sbrec_port_binding *
+lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
+                     struct ovsdb_idl_index *sbrec_port_binding_by_type,
+                     uint64_t dp_key, const char *port_type)
+{
+    /* Lookup datapath corresponding to dp_key. */
+    const struct sbrec_datapath_binding *db = datapath_lookup_by_key(
+        sbrec_datapath_binding_by_key, dp_key);
+    if (!db) {
+        return NULL;
+    }
+
+    /* Build key for an indexed lookup. */
+    struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row(
+            sbrec_port_binding_by_type);
+    sbrec_port_binding_index_set_datapath(pb, db);
+    sbrec_port_binding_index_set_type(pb, port_type);
+
+    const struct sbrec_port_binding *retval = sbrec_port_binding_index_find(
+            sbrec_port_binding_by_type, pb);
+
+    sbrec_port_binding_index_destroy_row(pb);
+
+    return retval;
+}
+
 const struct sbrec_datapath_binding *
 datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
                        uint64_t dp_key)
diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h
index 7dcd5bee0..2d49792f6 100644
--- a/ovn/controller/lport.h
+++ b/ovn/controller/lport.h
@@ -42,6 +42,11 @@  const struct sbrec_port_binding *lport_lookup_by_key(
     struct ovsdb_idl_index *sbrec_port_binding_by_key,
     uint64_t dp_key, uint64_t port_key);
 
+const struct sbrec_port_binding *lport_lookup_by_type(
+    struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
+    struct ovsdb_idl_index *sbrec_port_binding_by_type,
+    uint64_t dp_key, const char *port_type);
+
 const struct sbrec_datapath_binding *datapath_lookup_by_key(
     struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key);
 
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 4e9a5865f..5aab9142f 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -145,6 +145,7 @@  update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
      * ports that have a Gateway_Chassis that point's to our own
      * chassis */
     sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect");
+    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external");
     if (chassis) {
         /* This should be mostly redundant with the other clauses for port
          * bindings, but it allows us to catch any ports that are assigned to
@@ -616,6 +617,9 @@  main(int argc, char *argv[])
     struct ovsdb_idl_index *sbrec_port_binding_by_datapath
         = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
                                   &sbrec_port_binding_col_datapath);
+    struct ovsdb_idl_index *sbrec_port_binding_by_type
+        = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
+                                  &sbrec_port_binding_col_type);
     struct ovsdb_idl_index *sbrec_datapath_binding_by_key
         = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
                                   &sbrec_datapath_binding_col_tunnel_key);
@@ -743,6 +747,8 @@  main(int argc, char *argv[])
                             sbrec_chassis_by_name,
                             sbrec_multicast_group_by_name_datapath,
                             sbrec_port_binding_by_name,
+                            sbrec_port_binding_by_type,
+                            sbrec_datapath_binding_by_key,
                             sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl),
                             sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl),
                             sbrec_logical_flow_table_get(ovnsb_idl_loop.idl),
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
index aa03919bb..a9d4b8736 100644
--- a/ovn/lib/ovn-util.c
+++ b/ovn/lib/ovn-util.c
@@ -319,6 +319,7 @@  static const char *OVN_NB_LSP_TYPES[] = {
     "localport",
     "router",
     "vtep",
+    "external",
 };
 
 bool
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 392a5efc9..c8883d60d 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -626,7 +626,8 @@  nd_na_router {
     <p>
       This table adds the DHCPv4 options to a DHCPv4 packet from the
       logical ports configured with IPv4 address(es) and DHCPv4 options,
-      and similarly for DHCPv6 options.
+      and similarly for DHCPv6 options. This table also adds flows for the
+      logical ports of type <code>external</code>.
     </p>
 
     <ul>
@@ -827,7 +828,39 @@  output;
       </li>
     </ul>
 
-    <h3>Ingress Table 16 Destination Lookup</h3>
+    <h3>Ingress Table 16 External ports</h3>
+
+    <p>
+      Traffic from the <code>external</code> logical ports enters the ingress
+      datapath pipeline via the <code>localnet</code> port. This table adds the
+      logical flows below to handle the traffic from these ports.
+    </p>
+
+    <ul>
+      <li>
+        <p>
+          A priority-100 flow is added for each <code>external</code> logical
+          port. On chassis where the port is not resident, it drops ARP/IPv6
+          NS requests for the router IP(s) of the logical switch, matching
+          on the <code>inport</code> of the <code>external</code> logical port
+          and the valid <code>eth.src</code> address(es) of the
+          <code>external</code> logical port.
+        </p>
+
+        <p>
+          This flow guarantees that ARP/NS requests for the router IP
+          address from the external ports are answered only by the chassis
+          which has claimed these external ports. All other chassis
+          drop these packets.
+        </p>
+      </li>
+
+      <li>
+        A priority-0 flow that matches all packets to advance to table 17.
+      </li>
+    </ul>
+
+    <h3>Ingress Table 17 Destination Lookup</h3>
 
     <p>
       This table implements switching behavior.  It contains these logical
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 3fd8a8757..87208c6c1 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -119,7 +119,8 @@  enum ovn_stage {
     PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response") \
     PIPELINE_STAGE(SWITCH, IN,  DNS_LOOKUP,    14, "ls_in_dns_lookup")    \
     PIPELINE_STAGE(SWITCH, IN,  DNS_RESPONSE,  15, "ls_in_dns_response")  \
-    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       16, "ls_in_l2_lkup")       \
+    PIPELINE_STAGE(SWITCH, IN,  EXTERNAL_PORT, 16, "ls_in_external_port") \
+    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       17, "ls_in_l2_lkup")       \
                                                                           \
     /* Logical switch egress stages. */                                   \
     PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")         \
@@ -2942,6 +2943,12 @@  lsp_is_up(const struct nbrec_logical_switch_port *lsp)
     return !lsp->up || *lsp->up;
 }
 
+static bool
+lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
+{
+    return !strcmp(nbsp->type, "external");
+}
+
 static bool
 build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
                     struct ds *options_action, struct ds *response_action,
@@ -4185,7 +4192,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
          *  - port type is localport
          */
         if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") &&
-            strcmp(op->nbsp->type, "localport")) {
+            strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) {
             continue;
         }
 
@@ -4297,6 +4304,13 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
             continue;
         }
 
+        bool is_external = lsp_is_external(op->nbsp);
+        if (is_external && !op->od->localnet_port) {
+            /* If it's an external port and there is no localnet port,
+             * ignore it. */
+            continue;
+        }
+
         for (size_t i = 0; i < op->n_lsp_addrs; i++) {
             for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
                 struct ds options_action = DS_EMPTY_INITIALIZER;
@@ -4309,8 +4323,8 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
                     ds_put_format(
                         &match, "inport == %s && eth.src == %s && "
                         "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
-                        "udp.src == 68 && udp.dst == 67", op->json_key,
-                        op->lsp_addrs[i].ea_s);
+                        "udp.src == 68 && udp.dst == 67",
+                        op->json_key, op->lsp_addrs[i].ea_s);
 
                     ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS,
                                   100, ds_cstr(&match),
@@ -4415,7 +4429,9 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
     /* Ingress table 12 and 13: DHCP options and response, by default goto
      * next. (priority 0).
      * Ingress table 14 and 15: DNS lookup and response, by default goto next.
-     * (priority 0).*/
+     * (priority 0).
+     * Ingress table 16 - External port handling, by default goto next.
+     * (priority 0). */
 
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (!od->nbs) {
@@ -4426,9 +4442,58 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
         ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
         ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;");
         ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;");
+        ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;");
     }
 
-    /* Ingress table 16: Destination lookup, broadcast and multicast handling
+    HMAP_FOR_EACH (op, key_node, ports) {
+        if (!op->nbsp || !lsp_is_external(op->nbsp)) {
+            continue;
+        }
+
+        /* Table 16: External port. Drop ARP requests for router IPs from
+         * external ports on chassis not binding those ports.
+         * This ensures that the router pipeline runs only on the chassis
+         * binding the external ports. */
+
+        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
+            for (size_t j = 0; j < op->od->n_router_ports; j++) {
+                struct ovn_port *rp = op->od->router_ports[j];
+                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
+                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
+                         l++) {
+                        ds_clear(&match);
+                        /* Match ARP requests for the router IP. */
+                        ds_put_format(
+                            &match, "inport == %s && eth.src == %s"
+                            " && !is_chassis_resident(%s)"
+                            " && arp.tpa == %s && arp.op == 1",
+                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
+                            rp->lsp_addrs[k].ipv4_addrs[l].addr_s);
+                        ovn_lflow_add(lflows, op->od,
+                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
+                                      ds_cstr(&match), "drop;");
+                    }
+                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv6_addrs;
+                         l++) {
+                        ds_clear(&match);
+                        ds_put_format(
+                            &match, "inport == %s && eth.src == %s"
+                            " && !is_chassis_resident(%s)"
+                            " && nd_ns && ip6.dst == {%s, %s} && "
+                            "nd.target == %s",
+                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
+                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s,
+                            rp->lsp_addrs[k].ipv6_addrs[l].sn_addr_s,
+                            rp->lsp_addrs[k].ipv6_addrs[l].addr_s);
+                        ovn_lflow_add(lflows, op->od,
+                                      S_SWITCH_IN_EXTERNAL_PORT, 100,
+                                      ds_cstr(&match), "drop;");
+                    }
+                }
+            }
+        }
+    }
+    /* Ingress table 17: Destination lookup, broadcast and multicast handling
      * (priority 100). */
     HMAP_FOR_EACH (op, key_node, ports) {
         if (!op->nbsp) {
@@ -4448,9 +4513,9 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
                       "outport = \""MC_FLOOD"\"; output;");
     }
 
-    /* Ingress table 16: Destination lookup, unicast handling (priority 50), */
+    /* Ingress table 17: Destination lookup, unicast handling (priority 50), */
     HMAP_FOR_EACH (op, key_node, ports) {
-        if (!op->nbsp) {
+        if (!op->nbsp || lsp_is_external(op->nbsp)) {
             continue;
         }
 
@@ -4567,7 +4632,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
         }
     }
 
-    /* Ingress table 16: Destination lookup for unknown MACs (priority 0). */
+    /* Ingress table 17: Destination lookup for unknown MACs (priority 0). */
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (!od->nbs) {
             continue;
@@ -4602,7 +4667,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
      * Priority 150 rules drop packets to disabled logical ports, so that they
      * don't even receive multicast or broadcast packets. */
     HMAP_FOR_EACH (op, key_node, ports) {
-        if (!op->nbsp) {
+        if (!op->nbsp || lsp_is_external(op->nbsp)) {
             continue;
         }
 
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 3936e6016..405975b7b 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1678,6 +1678,84 @@ 
     </li>
   </ol>
 
+  <h2>Native OVN services for external logical ports</h2>
+
+  <p>
+    To provide OVN native services (like DHCP/IPv6 RA/DNS lookup) to
+    cloud resources which are external, OVN supports <code>external</code>
+    logical ports.
+  </p>
+
+  <p>
+    Below are some of the use cases where <code>external</code> ports can be
+    used.
+  </p>
+
+  <ul>
+    <li>
+      VMs connected to SR-IOV nics - Traffic from these VMs bypasses the
+      kernel stack, so the local <code>ovn-controller</code> does not bind
+      these ports and cannot serve the native services.
+    </li>
+    <li>
+      When CMS supports provisioning baremetal servers.
+    </li>
+  </ul>
+
+  <p>
+    OVN will provide the native services if the CMS has done the following
+    configuration in the <dfn>OVN Northbound Database</dfn>.
+  </p>
+
+  <ul>
+    <li>
+      A row is created in <code>Logical_Switch_Port</code>, configuring the
+      <ref column="addresses" table="Logical_Switch_Port" db="OVN_NB"/> column
+      and setting the <ref column="type" table="Logical_Switch_Port"
+      db="OVN_NB"/> to <code>external</code>.
+    </li>
+
+    <li>
+      The <ref column="options:requested-chassis" table="Logical_Switch_Port"
+      db="OVN_NB"/> column is set to the desired chassis.
+    </li>
+
+    <li>
+      The chassis on which this logical port is requested has the
+      <code>ovn-bridge-mappings</code> configured and has proper L2
+      connectivity so that it can receive the DHCP and other related request
+      packets from these external resources.
+    </li>
+
+    <li>
+      The Logical_Switch of this port has a <code>localnet</code> port.
+    </li>
+
+    <li>
+      Native OVN services are enabled by configuring the DHCP and other
+      options in the same way as for normal logical ports.
+    </li>
+  </ul>
+
+  <p>
+    OVN doesn't support HA for these <code>external</code> ports. In case
+    the <code>ovn-controller</code> running on the requested chassis goes down,
+    it is the responsibility of the CMS to reschedule these
+    <code>external</code> ports to another active chassis.
+  </p>
+
+  <p>
+    It is recommended to request the same chassis for all the external ports
+    of a logical switch. Otherwise, the physical switch might see MAC
+    flapping when different chassis provide the native services. For
+    example, with the native DHCPv4 service, the DHCPv4 server MAC (set in the
+    <ref column="options:server_mac" table="DHCP_Options" db="OVN_NB"/> column
+    in table <ref table="DHCP_Options"/>)
+    originating from different ports can cause MAC flapping. The MAC of the
+    logical router IP(s) can also flap if the same chassis is not requested for
+    all the external ports of a logical switch.
+  </p>
+
   <h1>Security</h1>
 
   <h2>Role-Based Access Controls for the Soutbound DB</h2>
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 6d6fb055a..fdf9adbfa 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -353,6 +353,53 @@ 
           <dd>
             A port to a logical switch on a VTEP gateway.
           </dd>
+
+          <dt><code>external</code></dt>
+          <dd>
+            <p>
+              Represents a logical port which is external and does not have
+              an OVS port in the integration bridge.
+              <code>OVN</code> will never receive any traffic from this port or
+              send any traffic to this port. <code>OVN</code> can support
+              native services like DHCPv4/DHCPv6/DNS for this port.
+              If <ref column="options:requested-chassis"/> is defined,
+              <code>ovn-controller</code> running in that chassis will bind
+              this port to provide these native services. It is expected that
+              this port belongs to a bridged logical switch
+              (with a <code>localnet</code> port).
+            </p>
+
+            <p>
+              It is recommended to request the same chassis for all the
+              external ports of a logical switch. Otherwise, the physical
+              switch might see MAC flapping when different chassis provide
+              the native services. For example, with the native DHCPv4
+              service, the DHCPv4 server MAC (configured in
+              <ref column="options:server_mac" table="DHCP_Options"
+              db="OVN_NB"/> column in table <ref table="DHCP_Options"/>)
+              originating from different ports can cause MAC flapping.
+              The MAC of the logical router IP(s) can also flap if the
+              same chassis is not requested for all the external ports
+              of a logical switch.
+            </p>
+
+            <p>
+              Below are some of the use cases where <code>external</code>
+              ports can be used.
+            </p>
+
+            <ul>
+              <li>
+                VMs connected to SR-IOV nics - Traffic from these VMs bypasses
+                the kernel stack, so the local <code>ovn-controller</code> does
+                not bind these ports and cannot serve the native services.
+              </li>
+
+              <li>
+                When CMS supports provisioning baremetal servers.
+              </li>
+            </ul>
+          </dd>
         </dl>
       </column>
     </group>
diff --git a/tests/ovn.at b/tests/ovn.at
index 8bada3241..94c774e8b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9594,9 +9594,9 @@  AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | gre
 sleep 3 # let BFD sessions settle so we get the right flows on the right chassis
 
 # make sure that flows for handling the outside router port reside on gw1
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # make sure ARP responder flows for outside router port reside on gw1 too
@@ -9686,9 +9686,9 @@  AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0],
 sleep 3  # let BFD sessions settle so we get the right flows on the right chassis
 
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # disconnect GW2 from the network, GW1 should take over
@@ -9700,9 +9700,9 @@  sleep 4
 bfd_dump
 
 # make sure that flows for handling the outside router port reside on gw2 now
-AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
+AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1
 ]])
-AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
+AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0
 ]])
 
 # check that the chassis redirect port has been reclaimed by the gw1 chassis
@@ -11619,6 +11619,524 @@  as hv2 start_daemon ovn-controller
 OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 
+AT_SETUP([ovn -- external logical port])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+net_add n1
+sim_add hv1
+sim_add hv2
+
+ovn-nbctl ls-add ls1
+ovn-nbctl lsp-add ls1 ls1-lp1 \
+-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4"
+
+# Add a couple of external logical ports
+ovn-nbctl lsp-add ls1 ls1-lp_ext1 \
+-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6"
+ovn-nbctl lsp-set-port-security ls1-lp_ext1 \
+"f0:00:00:00:00:03 10.0.0.6 ae70::6"
+ovn-nbctl lsp-set-type ls1-lp_ext1 external
+
+ovn-nbctl lsp-add ls1 ls1-lp_ext2 \
+-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7"
+ovn-nbctl lsp-set-port-security ls1-lp_ext2 \
+"f0:00:00:00:00:04 10.0.0.7 ae70::8"
+ovn-nbctl lsp-set-type ls1-lp_ext2 external
+
+d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
+options="\"server_id\"=\"10.0.0.1\" \"server_mac\"=\"ff:10:00:00:00:01\" \
+\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")"
+
+d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \
+options="\"server_id\"=\"00:00:00:10:00:01\"")"
+
+ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1}
+ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1}
+ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1}
+
+ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2}
+ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2}
+ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2}
+
+# Create a logical router and connect it to ls1
+ovn-nbctl lr-add lr0
+ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24
+ovn-nbctl lsp-add ls1 ls1-lr0
+ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \
+    options:router-port=lr0-ls1 addresses=router
+
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-phys hv1-ext1 -- \
+    set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \
+    options:rxq_pcap=hv1/ext1-rx.pcap \
+    ofport-request=2
+ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-phys hv2-ext2 -- \
+    set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \
+    options:rxq_pcap=hv2/ext2-rx.pcap \
+    ofport-request=2
+ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+
+ovn-sbctl dump-flows > lflows_n.txt
+
+# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and
+# hv2 as requested-chassis option is not set and no localnet port added to ls1.
+AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
+wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
+])
+
+hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}')
+
+# The port_binding row for ls1-lp_ext1 should have empty chassis
+chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
+grep -v requested | grep chassis | awk '{print $3}')
+
+AT_CHECK([test "$chassis" = "[[]]"], [0], [])
+
+# Set the requested-chassis option for ls1-lp_ext1
+ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
+
+# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
+# as no localnet port added to ls1 yet.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
+])
+
+# Add the localnet port to the logical switch ls1
+ovn-nbctl lsp-add ls1 ln-public
+ovn-nbctl lsp-set-addresses ln-public unknown
+ovn-nbctl lsp-set-type ln-public localnet
+ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys
+
+ln_public_key=$(ovn-sbctl list port_binding ln-public | grep  tunnel_key | \
+awk '{print $3}')
+
+# The ls1-lp_ext1 should be bound to hv1
+chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
+grep -v requested | grep chassis | awk '{print $3}')
+AT_CHECK([test "$chassis" = "$hv1_uuid"], [0], [])
+
+# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
+wc -l], [0], [3
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
+grep reg14=0x$ln_public_key | wc -l], [0], [1
+])
+
+# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
+])
+
+# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and
+# hv2 as requested-chassis option is not set.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.07" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.07" | wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0
+])
+
+as hv1
+ovs-vsctl show
+
+# This shell function sends a DHCP request packet
+# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP USE_IP ...
+test_dhcp() {
+    local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5
+    shift; shift; shift; shift; shift;
+    if test $use_ip != 0; then
+        src_ip=$1
+        dst_ip=$2
+        shift; shift;
+    else
+        src_ip=`ip_to_hex 0 0 0 0`
+        dst_ip=`ip_to_hex 255 255 255 255`
+    fi
+    local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip}
+    # udp header and dhcp header
+    request=${request}0044004300fc0000
+    request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac}
+    # client hardware padding
+    request=${request}00000000000000000000
+    # server hostname
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    # boot file name
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    # dhcp magic cookie
+    request=${request}63825363
+    # dhcp message type
+    request=${request}3501${dhcp_type}ff
+
+    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
+    # The total IP length is the IP length of the request packet (272 in our
+    # case) + 8 (padding bytes) + the byte length of expected_dhcp_opts (hex digits / 2).
+    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
+    udp_len=`expr $ip_len - 20`
+    ip_len=$(printf "%x" $ip_len)
+    udp_len=$(printf "%x" $udp_len)
+    # $ip_len will be 3 hex digits (e.g. 130), so prepend a '0' to $ip_len.
+    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
+    # udp header and dhcp header.
+    # $udp_len will be 3 hex digits, so prepend a '0' to $udp_len.
+    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
+    # your ip address
+    reply=${reply}${offer_ip}
+    # next server ip address, relay agent ip address, client mac address
+    reply=${reply}0000000000000000${src_mac}
+    # client hardware padding
+    reply=${reply}00000000000000000000
+    # server hostname
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    # boot file name
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    # dhcp magic cookie
+    reply=${reply}63825363
+    # dhcp message type
+    local dhcp_reply_type=02
+    if test $dhcp_type = 03; then
+        dhcp_reply_type=05
+    fi
+    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
+    echo $reply >> ext1_v4.expected
+
+    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
+}
+
+
+trim_zeros() {
+    sed 's/\(00\)\{1,\}$//'
+}
+
+# This shell function sends a DHCPv6 request packet
+# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
+# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
+# packet should be received twice (once from ovn-controller and once
+# from "ovs-ofctl monitor br-int resume").
+test_dhcpv6() {
+    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
+    local req_pkt_in_expected=$6
+    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
+    # dst ip ff02::1:2
+    request=${request}ff020000000000000000000000010002
+    # udp header and dhcpv6 header
+    request=${request}02220223002affff${msg_code}010203
+    # Client identifier
+    request=${request}0001000a00030001${src_mac}
+    # IA-NA (Identity Association for Non Temporary Address)
+    request=${request}0003000c0102030400000e1000001518
+    shift; shift; shift; shift; shift;
+
+    local server_mac=000000100001
+    local server_lla=fe80000000000000020000fffe100001
+    local reply_code=07
+    if test $msg_code = 01; then
+        reply_code=02
+    fi
+    local msg_len=54
+    if test $offer_ip = 1; then
+        msg_len=28
+    fi
+    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
+    reply=${reply}${server_lla}${src_lla}
+
+    # udp header and dhcpv6 header
+    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
+    # Client identifier
+    reply=${reply}0001000a00030001${src_mac}
+    # IA-NA
+    if test $offer_ip != 1; then
+        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
+        reply=${reply}ffffffffffffffff
+    fi
+    # Server identifier
+    reply=${reply}0002000a00030001${server_mac}
+
+    echo $reply | trim_zeros >> ext${inport}_v6.expected
+    # The inport also receives the request packet since it is connected
+    # to the br-phys.
+    #echo $request >> ext${inport}_v6.expected
+
+    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
+}
+
+reset_pcap_file() {
+    local iface=$1
+    local pcap_file=$2
+    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
+options:rxq_pcap=dummy-rx.pcap
+    rm -f ${pcap_file}*.pcap
+    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
+options:rxq_pcap=${pcap_file}-rx.pcap
+}
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
+as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
+--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
+
+AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
+as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
+--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
+
+# Send DHCPDISCOVER.
+offer_ip=`ip_to_hex 10 0 0 6`
+server_ip=`ip_to_hex 10 0 0 1`
+server_mac=ff1000000001
+expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
+test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
+$expected_dhcp_opts
+
+# NXT_RESUMEs should be 1 in hv1.
+OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 0 in hv2.
+OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
+cat ext1_v4.expected | cut -c -48 > expout
+AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
+# Skipping the IPv4 checksum.
+cat ext1_v4.expected | cut -c 53- > expout
+AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
+
+# ovs-ofctl also resumes the packets and this causes other ports to receive
+# the DHCP request packet. So reset the pcap files so that it's easier to test.
+reset_pcap_file hv1-ext1 hv1/ext1
+rm -f ext1_v4.expected
+rm -f ext1_v4.packets
+
+# Send DHCPv6 request
+src_mac=f00000000003
+src_lla=fe80000000000000f20000fffe000003
+offer_ip=ae700000000000000000000000000006
+test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 0 in hv2.
+OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
+sort > ext1_v6.packets
+cat ext1_v6.expected | cut -c -120 > expout
+AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
+# Skipping the UDP checksum
+cat ext1_v6.expected | cut -c 125- > expout
+AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
+
+rm -f ext1_v6.expected
+rm -f ext1_v6.packets
+reset_pcap_file hv1-ext1 hv1/ext1
+
+# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
+ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
+
+hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
+
+# The ls1-lp_ext1 should be bound to hv2
+chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
+grep -v requested | grep chassis | awk '{print $3}')
+AT_CHECK([test "$chassis" = "$hv2_uuid"], [0], [])
+
+# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv2
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
+wc -l], [0], [3
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
+grep reg14=0x$ln_public_key | wc -l], [0], [1
+])
+
+# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
+grep reg14=0x$ln_public_key | wc -l], [0], [0
+])
+
+# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
+# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
+# receiving the expected packet.
+offer_ip=`ip_to_hex 10 0 0 6`
+server_ip=`ip_to_hex 10 0 0 1`
+server_mac=ff1000000001
+expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
+test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
+$expected_dhcp_opts
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 1 in hv2.
+OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
+cat ext1_v4.expected | cut -c -48 > expout
+AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
+# Skipping the IPv4 checksum.
+cat ext1_v4.expected | cut -c 53- > expout
+AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
+
+# ovs-ofctl also resumes the packets and this causes other ports to receive
+# the DHCP request packet. So reset the pcap files so that it's easier to test.
+reset_pcap_file hv1-ext1 hv1/ext1
+rm -f ext1_v4.expected
+
+# Send DHCPv6 request again
+src_mac=f00000000003
+src_lla=fe80000000000000f20000fffe000003
+offer_ip=ae700000000000000000000000000006
+test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 2 in hv2.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+as hv1
+ovs-vsctl show
+ovs-ofctl dump-flows br-int
+
+as hv2
+ovs-vsctl show
+ovs-ofctl dump-flows br-int
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
+sort > ext1_v6.packets
+cat ext1_v6.expected | cut -c -120 > expout
+AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
+# Skipping the UDP checksum
+cat ext1_v6.expected | cut -c 125- > expout
+AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
+
+rm -f ext1_v6.expected
+rm -f ext1_v6.packets
+
+as hv1
+ovs-vsctl show
+reset_pcap_file hv1-ext1 hv1/ext1
+reset_pcap_file br-phys_n1 hv1/br-phys_n1
+reset_pcap_file br-phys hv1/br-phys
+
+as hv2
+ovs-vsctl show
+reset_pcap_file hv2-ext2 hv2/ext2
+reset_pcap_file br-phys_n1 hv2/br-phys_n1
+reset_pcap_file br-phys hv2/br-phys
+
+# From ls1-lp_ext1, send an ARP request for the router IP. The ARP
+# response should come from the router pipeline of hv2.
+ext1_mac=f00000000003
+router_mac=a01000000001
+ext1_ip=`ip_to_hex 10 0 0 6`
+router_ip=`ip_to_hex 10 0 0 1`
+arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
+expected_response=${ext1_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
+echo $expected_response > expout
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+# Verify that the response came from hv2
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+
+# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
+ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
+
+as hv1
+ovs-vsctl show
+reset_pcap_file hv1-ext1 hv1/ext1
+reset_pcap_file br-phys_n1 hv1/br-phys_n1
+reset_pcap_file br-phys hv1/br-phys
+
+as hv2
+ovs-vsctl show
+reset_pcap_file hv2-ext2 hv2/ext2
+reset_pcap_file br-phys_n1 hv2/br-phys_n1
+reset_pcap_file br-phys hv2/br-phys
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+# Verify that the response didn't come from hv2
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [])
+
+OVN_CLEANUP([hv1],[hv2])
+AT_CLEANUP
+
 AT_SETUP([ovn -- ovn-controller restart])
 AT_SKIP_IF([test $HAVE_PYTHON = no])
 ovn_start