Message ID | 20210805153934.3865265-3-hzhou@ovn.org |
---|---|
State | Accepted |
Series | Multiple distributed gateway port support. |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/github-robot-_ovn-kubernetes | fail | github build: failed |
On Thu, Aug 5, 2021 at 11:40 AM Han Zhou <hzhou@ovn.org> wrote:
>
> From: Ankur Sharma <ankurmnnit2004@gmail.com>
>
> By default, OVN support only one DGP (distributed gateway port) per
> logical router. While a single DGP port suffices for most of the North
> South connectivity, there are requirements where a logical router could
> be connected to multiple external networks and based on routing decision
> packet could go to different ones.
>
> This patch adds flexibility of having multiple DGPs per logical router.
>
> Changes can classified as following:
> a. Data structure changes to allow multiple DGPs per ovn_datapath.
>
> b. Consumption of new data structure in logical flows for
>    individual features.
>
> c. Features that require changes are:
>    i. Regular NS traffic flow.
>    ii. Network Address Translation.
>    iii. Load Balancer
>    iv. Gateway_mtu.
>    v. reside-on-redirect-chassis
>    vi. Misc code sections that assumed a single DGP.
>
> d. Except for reside-on-redirect-chassis all the other features
>    could be extended to multiple DGPs. Reside on redirect
>    chassis with its current specification could not be extended
>    and hence should be used only with the logical router that
>    has a single DGP.
>
> This patch doesn't support NAT & load-balancer features for multiple
> DGPs yet, but added validations that disables NAT/load-balancer
> features when there are more than one DGP configured per router.
>
> Signed-off-by: Ankur Sharma <ankurmnnit2004@gmail.com>
> Co-authored-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> Signed-off-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> Co-authored-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> Signed-off-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> Co-authored-by: Han Zhou <hzhou@ovn.org>
> Signed-off-by: Han Zhou <hzhou@ovn.org>

Hi Han,

Thanks for v2.
I did some testing with this patch in a simple 2 node setup (using
ovn-fake-multinode).

Below are the logical resources created:

-----------------------------
[root@ovn-central ~]# ovn-nbctl show
switch 3a3c2522-fcce-49e5-8334-8a72547e7da6 (sw0)
    port sw0-port4
        addresses: ["50:54:00:00:00:06 dynamic"]
    port sw0-port1
        addresses: ["50:54:00:00:00:03 10.0.0.3 1000::3"]
    port sw0-port2
        addresses: ["50:54:00:00:00:04 10.0.0.4 1000::4"]
    port sw0-lr0
        type: router
        router-port: lr0-sw0
    port sw0-port3
        addresses: ["50:54:00:00:00:05 dynamic"]
switch 0573bbd7-fca7-4f06-84ac-1939f879fd5f (sw1)
    port sw1-lr0
        type: router
        router-port: lr0-sw1
    port sw1-port1
        addresses: ["40:54:00:00:00:03 20.0.0.3 2000::3"]
router c7cd8dab-4e6d-45f3-a8f4-3653d25ab476 (lr0)
    port lr0-sw1
        mac: "00:00:00:00:ff:02"
        networks: ["20.0.0.1/24", "2000::a/64"]
    port lr0-sw0
        mac: "00:00:00:00:ff:01"
        networks: ["10.0.0.1/24", "1000::a/64"]
        gateway chassis: [ovn-chassis-1 ovn-chassis-2]
[root@ovn-central ~]#
[root@ovn-central ~]#
[root@ovn-central ~]# ovn-sbctl show
Chassis ovn-chassis-1
    hostname: ovn-chassis-1
    Encap geneve
        ip: "170.168.0.4"
        options: {csum="true"}
    Port_Binding sw0-port1
    Port_Binding sw0-port3
Chassis ovn-gw-1
    hostname: ovn-gw-1
    Encap geneve
        ip: "170.168.0.3"
        options: {csum="true"}
Chassis ovn-chassis-2
    hostname: ovn-chassis-2
    Encap geneve
        ip: "170.168.0.5"
        options: {csum="true"}
    Port_Binding sw1-port1
    Port_Binding sw0-port4
    Port_Binding cr-lr0-sw0
---------

As you can see, the logical router port lr0-sw0 is a distributed gw port
scheduled on chassis ovn-chassis-2.

The issue I see is: I'm not able to ping from sw0-port1 (10.0.0.3, claimed
by chassis ovn-chassis-1) to 10.0.0.1 or to sw1-port1 (20.0.0.3).

I'm able to ping both 10.0.0.1 and 20.0.0.3 from sw0-port4 (claimed by
chassis ovn-chassis-2).

If I move cr-lr0-sw0 to ovn-chassis-1, then ping from sw0-port1 works but
not from sw0-port4.

Is this expected?
Does it mean that if a logical router has multiple gateway router ports
connecting to geneve logical switches, then all the logical ports of those
switches should only be bound on the corresponding gateway chassis?
I understand this is the case with ovn-k8s.

If this is a restriction, I think we should document it.

Otherwise the patch LGTM and I'm fine with the feature. The only issue I see
is that of semantics. If a router port is a gateway router port, then its
peer logical switch is ideally expected to have a localnet port. But in the
case of ovn-k8s, that's not the case. It is for this reason I thought pinning
the logical switch to a particular chassis could be better.

If you think having multiple gw router ports is better, then I'd suggest
documenting the limitations I mentioned above.

Also please see a small nit below. And you can consider my Ack with these
addressed.

Acked-by: Numan Siddique <numans@ovn.org>

Thanks
Numan

> ---
>  NEWS | 3 +
>  northd/lrouter.dl | 103 +++++-------
>  northd/ovn-northd.8.xml | 6 +-
>  northd/ovn-northd.c | 356 +++++++++++++++++++++++----------------
>  northd/ovn_northd.dl | 169 ++++++++++---------
>  ovn-architecture.7.xml | 19 ++-
>  ovn-nb.xml | 27 ++-
>  tests/ovn-northd.at | 82 +++++++++
>  tests/ovn.at | 307 ++++++++++++++++++++++++++++++++++
>  9 files changed, 771 insertions(+), 301 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index f328666da..9f701caa7 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -35,6 +35,9 @@ OVN v21.06.0 - 18 Jun 2021
>      "ovn-trim-limit-lflow-cache" and "ovn-trim-wmark-perc-lflow-cache", to
>      allow enforcing a lflow cache size limit and high watermark percentage
>      for which automatic memory trimming is performed.
> +  - Support multiple distributed gateway ports on a single logical router.
> +    (NAT and load-balancer are not supported yet when there are multiple
> +    distributed gateway ports).
> > OVN v21.03.0 - 12 Mar 2021 > ------------------------- > diff --git a/northd/lrouter.dl b/northd/lrouter.dl > index 4a24f3f61..d37350ab8 100644 > --- a/northd/lrouter.dl > +++ b/northd/lrouter.dl > @@ -138,14 +138,14 @@ Warning[message] :- > var message = "Bad configuration: distributed gateway port configured on " > "port ${lrp.name} on L3 gateway router". > > -/* DistributedGatewayPortCandidate. > +/* Distributed gateway ports. > * > - * Each row pairs a logical router with its distributed gateway port, > - * but without checking that there is at most one DGP per LR. > + * Each row means 'lrp' is a distributed gateway port on 'lr_uuid'. > * > - * (Use DistributedGatewayPort instead, since it guarantees uniqueness.) */ > -relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid: uuid) > -DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :- > + * A logical router can have multiple distributed gateway ports. */ > +relation DistributedGatewayPort(lrp: Intern<nb::Logical_Router_Port>, > + lr_uuid: uuid) > +DistributedGatewayPort(lrp, lr_uuid) :- > lr in nb::Logical_Router(._uuid = lr_uuid), > LogicalRouterPort(lrp_uuid, lr._uuid), > lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid), > @@ -153,30 +153,10 @@ DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :- > var has_hcg = lrp.ha_chassis_group.is_some(), > var has_gc = not lrp.gateway_chassis.is_empty(), > has_hcg or has_gc. > -Warning[message] :- > - DistributedGatewayPortCandidate(lr_uuid, lrp_uuid), > - var lrps = lrp_uuid.group_by(lr_uuid).to_set(), > - lrps.size() > 1, > - lr in nb::Logical_Router(._uuid = lr_uuid), > - var message = "Bad configuration: multiple distributed gateway ports on " > - "logical router ${lr.name}; ignoring all of them". > - > -/* Distributed gateway ports. > - * > - * Each row means 'lrp' is the distributed gateway port on 'lr_uuid'. > - * > - * There is at most one distributed gateway port per logical router. 
*/ > -relation DistributedGatewayPort(lrp: Intern<nb::Logical_Router_Port>, lr_uuid: uuid) > -DistributedGatewayPort(lrp, lr_uuid) :- > - DistributedGatewayPortCandidate(lr_uuid, lrp_uuid), > - var lrps = lrp_uuid.group_by(lr_uuid).to_set(), > - lrps.size() == 1, > - Some{var lrp_uuid} = lrps.nth(0), > - lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid). > > /* HAChassis is an abstraction over nb::Gateway_Chassis and nb::HA_Chassis, which > * are different ways to represent the same configuration. Each row is > - * effectively one HA_Chassis record. (Usually, we could associated each > + * effectively one HA_Chassis record. (Usually, we could associate each > * row with a particular 'lr_uuid', but it's permissible for more than one > * logical router to use a HA chassis group, so we omit it so that multiple > * references get merged.) > @@ -236,18 +216,20 @@ HAChassisGroup(ha_chassis_group_uuid(hac_group_uuid), > .name = name, > .external_ids = external_ids). > > -/* Each row maps from a logical router to the name of its HAChassisGroup. > - * This level of indirection is needed because multiple logical routers > - * are allowed to reference a given HAChassisGroup. */ > -relation LogicalRouterHAChassisGroup(lr_uuid: uuid, > - hacg_uuid: uuid) > -LogicalRouterHAChassisGroup(lr_uuid, ha_chassis_group_uuid(lrp._uuid)) :- > - DistributedGatewayPort(lrp, lr_uuid), > +/* Each row maps from a distributed gateway logical router port to the name of > + * its HAChassisGroup. > + * This level of indirection is needed because multiple distributed gateway > + * logical router ports are allowed to reference a given HAChassisGroup. */ > +relation DistributedGatewayPortHAChassisGroup( > + lrp: Intern<nb::Logical_Router_Port>, > + hacg_uuid: uuid) > +DistributedGatewayPortHAChassisGroup(lrp, ha_chassis_group_uuid(lrp._uuid)) :- > + DistributedGatewayPort(.lrp = lrp), > lrp.ha_chassis_group == None, > lrp.gateway_chassis.size() > 0. 
> -LogicalRouterHAChassisGroup(lr_uuid, > - ha_chassis_group_uuid(hac_group_uuid)) :- > - DistributedGatewayPort(lrp, lr_uuid), > +DistributedGatewayPortHAChassisGroup(lrp, > + ha_chassis_group_uuid(hac_group_uuid)) :- > + DistributedGatewayPort(.lrp = lrp), > Some{var hac_group_uuid} = lrp.ha_chassis_group, > nb::HA_Chassis_Group(._uuid = hac_group_uuid). > > @@ -259,14 +241,19 @@ RouterPortIsRedirect(lrp, false) :- > &nb::Logical_Router_Port(._uuid = lrp), > not DistributedGatewayPort(&nb::Logical_Router_Port{._uuid = lrp}, _). > > -relation LogicalRouterRedirectPort(lr: uuid, has_redirect_port: Option<Intern<nb::Logical_Router_Port>>) > - > -LogicalRouterRedirectPort(lr, Some{lrp}) :- > - DistributedGatewayPort(lrp, lr). > - > -LogicalRouterRedirectPort(lr, None) :- > - nb::Logical_Router(._uuid = lr), > - not DistributedGatewayPort(_, lr). > +/* > + * LogicalRouterDGWPorts maps from each logical router UUID > + * to the logical router's set of distributed gateway (or redirect) ports. */ > +relation LogicalRouterDGWPorts( > + lr_uuid: uuid, > + l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>) > +LogicalRouterDGWPorts(lr_uuid, l3dgw_ports) :- > + DistributedGatewayPort(lrp, lr_uuid), > + var l3dgw_ports = lrp.group_by(lr_uuid).to_vec(). > +LogicalRouterDGWPorts(lr_uuid, vec_empty()) :- > + lr in nb::Logical_Router(), > + var lr_uuid = lr._uuid, > + not DistributedGatewayPort(_, lr_uuid). > > typedef ExceptionalExtIps = AllowedExtIps{ips: Intern<nb::Address_Set>} > | ExemptedExtIps{ips: Intern<nb::Address_Set>} > @@ -450,9 +437,7 @@ LogicalRouterCopp0(lr, meters) :- > > /* Router relation collects all attributes of a logical router. > * > - * `l3dgw_port` - optional redirect port (see `DistributedGatewayPort`) > - * `redirect_port_name` - derived redirect port name (or empty string if > - * router does not have a redirect port) > + * `l3dgw_ports` - optional redirect ports (see `DistributedGatewayPort`) > * `is_gateway` - true iff the router is a gateway router. 
Together with > * `l3dgw_port`, this flag affects the generation of various flows > * related to NAT and load balancing. > @@ -474,8 +459,7 @@ typedef Router = Router { > external_ids: Map<string,string>, > > /* Additional computed fields. */ > - l3dgw_port: Option<Intern<nb::Logical_Router_Port>>, > - redirect_port_name: string, > + l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>, > is_gateway: bool, > nats: Vec<NAT>, > snat_ips: Map<v46_ip, Set<NAT>>, > @@ -498,23 +482,18 @@ Router[Router{ > .options = lr.options, > .external_ids = lr.external_ids, > > - .l3dgw_port = l3dgw_port, > - .redirect_port_name = > - match (l3dgw_port) { > - Some{rport} -> json_string_escape(chassis_redirect_name(rport.name)), > - _ -> "" > - }, > - .is_gateway = lr.options.contains_key("chassis"), > - .nats = nats, > - .snat_ips = snat_ips, > - .lbs = lbs, > - .mcast_cfg = mcast_cfg, > + .l3dgw_ports = l3dgw_ports, > + .is_gateway = lr.options.contains_key("chassis"), > + .nats = nats, > + .snat_ips = snat_ips, > + .lbs = lbs, > + .mcast_cfg = mcast_cfg, > .learn_from_arp_request = learn_from_arp_request, > .force_lb_snat = force_lb_snat, > .copp = copp}.intern()] :- > lr in nb::Logical_Router(), > lr.is_enabled(), > - LogicalRouterRedirectPort(lr._uuid, l3dgw_port), > + LogicalRouterDGWPorts(lr._uuid, l3dgw_ports), > LogicalRouterNATs(lr._uuid, nats), > LogicalRouterLBs(lr._uuid, lbs), > LogicalRouterSnatIPs(lr._uuid, snat_ips), > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > index de4fe90c7..5afae743f 100644 > --- a/northd/ovn-northd.8.xml > +++ b/northd/ovn-northd.8.xml > @@ -3815,10 +3815,10 @@ icmp6 { > <h3>Ingress Table 17: Gateway Redirect</h3> > > <p> > - For distributed logical routers where one of the logical router > + For distributed logical routers where one or more of the logical router > ports specifies a gateway chassis, this table redirects > - certain packets to the distributed gateway port instance on the > - gateway chassis. 
This table has the following flows: > + certain packets to the distributed gateway port instances on the > + gateway chassises. This table has the following flows: > </p> > > <ul> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 605e33486..b7398004d 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -655,13 +655,12 @@ struct ovn_datapath { > bool is_gw_router; > > /* OVN northd only needs to know about the logical router gateway port for > - * NAT on a distributed router. This "distributed gateway port" is > - * populated only when there is a gateway chassis specified for one of > - * the ports on the logical router. Otherwise this will be NULL. */ > - struct ovn_port *l3dgw_port; > - /* The "derived" OVN port representing the instance of l3dgw_port on > - * the gateway chassis. */ > - struct ovn_port *l3redirect_port; > + * NAT on a distributed router. The "distributed gateway ports" are > + * populated only when there is a gateway chassis or ha chassis group > + * specified for some of the ports on the logical router. Otherwise this > + * will be NULL. */ > + struct ovn_port **l3dgw_ports; > + size_t n_l3dgw_ports; > > /* NAT entries configured on the router. */ > struct ovn_nat *nat_entries; > @@ -802,6 +801,16 @@ init_nat_entries(struct ovn_datapath *od) > return; > } > > + if (od->n_l3dgw_ports > 1) { > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); > + VLOG_WARN_RL(&rl, "NAT is configured on logical router %s, which has %" > + PRIuSIZE" distributed gateway ports. 
NAT is not supported"
> +                     " yet when there is more than one distributed gateway "
> +                     "port on the router.",
> +                     od->nbr->name, od->n_l3dgw_ports);
> +        return;
> +    }
> +
>      od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
>
>      for (size_t i = 0; i < od->nbr->n_nat; i++) {
> @@ -941,6 +950,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
>      destroy_lb_ips(od);
>      free(od->nat_entries);
>      free(od->localnet_ports);
> +    free(od->l3dgw_ports);
>      ovn_ls_port_group_destroy(&od->nb_pgs);
>      destroy_mcast_info_for_datapath(od);
>
> @@ -1489,9 +1499,18 @@ struct ovn_port {
>      /* Logical port multicast data. */
>      struct mcast_port_info mcast_info;
>
> -    /* This is ordinarily false. It is true if and only if this ovn_port is
> -     * derived from a chassis-redirect port. */
> -    bool derived;
> +    /* At most one of l3dgw_port and cr_port can be not NULL. */
> +
> +    /* This is set to a distributed gateway port if and only if this ovn_port
> +     * is "derived" from it. Otherwise this is set to NULL. The derived
> +     * ovn_port represents the instance of distributed gateway port on the
> +     * gateway chassis.*/
> +    struct ovn_port *l3dgw_port;
> +
> +    /* This is set to the "derived" chassis-redirect port of this port if and
> +     * only if this port is a distributed gateway port. Otherwise this is set
> +     * to NULL. */
> +    struct ovn_port *cr_port;
>
>      bool has_unknown; /* If the addresses have 'unknown' defined. */
>
> @@ -1512,6 +1531,18 @@ struct ovn_port {
>      struct ovs_list list; /* In list of similar records. */
>  };
>
> +static bool
> +is_l3dgw_port(const struct ovn_port *op)
> +{
> +    return op->cr_port;

Since the return type is bool, I'd suggest:

    return !!op->cr_port;

> +}
> +
> +static bool
> +is_cr_port(const struct ovn_port *op)
> +{
> +    return op->l3dgw_port;

Same as above.
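
To illustrate the nit: assuming `bool` here is C99 `_Bool` from `<stdbool.h>`, converting a pointer to `bool` already yields true for any non-NULL pointer, so `!!` would be a readability choice rather than a behavior change. The sketch below is standalone; `struct port` and both helper names are hypothetical stand-ins and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ovn_port; only the field needed here. */
struct port {
    struct port *cr_port;   /* NULL unless this is a distributed gateway port. */
};

/* Implicit conversion: a pointer converted to bool is true iff non-NULL. */
static bool
has_cr_port_implicit(const struct port *p)
{
    return p->cr_port;
}

/* Explicit normalization with !!, in the style suggested in the review. */
static bool
has_cr_port_explicit(const struct port *p)
{
    return !!p->cr_port;
}
```

Under that assumption both helpers return the same value for every input; the explicit form mainly makes the pointer-to-boolean intent visible and stays correct if `bool` were ever a plain integer typedef.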
> +} > + > static void > destroy_routable_addresses(struct ovn_port_routable_addresses *ra) > { > @@ -1578,7 +1609,7 @@ ovn_port_create(struct hmap *ports, const char *key, > op->key = xstrdup(key); > op->sb = sb; > ovn_port_set_nb(op, nbsp, nbrp); > - op->derived = false; > + op->l3dgw_port = op->cr_port = NULL; > hmap_insert(ports, &op->key_node, hash_string(op->key, 0)); > return op; > } > @@ -1682,7 +1713,7 @@ lrport_is_enabled(const struct nbrec_logical_router_port *lrport) > static struct ovn_port * > ovn_port_get_peer(struct hmap *ports, struct ovn_port *op) > { > - if (!op->nbsp || !lsp_is_router(op->nbsp) || op->derived) { > + if (!op->nbsp || !lsp_is_router(op->nbsp) || op->l3dgw_port) { > return NULL; > } > > @@ -2426,6 +2457,7 @@ join_logical_ports(struct northd_context *ctx, > tag_alloc_add_existing_tags(tag_alloc_table, nbsp); > } > } else { > + size_t n_allocated_l3dgw_ports = 0; > for (size_t i = 0; i < od->nbr->n_ports; i++) { > const struct nbrec_logical_router_port *nbrp > = od->nbr->ports[i]; > @@ -2481,36 +2513,32 @@ join_logical_ports(struct northd_context *ctx, > "on L3 gateway router", nbrp->name); > continue; > } > - if (od->l3dgw_port || od->l3redirect_port) { > - static struct vlog_rate_limit rl > - = VLOG_RATE_LIMIT_INIT(1, 1); > - VLOG_WARN_RL(&rl, "Bad configuration: multiple " > - "distributed gateway ports on logical " > - "router %s", od->nbr->name); > - continue; > - } > > char *redirect_name = > ovn_chassis_redirect_name(nbrp->name); > struct ovn_port *crp = ovn_port_find(ports, redirect_name); > if (crp && crp->sb && crp->sb->datapath == od->sb) { > - crp->derived = true; > ovn_port_set_nb(crp, NULL, nbrp); > ovs_list_remove(&crp->list); > ovs_list_push_back(both, &crp->list); > } else { > crp = ovn_port_create(ports, redirect_name, > NULL, nbrp, NULL); > - crp->derived = true; > ovs_list_push_back(nb_only, &crp->list); > } > + crp->l3dgw_port = op; > + op->cr_port = crp; > crp->od = od; > free(redirect_name); > > - /* Set 
l3dgw_port and l3redirect_port in od, for later > - * use during flow creation. */ > - od->l3dgw_port = op; > - od->l3redirect_port = crp; > + /* Add to l3dgw_ports in od, for later use during flow > + * creation. */ > + if (od->n_l3dgw_ports == n_allocated_l3dgw_ports) { > + od->l3dgw_ports = x2nrealloc(od->l3dgw_ports, > + &n_allocated_l3dgw_ports, > + sizeof *od->l3dgw_ports); > + } > + od->l3dgw_ports[od->n_l3dgw_ports++] = op; > > assign_routable_addresses(op); > } > @@ -2522,7 +2550,7 @@ join_logical_ports(struct northd_context *ctx, > * to their peers. */ > struct ovn_port *op; > HMAP_FOR_EACH (op, key_node, ports) { > - if (op->nbsp && lsp_is_router(op->nbsp) && !op->derived) { > + if (op->nbsp && lsp_is_router(op->nbsp) && !op->l3dgw_port) { > struct ovn_port *peer = ovn_port_get_peer(ports, op); > if (!peer || !peer->nbrp) { > continue; > @@ -2553,7 +2581,7 @@ join_logical_ports(struct northd_context *ctx, > if (peer->od && peer->od->mcast_info.rtr.relay) { > op->od->mcast_info.sw.flood_relay = true; > } > - } else if (op->nbrp && op->nbrp->peer && !op->derived) { > + } else if (op->nbrp && op->nbrp->peer && !op->l3dgw_port) { > struct ovn_port *peer = ovn_port_find(ports, op->nbrp->peer); > if (peer) { > if (peer->nbrp) { > @@ -2598,7 +2626,8 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only) > struct eth_addr mac; > if (!op || !op->nbrp || !op->od || !op->od->nbr > || (!op->od->nbr->n_nat && !op->od->nbr->n_load_balancer) > - || !eth_addr_from_string(op->nbrp->mac, &mac)) { > + || !eth_addr_from_string(op->nbrp->mac, &mac) > + || op->od->n_l3dgw_ports > 1) { > *n = n_nats; > return NULL; > } > @@ -2629,7 +2658,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only) > > /* Determine whether this NAT rule satisfies the conditions for > * distributed NAT processing. 
*/ > - if (op->od->l3redirect_port && !strcmp(nat->type, "dnat_and_snat") > + if (op->od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat") > && nat->logical_port && nat->external_mac) { > /* Distributed NAT rule. */ > if (eth_addr_from_string(nat->external_mac, &mac)) { > @@ -2695,9 +2724,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only) > if (central_ip_address) { > /* Gratuitous ARP for centralized NAT rules on distributed gateway > * ports should be restricted to the gateway chassis. */ > - if (op->od->l3redirect_port) { > + if (op->od->n_l3dgw_ports) { > ds_put_format(&c_addresses, " is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->od->l3dgw_ports[0]->cr_port->json_key); > } > > addresses[n_nats++] = ds_steal_cstr(&c_addresses); > @@ -3010,7 +3039,7 @@ ovn_port_update_sbrec(struct northd_context *ctx, > /* If the router is for l3 gateway, it resides on a chassis > * and its port type is "l3gateway". */ > const char *chassis_name = smap_get(&op->od->nbr->options, "chassis"); > - if (op->derived) { > + if (is_cr_port(op)) { > sbrec_port_binding_set_type(op->sb, "chassisredirect"); > } else if (chassis_name) { > sbrec_port_binding_set_type(op->sb, "l3gateway"); > @@ -3020,7 +3049,7 @@ ovn_port_update_sbrec(struct northd_context *ctx, > > struct smap new; > smap_init(&new); > - if (op->derived) { > + if (is_cr_port(op)) { > const char *redirect_type = smap_get(&op->nbrp->options, > "redirect-type"); > > @@ -3200,7 +3229,7 @@ ovn_port_update_sbrec(struct northd_context *ctx, > char **nats = NULL; > if (nat_addresses && !strcmp(nat_addresses, "router")) { > if (op->peer && op->peer->od > - && (chassis || op->peer->od->l3redirect_port)) { > + && (chassis || op->peer->od->n_l3dgw_ports)) { > nats = get_nat_addresses(op->peer, &n_nats, false); > } > /* Only accept manual specification of ethernet address > @@ -3236,12 +3265,26 @@ ovn_port_update_sbrec(struct northd_context *ctx, > * sending the GARPs for the 
router port IPs. > * */ > bool add_router_port_garp = false; > - if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port && > - op->peer->od->l3redirect_port && > - (smap_get_bool(&op->peer->nbrp->options, > - "reside-on-redirect-chassis", false) || > - op->peer == op->peer->od->l3dgw_port)) { > - add_router_port_garp = true; > + if (op->peer && op->peer->nbrp && op->peer->od->n_l3dgw_ports) { > + if (is_l3dgw_port(op->peer)) { > + add_router_port_garp = true; > + } else if (smap_get_bool(&op->peer->nbrp->options, > + "reside-on-redirect-chassis", false)) { > + if (op->peer->od->n_l3dgw_ports == 1) { > + add_router_port_garp = true; > + } else { > + static struct vlog_rate_limit rl = > + VLOG_RATE_LIMIT_INIT(1, 1); > + VLOG_WARN_RL(&rl, "\"reside-on-redirect-chassis\" is " > + "set on logical router port %s, which " > + "is on logical router %s, which has %" > + PRIuSIZE" distributed gateway ports. This" > + "option can only be used when there is " > + "a single distributed gateway port.", > + op->peer->key, op->peer->od->nbr->name, > + op->peer->od->n_l3dgw_ports); > + } > + } > } else if (chassis && op->od->n_localnet_ports) { > add_router_port_garp = true; > } > @@ -3256,9 +3299,10 @@ ovn_port_update_sbrec(struct northd_context *ctx, > op->peer->lrp_networks.ipv4_addrs[i].addr_s); > } > > - if (op->peer->od->l3redirect_port) { > + if (op->peer->od->n_l3dgw_ports) { > ds_put_format(&garp_info, " is_chassis_resident(%s)", > - op->peer->od->l3redirect_port->json_key); > + op->peer->od->l3dgw_ports[0] > + ->cr_port->json_key); > } > > n_nats++; > @@ -3531,7 +3575,17 @@ build_ovn_lr_lbs(struct hmap *datapaths, struct hmap *lbs) > if (!od->nbr) { > continue; > } > - if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) { > + if (!smap_get(&od->nbr->options, "chassis") > + && od->n_l3dgw_ports != 1) { > + if (od->n_l3dgw_ports > 1 && od->nbr->n_load_balancer) { > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); > + VLOG_WARN_RL(&rl, 
"Load-balancers are configured on logical " > + "router %s, which has %"PRIuSIZE" distributed " > + "gateway ports. Load-balancer is not supported " > + "yet when there is more than one distributed " > + "gateway port on the router.", > + od->nbr->name, od->n_l3dgw_ports); > + } > continue; > } > > @@ -6433,13 +6487,14 @@ build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od) > { > ovs_assert((od && od->nbr && od->lr_group)); > > - if (od->l3dgw_port && od->l3redirect_port) { > - /* It's a logical router with gateway port. If it > - * has HA_Chassis_Group associated to it in SB DB, then store the > - * ha chassis group name. */ > - if (od->l3redirect_port->sb->ha_chassis_group) { > + /* For logical router with distributed gateway ports. If it > + * has HA_Chassis_Group associated to it in SB DB, then store the > + * ha chassis group name. */ > + for (size_t i = 0; i < od->n_l3dgw_ports; i++) { > + struct ovn_port *crp = od->l3dgw_ports[i]->cr_port; > + if (crp->sb->ha_chassis_group) { > sset_add(&od->lr_group->ha_chassis_groups, > - od->l3redirect_port->sb->ha_chassis_group->name); > + crp->sb->ha_chassis_group->name); > } > } > > @@ -7800,16 +7855,17 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op, > ds_clear(match); > ds_put_format(match, "eth.dst == "ETH_ADDR_FMT, > ETH_ADDR_ARGS(mac)); > - if (op->peer->od->l3dgw_port > - && op->peer->od->l3redirect_port > + if (op->peer->od->n_l3dgw_ports > && op->od->n_localnet_ports) { > bool add_chassis_resident_check = false; > - if (op->peer == op->peer->od->l3dgw_port) { > + const char *json_key; > + if (is_l3dgw_port(op->peer)) { > /* The peer of this port represents a distributed > * gateway port. The destination lookup flow for the > * router's distributed gateway port MAC address should > * only be programmed on the gateway chassis. 
*/ > add_chassis_resident_check = true; > + json_key = op->peer->cr_port->json_key; > } else { > /* Check if the option 'reside-on-redirect-chassis' > * is set to true on the peer port. If set to true > @@ -7820,12 +7876,15 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op, > */ > add_chassis_resident_check = smap_get_bool( > &op->peer->nbrp->options, > - "reside-on-redirect-chassis", false); > + "reside-on-redirect-chassis", false) && > + op->peer->od->n_l3dgw_ports == 1; > + json_key = > + op->peer->od->l3dgw_ports[0]->cr_port->json_key; > } > > if (add_chassis_resident_check) { > ds_put_format(match, " && is_chassis_resident(%s)", > - op->peer->od->l3redirect_port->json_key); > + json_key); > } > } > > @@ -7838,8 +7897,7 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op, > > /* Add ethernet addresses specified in NAT rules on > * distributed logical routers. */ > - if (op->peer->od->l3dgw_port > - && op->peer == op->peer->od->l3dgw_port) { > + if (is_l3dgw_port(op->peer)) { > for (int j = 0; j < op->peer->od->nbr->n_nat; j++) { > const struct nbrec_nat *nat > = op->peer->od->nbr->nat[j]; > @@ -9106,14 +9164,14 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, > &lb->nlb->header_); > } > > - if (od->l3redirect_port && > + if (od->n_l3dgw_ports && > (lb_vip->n_backends || !lb_vip->empty_backend_rej)) { > new_match_p = xasprintf("%s && is_chassis_resident(%s)", > new_match, > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > est_match_p = xasprintf("%s && is_chassis_resident(%s)", > est_match, > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } > > if (snat_type == NO_FORCE_SNAT && > @@ -9158,15 +9216,15 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, > free(est_match_p); > } > > - if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) { > + if (!od->n_l3dgw_ports || !lb_vip->n_backends) { > goto next; > } > > - char *undnat_match_p = xasprintf("%s) && 
outport == %s && " > - "is_chassis_resident(%s)", > - ds_cstr(&undnat_match), > - od->l3dgw_port->json_key, > - od->l3redirect_port->json_key); > + char *undnat_match_p = xasprintf( > + "%s) && outport == %s && is_chassis_resident(%s)", > + ds_cstr(&undnat_match), > + od->l3dgw_ports[0]->json_key, > + od->l3dgw_ports[0]->cr_port->json_key); > if (snat_type == SKIP_SNAT) { > ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_UNDNAT, 120, > undnat_match_p, skip_snat_est_action, > @@ -9662,9 +9720,9 @@ build_lrouter_port_nat_arp_nd_flow(struct ovn_port *op, > * upstream MAC learning points to the gateway chassis. > * Also need to avoid generation of multiple ARP responses > * from different chassis. */ > - if (op->od->l3redirect_port) { > + if (op->od->n_l3dgw_ports) { > ds_put_format(&match, "is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->od->l3dgw_ports[0]->cr_port->json_key); > } > } > > @@ -9938,7 +9996,7 @@ build_adm_ctrl_flows_for_lrouter_port( > return; > } > > - if (op->derived) { > + if (is_cr_port(op)) { > /* No ingress packets should be received on a chassisredirect > * port. */ > return; > @@ -9963,12 +10021,11 @@ build_adm_ctrl_flows_for_lrouter_port( > ds_clear(match); > ds_put_format(match, "eth.dst == %s && inport == %s", > op->lrp_networks.ea_s, op->json_key); > - if (op->od->l3dgw_port && op == op->od->l3dgw_port > - && op->od->l3redirect_port) { > + if (is_l3dgw_port(op)) { > /* Traffic with eth.dst = l3dgw_port->lrp_networks.ea_s > * should only be received on the gateway chassis. 
*/ > ds_put_format(match, " && is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ADMISSION, 50, > ds_cstr(match), ds_cstr(actions), > @@ -10107,10 +10164,9 @@ build_neigh_learning_flows_for_lrouter_port( > op->lrp_networks.ipv4_addrs[i].network_s, > op->lrp_networks.ipv4_addrs[i].plen, > op->lrp_networks.ipv4_addrs[i].addr_s); > - if (op->od->l3dgw_port && op == op->od->l3dgw_port > - && op->od->l3redirect_port) { > + if (is_l3dgw_port(op)) { > ds_put_format(match, " && is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT > " = lookup_arp(inport, arp.spa, arp.sha); " > @@ -10127,10 +10183,9 @@ build_neigh_learning_flows_for_lrouter_port( > op->json_key, > op->lrp_networks.ipv4_addrs[i].network_s, > op->lrp_networks.ipv4_addrs[i].plen); > - if (op->od->l3dgw_port && op == op->od->l3dgw_port > - && op->od->l3redirect_port) { > + if (is_l3dgw_port(op)) { > ds_put_format(match, " && is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > ds_clear(actions); > ds_put_format(actions, REGBIT_LOOKUP_NEIGHBOR_RESULT > @@ -10620,7 +10675,7 @@ build_arp_resolve_flows_for_lrouter_port( > } > } > > - if (!op->derived && op->od->l3redirect_port) { > + if (is_l3dgw_port(op)) { > const char *redirect_type = smap_get(&op->nbrp->options, > "redirect-type"); > if (redirect_type && !strcasecmp(redirect_type, "bridged")) { > @@ -10633,7 +10688,7 @@ build_arp_resolve_flows_for_lrouter_port( > ds_clear(match); > ds_put_format(match, "outport == %s && " > "!is_chassis_resident(%s)", op->json_key, > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > ds_clear(actions); > ds_put_format(actions, "eth.dst = %s; next;", > op->lrp_networks.ea_s); > @@ -10881,8 +10936,8 @@ build_arp_resolve_flows_for_lrouter_port( > &op->nbsp->header_); > 
} > > - if (smap_get(&peer->od->nbr->options, "chassis") || > - (peer->od->l3dgw_port && peer == peer->od->l3dgw_port)) { > + if (smap_get(&peer->od->nbr->options, "chassis") > + || peer->cr_port) { > routable_addresses_to_lflows(lflows, router_port, peer, > match, actions); > } > @@ -11079,32 +11134,32 @@ build_gateway_redirect_flows_for_lrouter( > struct ovn_datapath *od, struct hmap *lflows, > struct ds *match, struct ds *actions) > { > - if (od->nbr) { > - if (od->l3dgw_port && od->l3redirect_port) { > - const struct ovsdb_idl_row *stage_hint = NULL; > - > - if (od->l3dgw_port->nbrp) { > - stage_hint = &od->l3dgw_port->nbrp->header_; > - } > + if (!od->nbr) { > + return; > + } > + for (size_t i = 0; i < od->n_l3dgw_ports; i++) { > + const struct ovsdb_idl_row *stage_hint = NULL; > > - /* For traffic with outport == l3dgw_port, if the > - * packet did not match any higher priority redirect > - * rule, then the traffic is redirected to the central > - * instance of the l3dgw_port. */ > - ds_clear(match); > - ds_put_format(match, "outport == %s", > - od->l3dgw_port->json_key); > - ds_clear(actions); > - ds_put_format(actions, "outport = %s; next;", > - od->l3redirect_port->json_key); > - ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50, > - ds_cstr(match), ds_cstr(actions), > - stage_hint); > + if (od->l3dgw_ports[i]->nbrp) { > + stage_hint = &od->l3dgw_ports[i]->nbrp->header_; > } > > - /* Packets are allowed by default. */ > - ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;"); > + /* For traffic with outport == l3dgw_port, if the > + * packet did not match any higher priority redirect > + * rule, then the traffic is redirected to the central > + * instance of the l3dgw_port. 
*/ > + ds_clear(match); > + ds_put_format(match, "outport == %s", > + od->l3dgw_ports[i]->json_key); > + ds_clear(actions); > + ds_put_format(actions, "outport = %s; next;", > + od->l3dgw_ports[i]->cr_port->json_key); > + ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50, > + ds_cstr(match), ds_cstr(actions), > + stage_hint); > } > + /* Packets are allowed by default. */ > + ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;"); > } > > /* Local router ingress table ARP_REQUEST: ARP request. > @@ -11203,7 +11258,7 @@ build_egress_delivery_flows_for_lrouter_port( > return; > } > > - if (op->derived) { > + if (is_cr_port(op)) { > /* No egress packets should be processed in the context of > * a chassisredirect port. The chassisredirect port should > * be replaced by the l3dgw port in the local output > @@ -11293,7 +11348,7 @@ build_dhcpv6_reply_flows_for_lrouter_port( > struct ovn_port *op, struct hmap *lflows, > struct ds *match) > { > - if (op->nbrp && (!op->derived)) { > + if (op->nbrp && (!op->l3dgw_port)) { > for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { > ds_clear(match); > ds_put_format(match, "ip6.dst == %s && udp.src == 547 &&" > @@ -11313,7 +11368,7 @@ build_ipv6_input_flows_for_lrouter_port( > struct ds *match, struct ds *actions, > struct shash *meter_groups) > { > - if (op->nbrp && (!op->derived)) { > + if (op->nbrp && (!op->l3dgw_port)) { > /* No ingress packets are accepted on a chassisredirect > * port, so no need to program flows for that port. */ > if (op->lrp_networks.n_ipv6_addrs) { > @@ -11339,15 +11394,14 @@ build_ipv6_input_flows_for_lrouter_port( > * router's own IP address. 
*/ > for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { > ds_clear(match); > - if (op->od->l3dgw_port && op == op->od->l3dgw_port > - && op->od->l3redirect_port) { > + if (is_l3dgw_port(op)) { > /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s > * should only be sent from the gateway chassi, so that > * upstream MAC learning points to the gateway chassis. > * Also need to avoid generation of multiple ND replies > * from different chassis. */ > ds_put_format(match, "is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > > build_lrouter_nd_flow(op->od, op, "nd_na_router", > @@ -11358,7 +11412,7 @@ build_ipv6_input_flows_for_lrouter_port( > } > > /* UDP/TCP/SCTP port unreachable */ > - if (!op->od->is_gw_router && !op->od->l3dgw_port) { > + if (!op->od->is_gw_router && !op->od->n_l3dgw_ports) { > for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { > ds_clear(match); > ds_put_format(match, > @@ -11528,7 +11582,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > { > /* No ingress packets are accepted on a chassisredirect > * port, so no need to program flows for that port. 
*/ > - if (op->nbrp && (!op->derived)) { > + if (op->nbrp && (!op->l3dgw_port)) { > if (op->lrp_networks.n_ipv4_addrs) { > /* L3 admission control: drop packets that originate from an > * IPv4 address owned by the router or a broadcast address > @@ -11598,16 +11652,18 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > op->lrp_networks.ipv4_addrs[i].network_s, > op->lrp_networks.ipv4_addrs[i].plen); > > - if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer > + if (op->od->n_l3dgw_ports && op->peer > && op->peer->od->n_localnet_ports) { > bool add_chassis_resident_check = false; > - if (op == op->od->l3dgw_port) { > + const char *json_key; > + if (is_l3dgw_port(op)) { > /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s > * should only be sent from the gateway chassis, so that > * upstream MAC learning points to the gateway chassis. > * Also need to avoid generation of multiple ARP responses > * from different chassis. */ > add_chassis_resident_check = true; > + json_key = op->cr_port->json_key; > } else { > /* Check if the option 'reside-on-redirect-chassis' > * is set to true on the router port. 
If set to true > @@ -11619,12 +11675,14 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > */ > add_chassis_resident_check = smap_get_bool( > &op->nbrp->options, > - "reside-on-redirect-chassis", false); > + "reside-on-redirect-chassis", false) && > + op->od->n_l3dgw_ports == 1; > + json_key = op->od->l3dgw_ports[0]->cr_port->json_key; > } > > if (add_chassis_resident_check) { > ds_put_format(match, " && is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + json_key); > } > } > > @@ -11637,9 +11695,9 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > const char *ip_address; > if (sset_count(&op->od->lb_ips_v4)) { > ds_clear(match); > - if (op == op->od->l3dgw_port) { > + if (is_l3dgw_port(op)) { > ds_put_format(match, "is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > > struct ds load_balancer_ips_v4 = DS_EMPTY_INITIALIZER; > @@ -11657,9 +11715,9 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > > SSET_FOR_EACH (ip_address, &op->od->lb_ips_v6) { > ds_clear(match); > - if (op == op->od->l3dgw_port) { > + if (is_l3dgw_port(op)) { > ds_put_format(match, "is_chassis_resident(%s)", > - op->od->l3redirect_port->json_key); > + op->cr_port->json_key); > } > > build_lrouter_nd_flow(op->od, op, "nd_na", > @@ -11668,7 +11726,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > lflows, meter_groups); > } > > - if (!op->od->is_gw_router && !op->od->l3dgw_port) { > + if (!op->od->is_gw_router && !op->od->n_l3dgw_ports) { > /* UDP/TCP/SCTP port unreachable. */ > for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { > ds_clear(match); > @@ -11765,7 +11823,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, > * exception is on the l3dgw_port where we might need to use a > * different ETH address. 
> */ > - if (op != op->od->l3dgw_port) { > + if (!is_l3dgw_port(op)) { > return; > } > > @@ -11847,12 +11905,12 @@ build_lrouter_in_unsnat_flow(struct hmap *lflows, struct ovn_datapath *od, > ds_clear(actions); > ds_put_format(match, "ip && ip%s.dst == %s && inport == %s", > is_v6 ? "6" : "4", nat->external_ip, > - od->l3dgw_port->json_key); > - if (!distributed && od->l3redirect_port) { > + od->l3dgw_ports[0]->json_key); > + if (!distributed && od->n_l3dgw_ports) { > /* Flows for NAT rules that are centralized are only > * programmed on the gateway chassis. */ > ds_put_format(match, " && is_chassis_resident(%s)", > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } > > if (!strcmp(nat->type, "dnat_and_snat") && stateless) { > @@ -11924,12 +11982,12 @@ build_lrouter_in_dnat_flow(struct hmap *lflows, struct ovn_datapath *od, > ds_clear(match); > ds_put_format(match, "ip && ip%s.dst == %s && inport == %s", > is_v6 ? "6" : "4", nat->external_ip, > - od->l3dgw_port->json_key); > - if (!distributed && od->l3redirect_port) { > + od->l3dgw_ports[0]->json_key); > + if (!distributed && od->n_l3dgw_ports) { > /* Flows for NAT rules that are centralized are only > * programmed on the gateway chassis. */ > ds_put_format(match, " && is_chassis_resident(%s)", > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } > ds_clear(actions); > if (nat->allowed_ext_ips || nat->exempted_ext_ips) { > @@ -11968,7 +12026,7 @@ build_lrouter_out_undnat_flow(struct hmap *lflows, struct ovn_datapath *od, > * > * Note that this only applies for NAT on a distributed router. > */ > - if (!od->l3dgw_port || > + if (!od->n_l3dgw_ports || > (strcmp(nat->type, "dnat") && strcmp(nat->type, "dnat_and_snat"))) { > return; > } > @@ -11976,12 +12034,12 @@ build_lrouter_out_undnat_flow(struct hmap *lflows, struct ovn_datapath *od, > ds_clear(match); > ds_put_format(match, "ip && ip%s.src == %s && outport == %s", > is_v6 ? 
"6" : "4", nat->logical_ip, > - od->l3dgw_port->json_key); > - if (!distributed && od->l3redirect_port) { > + od->l3dgw_ports[0]->json_key); > + if (!distributed && od->n_l3dgw_ports) { > /* Flows for NAT rules that are centralized are only > * programmed on the gateway chassis. */ > ds_put_format(match, " && is_chassis_resident(%s)", > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } > ds_clear(actions); > if (distributed) { > @@ -12054,13 +12112,13 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct ovn_datapath *od, > ds_clear(match); > ds_put_format(match, "ip && ip%s.src == %s && outport == %s", > is_v6 ? "6" : "4", nat->logical_ip, > - od->l3dgw_port->json_key); > - if (!distributed && od->l3redirect_port) { > + od->l3dgw_ports[0]->json_key); > + if (!distributed && od->n_l3dgw_ports) { > /* Flows for NAT rules that are centralized are only > * programmed on the gateway chassis. */ > priority += 128; > ds_put_format(match, " && is_chassis_resident(%s)", > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } > ds_clear(actions); > > @@ -12101,11 +12159,11 @@ build_lrouter_ingress_flow(struct hmap *lflows, struct ovn_datapath *od, > struct ds *actions, struct eth_addr mac, > bool distributed, bool is_v6) > { > - if (od->l3dgw_port && !strcmp(nat->type, "snat")) { > + if (od->n_l3dgw_ports && !strcmp(nat->type, "snat")) { > ds_clear(match); > ds_put_format( > match, "inport == %s && %s == %s", > - od->l3dgw_port->json_key, > + od->l3dgw_ports[0]->json_key, > is_v6 ? 
"ip6.src" : "ip4.src", nat->external_ip); > ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_IP_INPUT, > 120, ds_cstr(match), "next;", > @@ -12123,16 +12181,16 @@ build_lrouter_ingress_flow(struct hmap *lflows, struct ovn_datapath *od, > */ > ds_clear(actions); > > - build_check_pkt_len_action_string(od->l3dgw_port, actions); > + build_check_pkt_len_action_string(od->l3dgw_ports[0], actions); > ds_put_format(actions, REG_INPORT_ETH_ADDR " = %s; next;", > - od->l3dgw_port->lrp_networks.ea_s); > + od->l3dgw_ports[0]->lrp_networks.ea_s); > > ds_clear(match); > ds_put_format(match, > "eth.dst == "ETH_ADDR_FMT" && inport == %s" > " && is_chassis_resident(\"%s\")", > ETH_ADDR_ARGS(mac), > - od->l3dgw_port->json_key, > + od->l3dgw_ports[0]->json_key, > nat->logical_port); > ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_ADMISSION, 50, > ds_cstr(match), ds_cstr(actions), > @@ -12214,7 +12272,7 @@ lrouter_check_nat_entry(struct ovn_datapath *od, const struct nbrec_nat *nat, > /* For distributed router NAT, determine whether this NAT rule > * satisfies the conditions for distributed NAT processing. */ > *distributed = false; > - if (od->l3dgw_port && !strcmp(nat->type, "dnat_and_snat") && > + if (od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat") && > nat->logical_port && nat->external_mac) { > if (eth_addr_from_string(nat->external_mac, mac)) { > *distributed = true; > @@ -12259,7 +12317,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, > * not committed, it would produce ongoing datapath flows with the ct.new > * flag set. Some NICs are unable to offload these flows. 
> */ > - if ((od->is_gw_router || od->l3dgw_port) && > + if ((od->is_gw_router || od->n_l3dgw_ports) && > (od->nbr->n_nat || od->nbr->n_load_balancer)) { > ovn_lflow_add(lflows, od, S_ROUTER_OUT_UNDNAT, 50, > "ip", "flags.loopback = 1; ct_dnat;"); > @@ -12275,7 +12333,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, > /* NAT rules are only valid on Gateway routers and routers with > * l3dgw_port (router has a port with gateway chassis > * specified). */ > - if (!od->is_gw_router && !od->l3dgw_port) { > + if (!od->is_gw_router && !od->n_l3dgw_ports) { > return; > } > > @@ -12316,14 +12374,14 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, > ds_clear(match); > ds_put_format( > match, "outport == %s && %s == %s", > - od->l3dgw_port->json_key, > + od->l3dgw_ports[0]->json_key, > is_v6 ? REG_NEXT_HOP_IPV6 : REG_NEXT_HOP_IPV4, > nat->external_ip); > ds_clear(actions); > ds_put_format( > actions, "eth.dst = %s; next;", > distributed ? nat->external_mac : > - od->l3dgw_port->lrp_networks.ea_s); > + od->l3dgw_ports[0]->lrp_networks.ea_s); > ovn_lflow_add_with_hint(lflows, od, > S_ROUTER_IN_ARP_RESOLVE, > 100, ds_cstr(match), > @@ -12359,7 +12417,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, > ds_put_format(match, > "ip%s.src == %s && outport == %s", > is_v6 ? "6" : "4", nat->logical_ip, > - od->l3dgw_port->json_key); > + od->l3dgw_ports[0]->json_key); > /* Add a rule to drop traffic from a distributed NAT if > * the virtual port has not claimed yet becaused otherwise > * the traffic will be centralized misconfiguring the TOR switch. > @@ -12386,16 +12444,16 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, > * gateway port have ip.dst matching a NAT external IP, then > * loop a clone of the packet back to the beginning of the > * ingress pipeline with inport = outport. 
*/ > - if (od->l3dgw_port) { > + if (od->n_l3dgw_ports) { > /* Distributed router. */ > ds_clear(match); > ds_put_format(match, "ip%s.dst == %s && outport == %s", > is_v6 ? "6" : "4", > nat->external_ip, > - od->l3dgw_port->json_key); > + od->l3dgw_ports[0]->json_key); > if (!distributed) { > ds_put_format(match, " && is_chassis_resident(%s)", > - od->l3redirect_port->json_key); > + od->l3dgw_ports[0]->cr_port->json_key); > } else { > ds_put_format(match, " && is_chassis_resident(\"%s\")", > nat->logical_port); > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > index d7141294e..9c5576d16 100644 > --- a/northd/ovn_northd.dl > +++ b/northd/ovn_northd.dl > @@ -185,7 +185,7 @@ OutProxy_Port_Binding(._uuid = lsp._uuid, > }, > Some{var router_port} = lsp.options.get("router-port"), > var opt_chassis = peer.and_then(|p| p.router.options.get("chassis")), > - var l3dgw_port = peer.and_then(|p| p.router.l3dgw_port), > + var l3dgw_port = peer.and_then(|p| p.router.l3dgw_ports.nth(0)), > (var __type, var options) = { > var options = ["peer" -> router_port]; > match (opt_chassis) { > @@ -241,7 +241,7 @@ OutProxy_Port_Binding(._uuid = lsp._uuid, > Some{rport} -> match ( > (rport.lrp.options.get_bool_def("reside-on-redirect-chassis", false) > and l3dgw_port.is_some()) or > - Some{rport.lrp} == l3dgw_port or > + rport.is_redirect or > (rport.router.options.contains_key("chassis") and > not sw.localnet_ports.is_empty())) { > false -> set_empty(), > @@ -335,7 +335,7 @@ function get_router_load_balancer_ips(router: Intern<Router>, > function get_nat_addresses(rport: Intern<RouterPort>, routable_only: bool): Set<string> = > { > var addresses = set_empty(); > - var has_redirect = rport.router.l3dgw_port.is_some(); > + var has_redirect = not rport.router.l3dgw_ports.is_empty(); > match (eth_addr_from_string(rport.lrp.mac)) { > None -> addresses, > Some{mac} -> { > @@ -402,7 +402,10 @@ function get_nat_addresses(rport: Intern<RouterPort>, routable_only: bool): Set< > /* 
Gratuitous ARP for centralized NAT rules on distributed gateway > * ports should be restricted to the gateway chassis. */ > if (has_redirect) { > - c_addresses = c_addresses ++ " is_chassis_resident(${rport.router.redirect_port_name})" > + c_addresses = c_addresses ++ match (rport.router.l3dgw_ports.nth(0)) { > + None -> "", > + Some {var gw_port} -> " is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})" > + } > } else (); > > addresses.insert(c_addresses) > @@ -417,8 +420,10 @@ function get_garp_nat_addresses(rport: Intern<RouterPort>): string = { > for (ipv4_addr in rport.networks.ipv4_addrs) { > garp_info.push("${ipv4_addr.addr}") > }; > - if (rport.router.redirect_port_name != "") { > - garp_info.push("is_chassis_resident(${rport.router.redirect_port_name})") > + match (rport.router.l3dgw_ports.nth(0)) { > + None -> (), > + Some {var gw_port} -> garp_info.push( > + "is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})") > }; > garp_info.join(" ") > } > @@ -455,7 +460,7 @@ OutProxy_Port_Binding(// lrp._uuid is already in use; generate a new UUID by > .nat_addresses = set_empty(), > .external_ids = lrp.external_ids) :- > DistributedGatewayPort(lrp, lr_uuid), > - LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid), > + DistributedGatewayPortHAChassisGroup(lrp, hacg_uuid), > var redirect_type = match (lrp.options.get("redirect-type")) { > Some{var value} -> ["redirect-type" -> value], > _ -> map_empty() > @@ -511,7 +516,8 @@ sb::Out_Port_Binding(._uuid = pbinding._uuid, > * chassis. RefChassisSet has a row for every logical router. 
*/ > relation RefChassis(lr_uuid: uuid, chassis_uuid: uuid) > RefChassis(lr_uuid, chassis_uuid) :- > - LogicalRouterHAChassisGroup(lr_uuid, _), > + DistributedGatewayPortHAChassisGroup(lrp, _), > + DistributedGatewayPort(lrp, lr_uuid), > ConnectedLogicalRouter[(lr_uuid, set_uuid)], > ConnectedLogicalRouter[(lr2_uuid, set_uuid)], > FirstHopLogicalRouter(lr2_uuid, ls_uuid), > @@ -538,7 +544,8 @@ RefChassisSet(lr_uuid, set_empty()) :- > relation HAChassisGroupRefChassisSet(hacg_uuid: uuid, > chassis_uuids: Set<uuid>) > HAChassisGroupRefChassisSet(hacg_uuid, chassis_uuids) :- > - LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid), > + DistributedGatewayPortHAChassisGroup(lrp, hacg_uuid), > + DistributedGatewayPort(lrp, lr_uuid), > RefChassisSet(lr_uuid, chassis_uuids), > var chassis_uuids = chassis_uuids.group_by(hacg_uuid).union(). > > @@ -4451,7 +4458,7 @@ for (&SwitchPort(.lsp = lsp, > .peer = Some{&RouterPort{.lrp = lrp, > .is_redirect = is_redirect, > .router = &Router{._uuid = lr_uuid, > - .redirect_port_name = redirect_port_name}}}) > + .l3dgw_ports = l3dgw_ports}}}) > if (lsp.addresses.contains("router") and lsp.__type != "external")) > { > Some{var mac} = scan_eth_addr(lrp.mac) in { > @@ -4471,6 +4478,14 @@ for (&SwitchPort(.lsp = lsp, > */ > lrp.options.get_bool_def("reside-on-redirect-chassis", false)) in > var __match = if (add_chassis_resident_check) { > + var redirect_port_name = if (is_redirect) { > + json_string_escape(chassis_redirect_name(lrp.name)) > + } else { > + match (l3dgw_ports.nth(0)) { > + Some {var gw_port} -> json_string_escape(chassis_redirect_name(gw_port.name)), > + None -> "" > + } > + }; > /* The destination lookup flow for the router's > * distributed gateway port MAC address should only be > * programmed on the "redirect-chassis". */ > @@ -4876,13 +4891,8 @@ var rLNIR = rEGBIT_LOOKUP_NEIGHBOR_IP_RESULT() in > > /* Check if we need to learn mac-binding from ARP requests. 
*/ > for (RouterPortNetworksIPv4Addr(rp@&RouterPort{.router = router}, addr)) { > - var is_l3dgw_port = match (router.l3dgw_port) { > - Some{l3dgw_lrp} -> l3dgw_lrp._uuid == rp.lrp._uuid, > - None -> false > - } in > - var has_redirect_port = router.redirect_port_name != "" in > - var chassis_residence = match (is_l3dgw_port and has_redirect_port) { > - true -> " && is_chassis_resident(${router.redirect_port_name})", > + var chassis_residence = match (rp.is_redirect) { > + true -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(rp.lrp.name))})", > false -> "" > } in > var rLNR = rEGBIT_LOOKUP_NEIGHBOR_RESULT() in > @@ -5042,7 +5052,7 @@ relation AddChassisResidentCheck_(lrp: uuid, add_check: bool) > AddChassisResidentCheck_(lrp._uuid, res) :- > &SwitchPort(.peer = Some{&RouterPort{.lrp = lrp, .router = router, .is_redirect = is_redirect}}, > .sw = sw), > - router.l3dgw_port.is_some(), > + not router.l3dgw_ports.is_empty(), > not sw.localnet_ports.is_empty(), > var res = if (is_redirect) { > /* Traffic with eth.src = l3dgw_port->lrp_networks.ea > @@ -5147,7 +5157,8 @@ LogicalRouterArpNdFlow(router, nat, None, rEG_INPORT_ETH_ADDR(), None, false, 90 > * different ETH address. > */ > LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- > - router in &Router(._uuid = lr_uuid, .l3dgw_port = Some{l3dgw_port}), > + router in &Router(._uuid = lr_uuid, .l3dgw_ports = l3dgw_ports), > + Some {var l3dgw_port} = l3dgw_ports.nth(0), > LogicalRouterNAT(lr_uuid, nat), > /* Skip SNAT entries for now, we handle unique SNAT IPs separately > * below. > @@ -5155,7 +5166,8 @@ LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- > nat.nat.__type != "snat". > /* Now handle SNAT entries too, one per unique SNAT IP. 
*/ > LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- > - router in &Router(.l3dgw_port = Some{l3dgw_port}, .snat_ips = snat_ips), > + router in &Router(.l3dgw_ports = l3dgw_ports, .snat_ips = snat_ips), > + Some {var l3dgw_port} = l3dgw_ports.nth(0), > var snat_ip = FlatMap(snat_ips), > (var ip, var nats) = snat_ip, > Some{var nat} = nats.nth(0). > @@ -5185,9 +5197,9 @@ LogicalRouterArpNdFlow(router, nat, Some{lrp}, mac, None, true, 91) :- > * upstream MAC learning points to the gateway chassis. > * Also need to avoid generation of multiple ARP responses > * from different chassis. */ > - match (router.redirect_port_name) { > - "" -> "", > - s -> "is_chassis_resident(${s})" > + match (router.l3dgw_ports.nth(0)) { > + None -> "", > + Some {var gw_port} -> "is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})" > } > ) > }. > @@ -5332,7 +5344,15 @@ for (RouterPortNetworksIPv4Addr(.port = &RouterPort{.lrp = lrp, > var __match = > "arp.spa == ${addr.match_network()}" ++ > if (add_chassis_resident_check) { > - " && is_chassis_resident(${router.redirect_port_name})" > + var redirect_port_name = if (is_redirect) { > + json_string_escape(chassis_redirect_name(lrp.name)) > + } else { > + match (router.l3dgw_ports.nth(0)) { > + None -> "", > + Some {var gw_port} -> json_string_escape(chassis_redirect_name(gw_port.name)) > + } > + }; > + " && is_chassis_resident(${redirect_port_name})" > } else "" in > LogicalRouterArpFlow(.lr = router, > .lrp = Some{lrp}, > @@ -5351,7 +5371,7 @@ for (&RouterPort(.lrp = lrp, > .networks = networks, > .is_redirect = is_redirect)) > var residence_check = match (is_redirect) { > - true -> Some{"is_chassis_resident(${router.redirect_port_name})"}, > + true -> Some{"is_chassis_resident(${json_string_escape(chassis_redirect_name(lrp.name))})"}, > false -> None > } in { > (var all_ips_v4, _) = get_router_load_balancer_ips(router, false) in { > @@ -5421,7 +5441,7 @@ Flow(.logical_datapath = lr_uuid, > for 
(RouterPortNetworksIPv4Addr( > .port = &RouterPort{ > .router = &Router{._uuid = lr_uuid, > - .l3dgw_port = None, > + .l3dgw_ports = vec_empty(), > .is_gateway = false, > .copp = copp}, > .lrp = lrp}, > @@ -5557,7 +5577,7 @@ for (RouterPortNetworksIPv6Addr(.port = &RouterPort{.lrp = lrp, > /* UDP/TCP/SCTP port unreachable */ > for (RouterPortNetworksIPv6Addr( > .port = &RouterPort{.router = &Router{._uuid = lr_uuid, > - .l3dgw_port = None, > + .l3dgw_ports = vec_empty(), > .is_gateway = false, > .copp = copp}, > .lrp = lrp, > @@ -5685,11 +5705,11 @@ for (r in &Router(._uuid = lr_uuid)) { > } > > for (r in &Router(._uuid = lr_uuid, > - .l3dgw_port = l3dgw_port, > + .l3dgw_ports = l3dgw_ports, > .is_gateway = is_gateway, > .nat = nat, > .load_balancer = load_balancer) > - if (l3dgw_port.is_some() or is_gateway) and (not is_empty(nat) or not is_empty(load_balancer))) { > + if (l3dgw_ports.len() > 0 or is_gateway) and (not is_empty(nat) or not is_empty(load_balancer))) { > /* If the router has load balancer or DNAT rules, re-circulate every packet > * through the DNAT zone so that packets that need to be unDNATed in the > * reverse direction get unDNATed. > @@ -5772,7 +5792,7 @@ function lrouter_nat_add_ext_ip_match( > }, > false -> { > /* S_ROUTER_OUT_SNAT uses priority (mask + 1 + 128 + 1) */ > - var is_gw_router = router.l3dgw_port == None; > + var is_gw_router = router.l3dgw_ports.is_empty(); > var mask_1bits = mask.cidr_bits().unwrap_or(8'd0) as integer; > mask_1bits + 2 + { if (not is_gw_router) 128 else 0 } > } > @@ -5877,10 +5897,9 @@ VirtualLogicalPort(Some{logical_port}) :- > * l3dgw_port (router has a port with "redirect-chassis" > * specified). 
*/ > for (r in &Router(._uuid = lr_uuid, > - .l3dgw_port = l3dgw_port, > - .redirect_port_name = redirect_port_name, > + .l3dgw_ports = l3dgw_ports, > .is_gateway = is_gateway) > - if l3dgw_port.is_some() or is_gateway) > + if not l3dgw_ports.is_empty() or is_gateway) > { > for (LogicalRouterNAT(.lr = lr_uuid, .nat = nat)) { > var ipX = nat.external_ip.ipX() in > @@ -5898,7 +5917,7 @@ for (r in &Router(._uuid = lr_uuid, > } in > /* For distributed router NAT, determine whether this NAT rule > * satisfies the conditions for distributed NAT processing. */ > - var mac = match ((l3dgw_port.is_some() and nat.nat.__type == "dnat_and_snat", > + var mac = match ((not l3dgw_ports.is_empty() and nat.nat.__type == "dnat_and_snat", > nat.nat.logical_port, nat.external_mac)) { > (true, Some{_}, Some{mac}) -> Some{mac}, > _ -> None > @@ -5916,7 +5935,7 @@ for (r in &Router(._uuid = lr_uuid, > * not know about the possibility of eventual additional SNAT in > * egress pipeline. */ > if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") { > - if (l3dgw_port == None) { > + if (l3dgw_ports.is_empty()) { > /* Gateway router. */ > var actions = if (stateless) { > "${ipX}.dst=${nat.nat.logical_ip}; next;" > @@ -5930,7 +5949,7 @@ for (r in &Router(._uuid = lr_uuid, > .actions = actions, > .external_ids = stage_hint(nat.nat._uuid)) > }; > - Some{var gwport} = l3dgw_port in { > + Some {var gwport} = l3dgw_ports.nth(0) in { > /* Distributed router. */ > > /* Traffic received on l3dgw_port is subject to NAT. */ > @@ -5940,7 +5959,7 @@ for (r in &Router(._uuid = lr_uuid, > if (mac == None) { > /* Flows for NAT rules that are centralized are only > * programmed on the "redirect-chassis". 
*/ > - " && is_chassis_resident(${redirect_port_name})" > + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" > } else { "" } in > var actions = if (stateless) { > "${ipX}.dst=${nat.nat.logical_ip}; next;" > @@ -5966,7 +5985,7 @@ for (r in &Router(._uuid = lr_uuid, > "" > } in > if (nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat") { > - None = l3dgw_port in > + l3dgw_ports.is_empty() in > var __match = "ip && ${ipX}.dst == ${nat.nat.external_ip}" in > (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( > r, nat, __match, ipX, true, mask) in > @@ -5998,14 +6017,14 @@ for (r in &Router(._uuid = lr_uuid, > .external_ids = stage_hint(nat.nat._uuid)) > }; > > - Some{var gwport} = l3dgw_port in > + Some {var gwport} = l3dgw_ports.nth(0) in > var __match = > "ip && ${ipX}.dst == ${nat.nat.external_ip}" > " && inport == ${json_string_escape(gwport.name)}" ++ > if (mac == None) { > /* Flows for NAT rules that are centralized are only > * programmed on the "redirect-chassis". */ > - " && is_chassis_resident(${redirect_port_name})" > + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" > } else { "" } in > (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( > r, nat, __match, ipX, true, mask) in > @@ -6029,7 +6048,7 @@ for (r in &Router(._uuid = lr_uuid, > }; > > /* ARP resolve for NAT IPs. */ > - Some{var gwport} = l3dgw_port in { > + Some {var gwport} = l3dgw_ports.nth(0) in { > var gwport_name = json_string_escape(gwport.name) in { > if (nat.nat.__type == "snat") { > var __match = "inport == ${gwport_name} && " > @@ -6066,14 +6085,14 @@ for (r in &Router(._uuid = lr_uuid, > * Note that this only applies for NAT on a distributed router. 
> */ > if ((nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat")) { > - Some{var gwport} = l3dgw_port in > + Some {var gwport} = l3dgw_ports.nth(0) in > var __match = > "ip && ${ipX}.src == ${nat.nat.logical_ip}" > " && outport == ${json_string_escape(gwport.name)}" ++ > if (mac == None) { > /* Flows for NAT rules that are centralized are only > * programmed on the "redirect-chassis". */ > - " && is_chassis_resident(${redirect_port_name})" > + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" > } else { "" } in > var actions = > match (mac) { > @@ -6103,7 +6122,7 @@ for (r in &Router(._uuid = lr_uuid, > "" > } in > if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") { > - None = l3dgw_port in > + l3dgw_ports.is_empty() in > var __match = "ip && ${ipX}.src == ${nat.nat.logical_ip}" in > (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( > r, nat, __match, ipX, false, mask) in > @@ -6128,14 +6147,14 @@ for (r in &Router(._uuid = lr_uuid, > .external_ids = stage_hint(nat.nat._uuid)) > }; > > - Some{var gwport} = l3dgw_port in > + Some {var gwport} = l3dgw_ports.nth(0) in > var __match = > "ip && ${ipX}.src == ${nat.nat.logical_ip}" > " && outport == ${json_string_escape(gwport.name)}" ++ > if (mac == None) { > /* Flows for NAT rules that are centralized are only > * programmed on the "redirect-chassis". */ > - " && is_chassis_resident(${redirect_port_name})" > + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" > } else { "" } in > (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( > r, nat, __match, ipX, false, mask) in > @@ -6173,7 +6192,7 @@ for (r in &Router(._uuid = lr_uuid, > * on the l3dgw_port instance where nat->logical_port is > * resident. 
*/ > Some{var mac_addr} = mac in > - Some{var gwport} = l3dgw_port in > + Some{var gwport} = l3dgw_ports.nth(0) in > Some{var logical_port} = nat.nat.logical_port in > var __match = > "eth.dst == ${mac_addr} && inport == ${json_string_escape(gwport.name)}" > @@ -6199,7 +6218,7 @@ for (r in &Router(._uuid = lr_uuid, > * stage is sent out with proper IP/MAC src addresses > */ > Some{var mac_addr} = mac in > - Some{var gwport} = l3dgw_port in > + Some{var gwport} = l3dgw_ports.nth(0) in > Some{var logical_port} = nat.nat.logical_port in > Some{var external_mac} = nat.nat.external_mac in > var __match = > @@ -6218,7 +6237,7 @@ for (r in &Router(._uuid = lr_uuid, > .external_ids = stage_hint(nat.nat._uuid)); > > for (VirtualLogicalPort(nat.nat.logical_port)) { > - Some{var gwport} = l3dgw_port in > + Some{var gwport} = l3dgw_ports.nth(0) in > Flow(.logical_datapath = lr_uuid, > .stage = s_ROUTER_IN_GW_REDIRECT(), > .priority = 80, > @@ -6233,14 +6252,14 @@ for (r in &Router(._uuid = lr_uuid, > * gateway port have ip.dst matching a NAT external IP, then > * loop a clone of the packet back to the beginning of the > * ingress pipeline with inport = outport. */ > - Some{var gwport} = l3dgw_port in > + Some{var gwport} = l3dgw_ports.nth(0) in > /* Distributed router. */ > Some{var port} = match (mac) { > Some{_} -> match (nat.nat.logical_port) { > Some{name} -> Some{json_string_escape(name)}, > None -> None: Option<string> > }, > - None -> Some{redirect_port_name} > + None -> Some{json_string_escape(chassis_redirect_name(gwport.name))} > } in > var __match = "${ipX}.dst == ${nat.nat.external_ip} && outport == ${json_string_escape(gwport.name)} && is_chassis_resident(${port})" in > var regs = { > @@ -6268,7 +6287,7 @@ for (r in &Router(._uuid = lr_uuid, > }; > > /* Handle force SNAT options set in the gateway router. 
*/ > - if (l3dgw_port == None) { > + if (l3dgw_ports.is_empty()) { > var dnat_force_snat_ips = get_force_snat_ip(r.options, "dnat") in > if (not dnat_force_snat_ips.is_empty()) > LogicalRouterForceSnatFlows(.logical_router = lr_uuid, > @@ -6296,14 +6315,13 @@ function nats_contain_vip(nats: Vec<NAT>, vip: v46_ip): bool { > * Gateway routers or router with gateway port. */ > for (RouterLBVIP( > .router = r@&Router{._uuid = lr_uuid, > - .l3dgw_port = l3dgw_port, > - .redirect_port_name = redirect_port_name, > + .l3dgw_ports = l3dgw_ports, > .is_gateway = is_gateway, > .nats = nats}, > .lb = lb, > .vip = vip, > .backends = backends) > - if l3dgw_port.is_some() or is_gateway) > + if not l3dgw_ports.is_empty() or is_gateway) > { > if (backends == "" and not lb.options.get_bool_def("reject", false)) { > for (LoadBalancerEmptyEvents(lb)) { > @@ -6372,8 +6390,8 @@ for (RouterLBVIP( > (110, "") > } in > var __match = match1 ++ match2 ++ > - match ((l3dgw_port, backends != "" or lb.options.get_bool_def("reject", false))) { > - (Some{gwport}, true) -> " && is_chassis_resident(${redirect_port_name})", > + match ((l3dgw_ports.nth(0), backends != "" or lb.options.get_bool_def("reject", false))) { > + (Some{gw_port}, true) -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", > _ -> "" > } in > var snat_for_lb = snat_for_lb(r.options, lb) in > @@ -6385,8 +6403,8 @@ for (RouterLBVIP( > } else { > "" > } ++ > - match ((l3dgw_port, backends != "" or lb.options.get_bool_def("reject", false))) { > - (Some{gwport}, true) -> " && is_chassis_resident(${redirect_port_name})", > + match ((l3dgw_ports.nth(0), backends != "" or lb.options.get_bool_def("reject", false))) { > + (Some {var gw_port}, true) -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", > _ -> "" > } in > var actions = > @@ -6425,7 +6443,7 @@ for (RouterLBVIP( > .external_ids = stage_hint(lb._uuid)) > }; > > - Some{var gwport} = l3dgw_port in > + 
Some{var gwport} = l3dgw_ports.nth(0) in > /* Add logical flows to UNDNAT the load balanced reverse traffic in > * the router egress pipleine stage - S_ROUTER_OUT_UNDNAT if the logical > * router has a gateway router port associated. > @@ -6450,7 +6468,7 @@ for (RouterLBVIP( > var undnat_match = > "${ip_address.ipX()} && (" ++ conds.join(" || ") ++ > ") && outport == ${json_string_escape(gwport.name)} && " > - "is_chassis_resident(${redirect_port_name})" in > + "is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" in > var action = > match (snat_for_lb) { > SkipSNAT -> "flags.skip_snat_for_lb = 1; ct_dnat;", > @@ -6481,14 +6499,14 @@ MeteredFlow(.logical_datapath = r._uuid, > .controller_meter = meter, > .external_ids = stage_hint(lb._uuid)) :- > r in &Router(), > - r.l3dgw_port.is_some() or r.is_gateway, > + r.l3dgw_ports.len() > 0 or r.is_gateway, > LBVIPWithStatus[lbvip@&LBVIPWithStatus{.lb = lb}], > r.load_balancer.contains(lb._uuid), > var __match > = "ct.new && " ++ > get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, true, true) ++ > - match (r.l3dgw_port) { > - Some{gwport} -> " && is_chassis_resident(${r.redirect_port_name})", > + match (r.l3dgw_ports.nth(0)) { > + Some{gw_port} -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", > _ -> "" > }, > var priority = if (lbvip.vip_port != 0) 120 else 110, > @@ -7294,11 +7312,11 @@ Flow(.logical_datapath = router._uuid, > .stage = s_ROUTER_IN_ARP_RESOLVE(), > .priority = 50, > .__match = "outport == ${rp.json_name} && " > - "!is_chassis_resident(${router.redirect_port_name})", > + "!is_chassis_resident(${json_string_escape(chassis_redirect_name(l3dgw_port.name))})", > .actions = "eth.dst = ${rp.networks.ea}; next;", > .external_ids = stage_hint(lrp._uuid)) :- > rp in &RouterPort(.lrp = lrp, .router = router), > - router.redirect_port_name != "", > + Some{var l3dgw_port} = router.l3dgw_ports.nth(0), > Some{"bridged"} = 
lrp.options.get("redirect-type"). > > > @@ -7672,21 +7690,20 @@ MeteredFlow(.logical_datapath = lr_uuid, > * of the traffic to the l3redirect_port which represents > * the central instance of the l3dgw_port. > */ > -for (&Router(._uuid = lr_uuid, > - .l3dgw_port = l3dgw_port, > - .redirect_port_name = redirect_port_name)) > +for (&Router(._uuid = lr_uuid)) > { > /* For traffic with outport == l3dgw_port, if the > * packet did not match any higher priority redirect > * rule, then the traffic is redirected to the central > * instance of the l3dgw_port. */ > - Some{var gwport} = l3dgw_port in > - Flow(.logical_datapath = lr_uuid, > - .stage = s_ROUTER_IN_GW_REDIRECT(), > - .priority = 50, > - .__match = "outport == ${json_string_escape(gwport.name)}", > - .actions = "outport = ${redirect_port_name}; next;", > - .external_ids = stage_hint(gwport._uuid)); > + for (DistributedGatewayPort(lrp, lr_uuid)) { > + Flow(.logical_datapath = lr_uuid, > + .stage = s_ROUTER_IN_GW_REDIRECT(), > + .priority = 50, > + .__match = "outport == ${json_string_escape(lrp.name)}", > + .actions = "outport = ${json_string_escape(chassis_redirect_name(lrp.name))}; next;", > + .external_ids = stage_hint(lrp._uuid)) > + }; > > /* Packets are allowed by default. */ > Flow(.logical_datapath = lr_uuid, > diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml > index 0eef9b739..3598b5073 100644 > --- a/ovn-architecture.7.xml > +++ b/ovn-architecture.7.xml > @@ -731,6 +731,13 @@ > highest-priority gateway that is online. > </p> > > + <p> > + A logical router can have multiple distributed gateway ports, each > + connecting different external networks. However, some features, such as NAT > + and load balancers, are not supported yet for logical routers with more > + than one distributed gateway port configured. 
> + </p> > + > <h4>Physical VLAN MTU Issues</h4> > > <p> > @@ -1968,8 +1975,9 @@ > > <p> > If the logical router doesn't have a distributed gateway port connecting > - to the localnet logical switch which provides external connectivity, > - then this option is ignored by <code>OVN</code>. > + to the localnet logical switch which provides external connectivity, or > + if it has more than one distributed gateway ports, then this option is > + ignored by <code>OVN</code>. > </p> > > <p> > @@ -2086,6 +2094,13 @@ > a tunnel. > </p> > > + <p> > + If the logical router doesn't have a distributed gateway port connecting > + to the localnet logical switch which provides external connectivity, or > + if it has more than one distributed gateway ports, then this option is > + ignored by <code>OVN</code>. > + </p> > + > <p> > Following happens for bridged redirection: > </p> > diff --git a/ovn-nb.xml b/ovn-nb.xml > index c1176e81f..ec51b5608 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -2032,13 +2032,14 @@ > > <column name="nat"> > One or more NAT rules for the router. NAT rules only work on > - Gateway routers, and on distributed routers with logical gateway ports. > + Gateway routers, and on distributed routers with one and only one > + distributed gateway port. > </column> > > <column name="load_balancer"> > Load balance a virtual ip address to a set of logical port ip > addresses. Load balancer rules only work on the Gateway routers or > - routers with distributed gateway ports. > + routers with one and only one distributed gateway port. > </column> > > <group title="Naming"> > @@ -2453,8 +2454,7 @@ > If either of these are set, this logical router port represents a > distributed gateway port that connects this router to a > logical switch with a <code>localnet</code> port or a > - connection to another OVN deployment. There may be at most > - one such logical router port on each logical router. > + connection to another OVN deployment. 
> </p> > > <p> > @@ -2476,8 +2476,16 @@ > </p> > > <p> > - When more than one gateway chassis is specified, OVN only uses > - one at a time. OVN can rely on OVS BFD implementation to monitor > + There can be more than one distributed gateway ports configured > + on each logical router, each connecting to different L2 segments. > + However, features such as NAT and load-balancer are not supported > + on logical routers with more than one distributed gateway ports. > + </p> > + > + <p> > + For each distributed gateway port, it may have more than one gateway > + chassises. When more than one gateway chassis is specified, OVN only > + uses one at a time. OVN can rely on OVS BFD implementation to monitor > gateway connectivity, preferring the highest-priority gateway > that is online. Priorities are specified in the <code>priority</code> > column of <ref table="Gateway_Chassis"/> or <ref table="HA_Chassis"/>. > @@ -2563,8 +2571,8 @@ > </p> > > <p> > - OVN honors this option only if the logical router has a distributed > - gateway port and if the LRP's peer switch has a > + OVN honors this option only if the logical router has one and only > + one distributed gateway port and if the LRP's peer switch has a > <code>localnet</code> port. > </p> > </column> > @@ -2588,7 +2596,8 @@ > <p> > Setting this option to <code>overlay</code> or leaving it unset has > no effect. This option may usefully be set only on a distributed > - gateway port. It is otherwise ignored. > + gateway port when there is one and only one distributed gateway > + port on the logical router. It is otherwise ignored. 
> </p> > </column> > </group> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > index 2098b1c19..27c93a8b9 100644 > --- a/tests/ovn-northd.at > +++ b/tests/ovn-northd.at > @@ -4968,3 +4968,85 @@ AT_CHECK([grep -e "chk_pkt_len" -e "lr_in_larger_pkts" lr0flows | sort], [0], [d > > AT_CLEANUP > ]) > + > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([ovn-northd -- lr multiple gw ports]) > +AT_KEYWORDS([multiple-l3dgw-ports]) > +ovn_start > + > +# Logical network: > +# 1 Logical Router, 3 bridged Logical Switches, > +# 1 gateway chassis attached to each corresponding LRP. > +# > +# | S1 (gw1) > +# | > +# ls ---- DR -- S3 (gw3) > +# (20.0.0.0/24) | > +# | S2 (gw2) > +# > +# Validate basic LR logical flows. > + > +check ovn-sbctl chassis-add gw1 geneve 127.0.0.1 > +check ovn-sbctl chassis-add gw2 geneve 128.0.0.1 > +check ovn-sbctl chassis-add gw3 geneve 129.0.0.1 > + > +check ovn-nbctl lr-add DR > +check ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24 > +check ovn-nbctl lrp-add DR DR-S2 03:ac:10:01:00:01 172.16.2.1/24 > +check ovn-nbctl lrp-add DR DR-S3 04:ac:10:01:00:01 172.16.3.1/24 > +check ovn-nbctl lrp-add DR DR-ls 05:ac:10:01:00:01 20.0.0.1/24 > + > +check ovn-nbctl ls-add S1 > +check ovn-nbctl lsp-add S1 S1-DR > +check ovn-nbctl lsp-set-type S1-DR router > +check ovn-nbctl lsp-set-addresses S1-DR router > +check ovn-nbctl --wait=sb lsp-set-options S1-DR router-port=DR-S1 > + > +check ovn-nbctl ls-add S2 > +check ovn-nbctl lsp-add S2 S2-DR > +check ovn-nbctl lsp-set-type S2-DR router > +check ovn-nbctl lsp-set-addresses S2-DR router > +check ovn-nbctl --wait=sb lsp-set-options S2-DR router-port=DR-S2 > + > +check ovn-nbctl ls-add S3 > +check ovn-nbctl lsp-add S3 S3-DR > +check ovn-nbctl lsp-set-type S3-DR router > +check ovn-nbctl lsp-set-addresses S3-DR router > +check ovn-nbctl --wait=sb lsp-set-options S3-DR router-port=DR-S3 > + > +check ovn-nbctl ls-add ls > +check ovn-nbctl lsp-add ls ls-DR > +check ovn-nbctl lsp-set-type ls-DR router > +check 
ovn-nbctl lsp-set-addresses ls-DR router > +check ovn-nbctl --wait=sb lsp-set-options ls-DR router-port=DR-ls > + > +check ovn-nbctl lrp-set-gateway-chassis DR-S1 gw1 > +check ovn-nbctl lrp-set-gateway-chassis DR-S2 gw2 > +check ovn-nbctl lrp-set-gateway-chassis DR-S3 gw3 > + > +check ovn-nbctl --wait=sb sync > + > +ovn-sbctl dump-flows DR > lrflows > +AT_CAPTURE_FILE([lrflows]) > + > +# Check the flows in lr_in_admission stage > +AT_CHECK([grep lr_in_admission lrflows | grep cr-DR | sort], [0], [dnl > + table=0 (lr_in_admission ), priority=50 , match=(eth.dst == 02:ac:10:01:00:01 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1")), action=(xreg0[[0..47]] = 02:ac:10:01:00:01; next;) > + table=0 (lr_in_admission ), priority=50 , match=(eth.dst == 03:ac:10:01:00:01 && inport == "DR-S2" && is_chassis_resident("cr-DR-S2")), action=(xreg0[[0..47]] = 03:ac:10:01:00:01; next;) > + table=0 (lr_in_admission ), priority=50 , match=(eth.dst == 04:ac:10:01:00:01 && inport == "DR-S3" && is_chassis_resident("cr-DR-S3")), action=(xreg0[[0..47]] = 04:ac:10:01:00:01; next;) > +]) > +# Check the flows in lr_in_lookup_neighbor stage > +AT_CHECK([grep lr_in_lookup_neighbor lrflows | grep cr-DR | sort], [0], [dnl > + table=1 (lr_in_lookup_neighbor), priority=100 , match=(inport == "DR-S1" && arp.spa == 172.16.1.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S1")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;) > + table=1 (lr_in_lookup_neighbor), priority=100 , match=(inport == "DR-S2" && arp.spa == 172.16.2.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S2")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;) > + table=1 (lr_in_lookup_neighbor), priority=100 , match=(inport == "DR-S3" && arp.spa == 172.16.3.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S3")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;) > +]) > +# Check the flows in lr_in_gw_redirect stage > +AT_CHECK([grep lr_in_gw_redirect lrflows | grep cr-DR | 
sort], [0], [dnl > + table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "DR-S1"), action=(outport = "cr-DR-S1"; next;) > + table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "DR-S2"), action=(outport = "cr-DR-S2"; next;) > + table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "DR-S3"), action=(outport = "cr-DR-S3"; next;) > +]) > + > +AT_CLEANUP > +]) > diff --git a/tests/ovn.at b/tests/ovn.at > index 7ae136ad9..b571bbb49 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -27529,3 +27529,310 @@ AT_CHECK([ovs-ofctl dump-flows br-int table=44 | grep 10.0.0.144], [0], [ignore] > OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) > + > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([ovn -- lr multiple gw ports]) > +AT_KEYWORDS([multiple-l3dgw-ports]) > +ovn_start > + > +# Logical network: > +# 1 LR, 3 Logical Switches, > +# 1 gateway chassis attached to each corresponding LRP. > +# > +# | S1 (gw1) > +# | > +# ls ---- DR -- S3 (gw3) > +# (20.0.0.0/24) | > +# | S2 (gw2) > +# > +# S1 - VLAN 1000 > +# S2 - VLAN 2000 > +# S3 - VLAN 3000 > +# > +# 5 chassis(s), HV1----HV5 > +# > +# HV1 - VIF11 > +# HV2 - Gateway chassis gw1 > +# HV3 - Gateway chassis gw2 > +# HV4 - Gateway chassis gw3 > +# HV5 - North endpoint > + > +ovn-nbctl lr-add DR > +ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24 > +ovn-nbctl lrp-add DR DR-S2 08:ac:10:01:00:01 10.0.0.1/24 > +ovn-nbctl lrp-add DR DR-S3 04:ac:10:01:00:01 192.168.0.1/24 > +ovn-nbctl lrp-add DR DR-ls 06:ac:10:01:00:01 20.0.0.1/24 > + > +ovn-nbctl ls-add S1 > +ovn-nbctl lsp-add S1 S1-DR > +ovn-nbctl lsp-set-type S1-DR router > +ovn-nbctl lsp-set-addresses S1-DR router > +ovn-nbctl --wait=sb lsp-set-options S1-DR router-port=DR-S1 > +ovn-nbctl lsp-add S1 ln1 "" 1000 > +ovn-nbctl lsp-set-addresses ln1 unknown > +ovn-nbctl lsp-set-type ln1 localnet > +ovn-nbctl lsp-set-options ln1 network_name=phys > + > +ovn-nbctl ls-add S2 > +ovn-nbctl lsp-add S2 S2-DR > +ovn-nbctl lsp-set-type S2-DR router > +ovn-nbctl lsp-set-addresses 
S2-DR router > +ovn-nbctl --wait=sb lsp-set-options S2-DR router-port=DR-S2 > +ovn-nbctl lsp-add S2 ln2 "" 2000 > +ovn-nbctl lsp-set-addresses ln2 unknown > +ovn-nbctl lsp-set-type ln2 localnet > +ovn-nbctl lsp-set-options ln2 network_name=phys > + > +ovn-nbctl ls-add S3 > +ovn-nbctl lsp-add S3 S3-DR > +ovn-nbctl lsp-set-type S3-DR router > +ovn-nbctl lsp-set-addresses S3-DR router > +ovn-nbctl --wait=sb lsp-set-options S3-DR router-port=DR-S3 > +ovn-nbctl lsp-add S3 ln3 "" 3000 > +ovn-nbctl lsp-set-addresses ln3 unknown > +ovn-nbctl lsp-set-type ln3 localnet > +ovn-nbctl lsp-set-options ln3 network_name=phys > + > +ovn-nbctl ls-add ls > +ovn-nbctl lsp-add ls ls-DR > +ovn-nbctl lsp-set-type ls-DR router > +ovn-nbctl lsp-set-addresses ls-DR router > +ovn-nbctl --wait=sb lsp-set-options ls-DR router-port=DR-ls > + > +# Add the lsp lp11 to ls. This will map to VIF11. > +ovn-nbctl lsp-add ls lp11 > +ovn-nbctl lsp-set-addresses lp11 "f0:00:00:00:00:10 20.0.0.10" > +ovn-nbctl lsp-set-port-security lp11 f0:00:00:00:00:10 > + > +# Add the Northbound endpoint, lp-north1 > +ovn-nbctl ls-add ls-north1 > +ovn-nbctl lsp-add ls-north1 ln4 "" 1000 > +ovn-nbctl lsp-set-addresses ln4 unknown > +ovn-nbctl lsp-set-type ln4 localnet > +ovn-nbctl lsp-set-options ln4 network_name=phys > + > +ovn-nbctl lsp-add ls-north1 lp-north1 > +ovn-nbctl lsp-set-addresses lp-north1 "f0:f0:00:00:00:11 172.16.1.10" > +ovn-nbctl lsp-set-port-security lp-north1 f0:f0:00:00:00:11 > + > +# Add the Northbound endpoint, lp-north2 > +ovn-nbctl ls-add ls-north2 > +ovn-nbctl lsp-add ls-north2 ln5 "" 2000 > +ovn-nbctl lsp-set-addresses ln5 unknown > +ovn-nbctl lsp-set-type ln5 localnet > +ovn-nbctl lsp-set-options ln5 network_name=phys > + > +ovn-nbctl lsp-add ls-north2 lp-north2 > +ovn-nbctl lsp-set-addresses lp-north2 "f0:f0:00:00:00:22 10.0.0.10" > +ovn-nbctl lsp-set-port-security lp-north2 f0:f0:00:00:00:22 > + > +# Add the Northbound endpoint, lp-north3 > +ovn-nbctl ls-add ls-north3 > +ovn-nbctl lsp-add 
ls-north3 ln6 "" 3000 > +ovn-nbctl lsp-set-addresses ln6 unknown > +ovn-nbctl lsp-set-type ln6 localnet > +ovn-nbctl lsp-set-options ln6 network_name=phys > + > +ovn-nbctl lsp-add ls-north3 lp-north3 > +ovn-nbctl lsp-set-addresses lp-north3 "f0:f0:00:00:00:33 192.168.0.10" > +ovn-nbctl lsp-set-port-security lp-north3 f0:f0:00:00:00:33 > + > +# Add 5 chassis > +net_add n1 > +for i in 1 2 3 4 5; do > + sim_add hv$i > + as hv$i > + ovs-vsctl add-br br-phys > + ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys > + ovn_attach n1 br-phys 192.168.0.$i 24 $encap > +done > + > +# Add a vif on HV1 > +as hv1 ovs-vsctl add-port br-int vif11 -- \ > + set Interface vif11 external-ids:iface-id=lp11 \ > + options:tx_pcap=hv1/vif11-tx.pcap \ > + options:rxq_pcap=hv1/vif11-rx.pcap \ > + ofport-request=11 > +OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up lp11` = xup]) > + > +as hv5 ovs-vsctl add-port br-int vif-north1 -- \ > + set Interface vif-north1 external-ids:iface-id=lp-north1 \ > + options:tx_pcap=hv5/vif-north1-tx.pcap \ > + options:rxq_pcap=hv5/vif-north1-rx.pcap \ > + ofport-request=44 > + > +as hv5 ovs-vsctl add-port br-int vif-north2 -- \ > + set Interface vif-north2 external-ids:iface-id=lp-north2 \ > + options:tx_pcap=hv5/vif-north2-tx.pcap \ > + options:rxq_pcap=hv5/vif-north2-rx.pcap \ > + ofport-request=45 > + > +as hv5 ovs-vsctl add-port br-int vif-north3 -- \ > + set Interface vif-north3 external-ids:iface-id=lp-north3 \ > + options:tx_pcap=hv5/vif-north3-tx.pcap \ > + options:rxq_pcap=hv5/vif-north3-rx.pcap \ > + ofport-request=46 > + > +ovn-nbctl lrp-set-gateway-chassis DR-S1 hv2 > +ovn-nbctl lrp-set-gateway-chassis DR-S2 hv3 > +ovn-nbctl lrp-set-gateway-chassis DR-S3 hv4 > + > +ovn-nbctl --wait=sb sync > +OVN_POPULATE_ARP > + > +vif_to_ls () { > + case ${1} in dnl ( > + vif?[[11]]) echo ls ;; dnl ( > + vif-north1) echo ls-north1 ;; dnl ( > + vif-north2) echo ls-north2 ;; dnl ( > + vif-north3) echo ls-north3 ;; dnl ( > + *) AT_FAIL_IF([:]) ;; > + 
esac > +} > + > +vif_to_hv () { > + case ${1} in dnl ( > + vif[[1]]?) echo hv1 ;; dnl ( > + vif-north1) echo hv5 ;; dnl ( > + vif-north2) echo hv5 ;; dnl ( > + vif-north3) echo hv5 ;; dnl ( > + *) AT_FAIL_IF([:]) ;; > + esac > +} > + > +vif_to_lrp () { > + case ${1} in dnl ( > + vif?[[11]]) echo DR-ls ;; dnl ( > + *) AT_FAIL_IF([:]) ;; > + esac > + > +} > + > +ip_to_hex() { > + printf "%02x%02x%02x%02x" "${@}" > +} > + > +# test_arp INPORT SHA SPA TPA > +# > +# Causes a packet to be received on INPORT. The packet is an ARP > +# request with SHA, SPA, and TPA as specified. > +test_arp() { > + local inport=$1 sha=$2 spa=$3 tpa=$4 > + local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa} > + hv=`vif_to_hv $inport` > + as $hv ovs-appctl netdev-dummy/receive $inport $request > +} > + > + > +test_ip() { > + # This packet has bad checksums but logical L3 routing doesn't check. > + local inport=${1} src_mac=${2} dst_mac=${3} src_ip=${4} dst_ip=${5} outport=${6} > + local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000 > + shift; shift; shift; shift; shift > + hv=`vif_to_hv $inport` > + as $hv ovs-appctl netdev-dummy/receive $inport $packet > + in_ls=`vif_to_ls $inport` > + for outport; do > + out_ls=`vif_to_ls $outport` > + if test $in_ls = $out_ls; then > + # Ports on the same logical switch receive exactly the same packet. > + echo $packet > + else > + # Routing decrements TTL and updates source and dest MAC > + # (and checksum). 
> + # For North-South, packet will come via gateway chassis, i.e hv3 > + if test $inport = vif-north1; then > + echo f0000000001006ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected > + fi > + if test $outport = vif-north1; then > + echo f0f00000001102ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected > + fi > + if test $outport = vif-north2; then > + echo f0f00000002208ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected > + fi > + if test $outport = vif-north3; then > + echo f0f00000003304ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected > + fi > + fi >> $outport.expected > + done > +} > + > +echo "------ OVN dump ------" > +ovn-nbctl show > +ovn-sbctl show > +ovn-sbctl list port_binding > +ovn-sbctl list mac_binding > +ovn-sbctl list datapath_binding > + > +ovn-sbctl dump-flows DR > +ovn-sbctl dump-flows S1 > +ovn-sbctl dump-flows ls > + > +echo "------ hv1 dump ------" > +as hv1 ovs-vsctl show > +as hv1 ovs-vsctl list Open_Vswitch > +as hv1 ovs-ofctl dump-flows br-int > + > +echo "------ hv2 dump ------" > +as hv2 ovs-vsctl show > +as hv2 ovs-vsctl list Open_Vswitch > +as hv2 ovs-ofctl dump-flows br-int > + > +echo "------ hv3 dump ------" > +as hv3 ovs-vsctl show > +as hv3 ovs-vsctl list Open_Vswitch > +as hv3 ovs-ofctl dump-flows br-int > + > +echo "------ hv4 dump ------" > +as hv4 ovs-vsctl show > +as hv4 ovs-vsctl list Open_Vswitch > +as hv5 ovs-ofctl dump-flows br-int > + > +# N-S with lp-north1 > +echo "Send Dummy ARP" > +sip=`ip_to_hex 172 16 1 10` > +tip=`ip_to_hex 172 16 1 50` > +test_arp vif-north1 f0f000000011 $sip $tip > + > +echo "Send traffic North to South" > +sip=`ip_to_hex 172 16 1 10` > +dip=`ip_to_hex 20 0 0 10` > +test_ip vif-north1 f0f000000011 02ac10010001 $sip $dip vif11 > +# Confirm that North to south traffic works fine. 
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif11-tx.pcap], [vif11.expected]) > + > +echo "Send traffic South to North1" > +sip=`ip_to_hex 20 0 0 10` > +dip=`ip_to_hex 172 16 1 10` > +test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north1 > +# Confirm that South to North traffic works fine. > +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north1-tx.pcap], [vif-north1.expected]) > + > +# N-S with lp-north2 > +echo "Send Dummy ARP" > +sip=`ip_to_hex 10 0 0 10` > +tip=`ip_to_hex 10 0 0 50` > +test_arp vif-north2 f0f000000022 $sip $tip > + > +echo "Send traffic South to North2" > +sip=`ip_to_hex 20 0 0 10` > +dip=`ip_to_hex 10 0 0 10` > +test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north2 > +# Confirm that South to North traffic works fine. > +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north2-tx.pcap], [vif-north2.expected]) > + > +# N-S with lp-north3 > +echo "Send Dummy ARP" > +sip=`ip_to_hex 192 168 0 10` > +tip=`ip_to_hex 192 168 0 50` > +test_arp vif-north3 f0f000000033 $sip $tip > + > +echo "Send traffic South to North3" > +sip=`ip_to_hex 20 0 0 10` > +dip=`ip_to_hex 192 168 0 10` > +test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north3 > +# Confirm that South to North traffic works fine. > +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north3-tx.pcap], [vif-north3.expected]) > + > +AT_CLEANUP > +]) > -- > 2.30.2 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Fri, Aug 6, 2021 at 2:13 PM Numan Siddique <numans@ovn.org> wrote:
>
> On Thu, Aug 5, 2021 at 11:40 AM Han Zhou <hzhou@ovn.org> wrote:
> >
> > From: Ankur Sharma <ankurmnnit2004@gmail.com>
> >
> > By default, OVN support only one DGP (distributed gateway port) per
> > logical router. While a single DGP port suffices for most of the North
> > South connectivity, there are requirements where a logical router could
> > be connected to multiple external networks and based on routing decision
> > packet could go to different ones.
> >
> > This patch adds flexibility of having multiple DGPs per logical router.
> >
> > Changes can classified as following:
> > a. Data structure changes to allow multiple DGPs per ovn_datapath.
> >
> > b. Consumption of new data structure in logical flows for
> >    individual features.
> >
> > c. Features that require changes are:
> >    i. Regular NS traffic flow.
> >    ii. Network Address Translation.
> >    iii. Load Balancer
> >    iv. Gateway_mtu.
> >    v. reside-on-redirect-chassis
> >    vi. Misc code sections that assumed a single DGP.
> >
> > d. Except for reside-on-redirect-chassis all the other features
> >    could be extended to multiple DGPs. Reside on redirect
> >    chassis with its current specification could not be extended
> >    and hence should be used only with the logical router that
> >    has a single DGP.
> >
> > This patch doesn't support NAT & load-balancer features for multiple
> > DGPs yet, but added validations that disables NAT/load-balancer
> > features when there are more than one DGP configured per router.
> >
> > Signed-off-by: Ankur Sharma <ankurmnnit2004@gmail.com>
> > Co-authored-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> > Signed-off-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> > Co-authored-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> > Signed-off-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> > Co-authored-by: Han Zhou <hzhou@ovn.org>
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
>
>
> Hi Han,
>
> Thanks for v2. I did some testing with this patch in a simple 2 node
> setup (using ovn-fake-multinode)
>
> Below are the logical resources created
>
> -----------------------------
> [root@ovn-central ~]# ovn-nbctl show
> switch 3a3c2522-fcce-49e5-8334-8a72547e7da6 (sw0)
>     port sw0-port4
>         addresses: ["50:54:00:00:00:06 dynamic"]
>     port sw0-port1
>         addresses: ["50:54:00:00:00:03 10.0.0.3 1000::3"]
>     port sw0-port2
>         addresses: ["50:54:00:00:00:04 10.0.0.4 1000::4"]
>     port sw0-lr0
>         type: router
>         router-port: lr0-sw0
>     port sw0-port3
>         addresses: ["50:54:00:00:00:05 dynamic"]
> switch 0573bbd7-fca7-4f06-84ac-1939f879fd5f (sw1)
>     port sw1-lr0
>         type: router
>         router-port: lr0-sw1
>     port sw1-port1
>         addresses: ["40:54:00:00:00:03 20.0.0.3 2000::3"]
> router c7cd8dab-4e6d-45f3-a8f4-3653d25ab476 (lr0)
>     port lr0-sw1
>         mac: "00:00:00:00:ff:02"
>         networks: ["20.0.0.1/24", "2000::a/64"]
>     port lr0-sw0
>         mac: "00:00:00:00:ff:01"
>         networks: ["10.0.0.1/24", "1000::a/64"]
>         gateway chassis: [ovn-chassis-1 ovn-chassis-2]
> [root@ovn-central ~]#
> [root@ovn-central ~]#
> [root@ovn-central ~]# ovn-sbctl show
> Chassis ovn-chassis-1
>     hostname: ovn-chassis-1
>     Encap geneve
>         ip: "170.168.0.4"
>         options: {csum="true"}
>     Port_Binding sw0-port1
>     Port_Binding sw0-port3
> Chassis ovn-gw-1
>     hostname: ovn-gw-1
>     Encap geneve
>         ip: "170.168.0.3"
>         options: {csum="true"}
> Chassis ovn-chassis-2
>     hostname: ovn-chassis-2
>     Encap geneve
>         ip: "170.168.0.5"
>         options: {csum="true"}
>     Port_Binding sw1-port1
>     Port_Binding sw0-port4
>     Port_Binding cr-lr0-sw0
> ---------
>
>
> As you can see the logical router port lr0-sw0 is a distributed gw
> port scheduled on chassis - ovn-chassis-2.
>
> The issue I see is : I'm not able to ping from sw0-port1 (10.0.0.3,
> claimed by chassis ovn-chassis-1) to 10.0.0.1 and I'm not able to ping
> to sw1-port1 (20.0.0.3).
> I'm able to ping the same from sw0-port4 to 10.0.0.1 and 20.0.0.3
> (claimed by chassis ovn-chassis-2).
>
> If I move cr-lr0-sw0 to ovn-chassis-1, then ping from sw0-port1 works
> but not from sw0-port4.
>
> Is this expected ?
>

Yes, this is expected. I think what you are experiencing is that when sw0-port1 is pinging 10.0.0.1 or 20.0.0.3 it needs to send an ARP to resolve 10.0.0.1 first, and by design of the chassis-redirect port it doesn't send out an ARP response if the is_chassis_resident() check fails for the DGP. I think this is not a problem. In addition, as you see your test only has a single DGP, so it is not related to this patch, which just adds multiple DGP support and doesn't change this behavior.

> Does it mean that if a logical router has multiple gateway router
> ports connecting to geneve logical switches, then all the logical
> ports of those switches
> should be only bound on the corresponding gateway chassis ? I
> understand this is the case with ovn-k8s.
>
> If this is a restriction, I think we should document it.

Yes, maybe it is helpful to make it clearer in the documentation for DGP (regardless of one DGP or multiple DGPs). I can add it in a separate patch since it is not related to this patch.

>
> Otherwise the patch LGTM and I'm fine with the feature. The only
> issue I see is that of semantics. If router port is a gateway router
> port, then its
> a peer's logical switch is ideally expected to have a localnet port.
> But in the case of ovn-k8s, that's not the case. It is for this
> reason I thought
> pinning the logical switch to a particular chassis could be better.
>
> If you think having multiple gw router ports is better, then I'd
> suggest documenting the limitations I mentioned above. Also please
> see
> a small nit below. And you can consider my Ack with these addressed.
>
> Acked-by: Numan Siddique <numans@ovn.org>
>
> Thanks
> Numan
>

Thanks Numan! I applied the patch as is. Please see my explanation below.

> >
> > +static bool
> > +is_l3dgw_port(const struct ovn_port *op)
> > +{
> > +    return op->cr_port;
>
> Since the return type is bool, I'd suggest - return !!op->cr_port;
>

To my understanding, "!!" is not needed here since we don't require the value to be 0/1. In addition, there was a recent discussion regarding "!!" concluding that even if we need the value to be 0/1, there is no need to add "!!" any more: https://mail.openvswitch.org/pipermail/ovs-dev/2021-May/382820.html

Thanks,
Han
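
As an aside on the "!!" point above, here is a minimal sketch, assuming the build uses the C99 `bool` type from `<stdbool.h>`. The `struct port` type and the function names are hypothetical stand-ins for illustration, not the real `struct ovn_port` or northd code. Under that assumption, converting a pointer to `bool` already yields false for NULL and true otherwise, so both spellings return the same value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ovn_port; only the field needed here. */
struct port {
    struct port *cr_port;   /* non-NULL when a chassis-redirect port exists */
};

/* Implicit conversion: with C99 bool, a pointer converts to false if it
 * compares equal to NULL and to true otherwise. */
static bool
has_cr_port(const struct port *op)
{
    return op->cr_port;
}

/* Explicit normalization with "!!"; same result when bool is C99 _Bool. */
static bool
has_cr_port_explicit(const struct port *op)
{
    return !!op->cr_port;
}

/* Small self-check that exercises both variants. */
static void
check_bool_conversion(void)
{
    struct port cr = { .cr_port = NULL };
    struct port dgp = { .cr_port = &cr };

    assert(has_cr_port(&dgp) == true);
    assert(has_cr_port(&cr) == false);
    assert(has_cr_port(&dgp) == has_cr_port_explicit(&dgp));
    assert(has_cr_port(&cr) == has_cr_port_explicit(&cr));
}
```

The equivalence relies on `bool` being a genuine boolean type. If `bool` were a typedef for an integer type (an assumption that does not apply with `<stdbool.h>`), the implicit conversion could truncate the pointer value, which is the case "!!" was traditionally used to guard against.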
On Wed, Aug 11, 2021 at 3:48 PM Han Zhou <hzhou@ovn.org> wrote:
>
> On Fri, Aug 6, 2021 at 2:13 PM Numan Siddique <numans@ovn.org> wrote:
> >
> > On Thu, Aug 5, 2021 at 11:40 AM Han Zhou <hzhou@ovn.org> wrote:
> > >
> > > From: Ankur Sharma <ankurmnnit2004@gmail.com>
> > >
> > > By default, OVN support only one DGP (distributed gateway port) per
> > > logical router. While a single DGP port suffices for most of the North
> > > South connectivity, there are requirements where a logical router could
> > > be connected to multiple external networks and based on routing decision
> > > packet could go to different ones.
> > >
> > > This patch adds flexibility of having multiple DGPs per logical router.
> > >
> > > Changes can classified as following:
> > > a. Data structure changes to allow multiple DGPs per ovn_datapath.
> > >
> > > b. Consumption of new data structure in logical flows for
> > >    individual features.
> > >
> > > c. Features that require changes are:
> > >    i. Regular NS traffic flow.
> > >    ii. Network Address Translation.
> > >    iii. Load Balancer
> > >    iv. Gateway_mtu.
> > >    v. reside-on-redirect-chassis
> > >    vi. Misc code sections that assumed a single DGP.
> > >
> > > d. Except for reside-on-redirect-chassis all the other features
> > >    could be extended to multiple DGPs. Reside on redirect
> > >    chassis with its current specification could not be extended
> > >    and hence should be used only with the logical router that
> > >    has a single DGP.
> > >
> > > This patch doesn't support NAT & load-balancer features for multiple
> > > DGPs yet, but added validations that disables NAT/load-balancer
> > > features when there are more than one DGP configured per router.
> > >
> > > Signed-off-by: Ankur Sharma <ankurmnnit2004@gmail.com>
> > > Co-authored-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> > > Signed-off-by: Dhathri Purohith <dhathri.purohith@nutanix.com>
> > > Co-authored-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> > > Signed-off-by: Abhiram Sangana <sangana.abhiram@nutanix.com>
> > > Co-authored-by: Han Zhou <hzhou@ovn.org>
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> >
> > Hi Han,
> >
> > Thanks for v2. I did some testing with this patch in a simple 2 node
> > setup (using ovn-fake-multinode)
> >
> > Below are the logical resources created
> >
> > -----------------------------
> > [root@ovn-central ~]# ovn-nbctl show
> > switch 3a3c2522-fcce-49e5-8334-8a72547e7da6 (sw0)
> >     port sw0-port4
> >         addresses: ["50:54:00:00:00:06 dynamic"]
> >     port sw0-port1
> >         addresses: ["50:54:00:00:00:03 10.0.0.3 1000::3"]
> >     port sw0-port2
> >         addresses: ["50:54:00:00:00:04 10.0.0.4 1000::4"]
> >     port sw0-lr0
> >         type: router
> >         router-port: lr0-sw0
> >     port sw0-port3
> >         addresses: ["50:54:00:00:00:05 dynamic"]
> > switch 0573bbd7-fca7-4f06-84ac-1939f879fd5f (sw1)
> >     port sw1-lr0
> >         type: router
> >         router-port: lr0-sw1
> >     port sw1-port1
> >         addresses: ["40:54:00:00:00:03 20.0.0.3 2000::3"]
> > router c7cd8dab-4e6d-45f3-a8f4-3653d25ab476 (lr0)
> >     port lr0-sw1
> >         mac: "00:00:00:00:ff:02"
> >         networks: ["20.0.0.1/24", "2000::a/64"]
> >     port lr0-sw0
> >         mac: "00:00:00:00:ff:01"
> >         networks: ["10.0.0.1/24", "1000::a/64"]
> >         gateway chassis: [ovn-chassis-1 ovn-chassis-2]
> > [root@ovn-central ~]#
> > [root@ovn-central ~]#
> > [root@ovn-central ~]# ovn-sbctl show
> > Chassis ovn-chassis-1
> >     hostname: ovn-chassis-1
> >     Encap geneve
> >         ip: "170.168.0.4"
> >         options: {csum="true"}
> >     Port_Binding sw0-port1
> >     Port_Binding sw0-port3
> > Chassis ovn-gw-1
> >     hostname: ovn-gw-1
> >     Encap geneve
> >         ip: "170.168.0.3"
> >         options: {csum="true"}
> > Chassis ovn-chassis-2
> >     hostname: ovn-chassis-2
> >     Encap geneve
> >         ip: "170.168.0.5"
> >         options: {csum="true"}
> >     Port_Binding sw1-port1
> >     Port_Binding sw0-port4
> >     Port_Binding cr-lr0-sw0
> > ---------
> >
> >
> > As you can see the logical router port lr0-sw0 is a distributed gw
> > port scheduled on chassis - ovn-chassis-2.
> >
> > The issue I see is : I'm not able to ping from sw0-port1 (10.0.0.3,
> > claimed by chassis ovn-chassis-1) to 10.0.0.1 and I'm not able to ping
> > to sw1-port1 (20.0.0.3).
> > I'm able to ping the same from sw0-port4 to 10.0.0.1 and 20.0.0.3
> > (claimed by chassis ovn-chassis-2).
> >
> > If I move cr-lr0-sw0 to ovn-chassis-1, then ping from sw0-port1 works
> > but not from sw0-port4.
> >
> > Is this expected ?
> >
>
> Yes, this is expected. I think what you are experiencing is that when sw0-port1 is pinging 10.0.0.1 or 20.0.0.3 it needs to send a ARP to resolve 10.0.0.1 first and by design of the chassis-redirect port it doesn't send out ARP response if is_chassis_resident() check fails for the DGP. I think this is not a problem. In addition, as you see your test only has a single DGP, so it is not related to this patch which just adds multiple DGP support and doesn't change this behavior.
>
> > Does it mean that if a logical router has multiple gateway router
> > ports connecting to geneve logical switches, then all the logical
> > ports of those switches should be only bound on the corresponding
> > gateway chassis ? I understand this is the case with ovn-k8s.
> >
> > If this is a restriction, I think we should document it.
>
> Yes, maybe it is helpful to make it more clear in the documentation for DGP (regardless of one DGP or multiple DGPs). I can add it in a separate patch since it is not related to this patch.
>
> >
> > Otherwise the patch LGTM and I'm fine with the feature. The only
> > issue I see is that of semantics. If router port is a gateway router
> > port, then its a peer's logical switch is ideally expected to have a
> > localnet port.
> > But in the case of ovn-k8s, that's not the case. It is for this
> > reason I thought pinning the logical switch to a particular chassis
> > could be better.
> >
> > If you think having multiple gw router ports is better, then I'd
> > suggest documenting the limitations I mentioned above. Also please see
> > a small nit below. And you can consider my Ack with these addressed.
> >
> > Acked-by: Numan Siddique <numans@ovn.org>
> >
> > Thanks
> > Numan
> >
>
> Thanks Numan!
> I applied the patch as is. Please see my explanation below.

+ Abhiram
Thanks Ankur, Dhathri and Abhiram for co-authoring!

> >
> > >
> > > +static bool
> > > +is_l3dgw_port(const struct ovn_port *op)
> > > +{
> > > +    return op->cr_port;
> >
> > Since the return type is bool, I'd suggest - return !!op->cr_port;
> >
>
> For my understanding "!!" is not needed here since we don't require it to be 0/1. In addition, there was a discussion regarding "!!" recently that even if we need it to be 0/1, there is no need to add "!!" any more:
> https://mail.openvswitch.org/pipermail/ovs-dev/2021-May/382820.html
>
> Thanks,
> Han
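The ARP behaviour discussed in this thread comes from appending an `is_chassis_resident(...)` clause to the logical flow match for a distributed gateway port, so only the chassis hosting the chassis-redirect port answers. The sketch below illustrates that idea with plain `snprintf`. The helper `format_arp_match` and the simplified match text are assumptions made for illustration; the flows that ovn-northd actually generates carry more conditions and are built with its own dynamic-string helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ovn-northd.  Writes a simplified ARP
 * match for the router port 'lrp_key' owning 'ip' into 'buf'.  When
 * 'cr_key' is non-NULL the port is treated as a distributed gateway port
 * and the match is limited to the chassis where its chassis-redirect port
 * resides.  Keys are expected to be already JSON-quoted.  Returns the
 * string length, or -1 if 'buf' is too small. */
static int
format_arp_match(char *buf, size_t size, const char *lrp_key,
                 const char *ip, const char *cr_key)
{
    int n = snprintf(buf, size, "inport == %s && arp.tpa == %s",
                     lrp_key, ip);
    if (n < 0 || (size_t) n >= size) {
        return -1;
    }
    if (cr_key) {
        size_t left = size - (size_t) n;
        int m = snprintf(buf + n, left, " && is_chassis_resident(%s)",
                         cr_key);
        if (m < 0 || (size_t) m >= left) {
            return -1;
        }
        n += m;
    }
    return n;
}
```

With the names from Numan's test setup, passing `"\"lr0-sw0\""`, `"10.0.0.1"` and `"\"cr-lr0-sw0\""` gives a match that is true only on the chassis where `cr-lr0-sw0` is bound, which is consistent with the ping from `sw0-port1` on the other chassis getting no ARP reply.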
diff --git a/NEWS b/NEWS
index f328666da..9f701caa7 100644
--- a/NEWS
+++ b/NEWS
@@ -35,6 +35,9 @@ OVN v21.06.0 - 18 Jun 2021
     "ovn-trim-limit-lflow-cache" and "ovn-trim-wmark-perc-lflow-cache", to allow
     enforcing a lflow cache size limit and high watermark percentage for which
     automatic memory trimming is performed.
+  - Support multiple distributed gateway ports on a single logical router.
+    (NAT and load-balancer are not supported yet when there are multiple
+    distributed gateway ports).
 
 OVN v21.03.0 - 12 Mar 2021
 -------------------------
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 4a24f3f61..d37350ab8 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -138,14 +138,14 @@ Warning[message] :-
     var message = "Bad configuration: distributed gateway port configured on "
                   "port ${lrp.name} on L3 gateway router".
 
-/* DistributedGatewayPortCandidate.
+/* Distributed gateway ports.
  *
- * Each row pairs a logical router with its distributed gateway port,
- * but without checking that there is at most one DGP per LR.
+ * Each row means 'lrp' is a distributed gateway port on 'lr_uuid'.
  *
- * (Use DistributedGatewayPort instead, since it guarantees uniqueness.) */
-relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid: uuid)
-DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
+ * A logical router can have multiple distributed gateway ports. */
+relation DistributedGatewayPort(lrp: Intern<nb::Logical_Router_Port>,
+                                lr_uuid: uuid)
+DistributedGatewayPort(lrp, lr_uuid) :-
     lr in nb::Logical_Router(._uuid = lr_uuid),
     LogicalRouterPort(lrp_uuid, lr._uuid),
     lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
@@ -153,30 +153,10 @@ DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
     var has_hcg = lrp.ha_chassis_group.is_some(),
     var has_gc = not lrp.gateway_chassis.is_empty(),
     has_hcg or has_gc.
-Warning[message] :-
-    DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
-    var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
-    lrps.size() > 1,
-    lr in nb::Logical_Router(._uuid = lr_uuid),
-    var message = "Bad configuration: multiple distributed gateway ports on "
-                  "logical router ${lr.name}; ignoring all of them".
-
-/* Distributed gateway ports.
- *
- * Each row means 'lrp' is the distributed gateway port on 'lr_uuid'.
- *
- * There is at most one distributed gateway port per logical router. */
-relation DistributedGatewayPort(lrp: Intern<nb::Logical_Router_Port>, lr_uuid: uuid)
-DistributedGatewayPort(lrp, lr_uuid) :-
-    DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
-    var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
-    lrps.size() == 1,
-    Some{var lrp_uuid} = lrps.nth(0),
-    lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid).
 
 /* HAChassis is an abstraction over nb::Gateway_Chassis and nb::HA_Chassis, which
  * are different ways to represent the same configuration. Each row is
- * effectively one HA_Chassis record. (Usually, we could associated each
+ * effectively one HA_Chassis record. (Usually, we could associate each
  * row with a particular 'lr_uuid', but it's permissible for more than one
  * logical router to use a HA chassis group, so we omit it so that multiple
  * references get merged.)
@@ -236,18 +216,20 @@ HAChassisGroup(ha_chassis_group_uuid(hac_group_uuid),
                .name = name,
                .external_ids = external_ids).
 
-/* Each row maps from a logical router to the name of its HAChassisGroup.
- * This level of indirection is needed because multiple logical routers
- * are allowed to reference a given HAChassisGroup. */
-relation LogicalRouterHAChassisGroup(lr_uuid: uuid,
-                                     hacg_uuid: uuid)
-LogicalRouterHAChassisGroup(lr_uuid, ha_chassis_group_uuid(lrp._uuid)) :-
-    DistributedGatewayPort(lrp, lr_uuid),
+/* Each row maps from a distributed gateway logical router port to the name of
+ * its HAChassisGroup.
+ * This level of indirection is needed because multiple distributed gateway
+ * logical router ports are allowed to reference a given HAChassisGroup. */
+relation DistributedGatewayPortHAChassisGroup(
+    lrp: Intern<nb::Logical_Router_Port>,
+    hacg_uuid: uuid)
+DistributedGatewayPortHAChassisGroup(lrp, ha_chassis_group_uuid(lrp._uuid)) :-
+    DistributedGatewayPort(.lrp = lrp),
     lrp.ha_chassis_group == None,
     lrp.gateway_chassis.size() > 0.
-LogicalRouterHAChassisGroup(lr_uuid,
-                            ha_chassis_group_uuid(hac_group_uuid)) :-
-    DistributedGatewayPort(lrp, lr_uuid),
+DistributedGatewayPortHAChassisGroup(lrp,
+                                     ha_chassis_group_uuid(hac_group_uuid)) :-
+    DistributedGatewayPort(.lrp = lrp),
     Some{var hac_group_uuid} = lrp.ha_chassis_group,
     nb::HA_Chassis_Group(._uuid = hac_group_uuid).
 
@@ -259,14 +241,19 @@ RouterPortIsRedirect(lrp, false) :-
     &nb::Logical_Router_Port(._uuid = lrp),
     not DistributedGatewayPort(&nb::Logical_Router_Port{._uuid = lrp}, _).
 
-relation LogicalRouterRedirectPort(lr: uuid, has_redirect_port: Option<Intern<nb::Logical_Router_Port>>)
-
-LogicalRouterRedirectPort(lr, Some{lrp}) :-
-    DistributedGatewayPort(lrp, lr).
-
-LogicalRouterRedirectPort(lr, None) :-
-    nb::Logical_Router(._uuid = lr),
-    not DistributedGatewayPort(_, lr).
+/*
+ * LogicalRouterDGWPorts maps from each logical router UUID
+ * to the logical router's set of distributed gateway (or redirect) ports. */
+relation LogicalRouterDGWPorts(
+    lr_uuid: uuid,
+    l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>)
+LogicalRouterDGWPorts(lr_uuid, l3dgw_ports) :-
+    DistributedGatewayPort(lrp, lr_uuid),
+    var l3dgw_ports = lrp.group_by(lr_uuid).to_vec().
+LogicalRouterDGWPorts(lr_uuid, vec_empty()) :-
+    lr in nb::Logical_Router(),
+    var lr_uuid = lr._uuid,
+    not DistributedGatewayPort(_, lr_uuid).
 
 typedef ExceptionalExtIps = AllowedExtIps{ips: Intern<nb::Address_Set>}
                           | ExemptedExtIps{ips: Intern<nb::Address_Set>}
@@ -450,9 +437,7 @@ LogicalRouterCopp0(lr, meters) :-
 
 /* Router relation collects all attributes of a logical router.
  *
- * `l3dgw_port` - optional redirect port (see `DistributedGatewayPort`)
- * `redirect_port_name` - derived redirect port name (or empty string if
- *      router does not have a redirect port)
+ * `l3dgw_ports` - optional redirect ports (see `DistributedGatewayPort`)
  * `is_gateway` - true iff the router is a gateway router. Together with
  *      `l3dgw_port`, this flag affects the generation of various flows
  *      related to NAT and load balancing.
@@ -474,8 +459,7 @@ typedef Router = Router {
     external_ids: Map<string,string>,
 
     /* Additional computed fields. */
-    l3dgw_port: Option<Intern<nb::Logical_Router_Port>>,
-    redirect_port_name: string,
+    l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>,
     is_gateway: bool,
     nats: Vec<NAT>,
     snat_ips: Map<v46_ip, Set<NAT>>,
@@ -498,23 +482,18 @@ Router[Router{
     .options = lr.options,
     .external_ids = lr.external_ids,
 
-    .l3dgw_port = l3dgw_port,
-    .redirect_port_name =
-        match (l3dgw_port) {
-            Some{rport} -> json_string_escape(chassis_redirect_name(rport.name)),
-            _ -> ""
-        },
-    .is_gateway = lr.options.contains_key("chassis"),
-    .nats = nats,
-    .snat_ips = snat_ips,
-    .lbs = lbs,
-    .mcast_cfg = mcast_cfg,
+    .l3dgw_ports = l3dgw_ports,
+    .is_gateway = lr.options.contains_key("chassis"),
+    .nats = nats,
+    .snat_ips = snat_ips,
+    .lbs = lbs,
+    .mcast_cfg = mcast_cfg,
     .learn_from_arp_request = learn_from_arp_request,
     .force_lb_snat = force_lb_snat,
     .copp = copp}.intern()] :-
     lr in nb::Logical_Router(),
     lr.is_enabled(),
-    LogicalRouterRedirectPort(lr._uuid, l3dgw_port),
+    LogicalRouterDGWPorts(lr._uuid, l3dgw_ports),
     LogicalRouterNATs(lr._uuid, nats),
     LogicalRouterLBs(lr._uuid, lbs),
     LogicalRouterSnatIPs(lr._uuid, snat_ips),
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index de4fe90c7..5afae743f 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -3815,10 +3815,10 @@ icmp6 {
     <h3>Ingress Table 17: Gateway Redirect</h3>
 
     <p>
-      For distributed logical routers where one of the logical router
+      For distributed logical routers where one or more of the logical router
       ports specifies a gateway chassis, this table redirects
-      certain packets to the distributed gateway port instance on the
-      gateway chassis. This table has the following flows:
+      certain packets to the distributed gateway port instances on the
+      gateway chassises. This table has the following flows:
     </p>
 
     <ul>
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 605e33486..b7398004d 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -655,13 +655,12 @@ struct ovn_datapath {
     bool is_gw_router;
 
     /* OVN northd only needs to know about the logical router gateway port for
-     * NAT on a distributed router. This "distributed gateway port" is
-     * populated only when there is a gateway chassis specified for one of
-     * the ports on the logical router. Otherwise this will be NULL. */
-    struct ovn_port *l3dgw_port;
-    /* The "derived" OVN port representing the instance of l3dgw_port on
-     * the gateway chassis. */
-    struct ovn_port *l3redirect_port;
+     * NAT on a distributed router. The "distributed gateway ports" are
+     * populated only when there is a gateway chassis or ha chassis group
+     * specified for some of the ports on the logical router. Otherwise this
+     * will be NULL. */
+    struct ovn_port **l3dgw_ports;
+    size_t n_l3dgw_ports;
 
     /* NAT entries configured on the router. */
     struct ovn_nat *nat_entries;
@@ -802,6 +801,16 @@ init_nat_entries(struct ovn_datapath *od)
         return;
     }
 
+    if (od->n_l3dgw_ports > 1) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        VLOG_WARN_RL(&rl, "NAT is configured on logical router %s, which has %"
+                     PRIuSIZE" distributed gateway ports. NAT is not supported"
+                     " yet when there is more than one distributed gateway "
+                     "port on the router.",
+                     od->nbr->name, od->n_l3dgw_ports);
+        return;
+    }
+
     od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
 
     for (size_t i = 0; i < od->nbr->n_nat; i++) {
@@ -941,6 +950,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
         destroy_lb_ips(od);
         free(od->nat_entries);
         free(od->localnet_ports);
+        free(od->l3dgw_ports);
         ovn_ls_port_group_destroy(&od->nb_pgs);
         destroy_mcast_info_for_datapath(od);
 
@@ -1489,9 +1499,18 @@ struct ovn_port {
     /* Logical port multicast data. */
     struct mcast_port_info mcast_info;
 
-    /* This is ordinarily false. It is true if and only if this ovn_port is
-     * derived from a chassis-redirect port. */
-    bool derived;
+    /* At most one of l3dgw_port and cr_port can be not NULL. */
+
+    /* This is set to a distributed gateway port if and only if this ovn_port
+     * is "derived" from it. Otherwise this is set to NULL. The derived
+     * ovn_port represents the instance of distributed gateway port on the
+     * gateway chassis.*/
+    struct ovn_port *l3dgw_port;
+
+    /* This is set to the "derived" chassis-redirect port of this port if and
+     * only if this port is a distributed gateway port. Otherwise this is set
+     * to NULL. */
+    struct ovn_port *cr_port;
 
     bool has_unknown; /* If the addresses have 'unknown' defined. */
 
@@ -1512,6 +1531,18 @@ struct ovn_port {
     struct ovs_list list;       /* In list of similar records. */
 };
 
+static bool
+is_l3dgw_port(const struct ovn_port *op)
+{
+    return op->cr_port;
+}
+
+static bool
+is_cr_port(const struct ovn_port *op)
+{
+    return op->l3dgw_port;
+}
+
 static void
 destroy_routable_addresses(struct ovn_port_routable_addresses *ra)
 {
@@ -1578,7 +1609,7 @@ ovn_port_create(struct hmap *ports, const char *key,
     op->key = xstrdup(key);
     op->sb = sb;
     ovn_port_set_nb(op, nbsp, nbrp);
-    op->derived = false;
+    op->l3dgw_port = op->cr_port = NULL;
     hmap_insert(ports, &op->key_node, hash_string(op->key, 0));
     return op;
 }
@@ -1682,7 +1713,7 @@ lrport_is_enabled(const struct nbrec_logical_router_port *lrport)
 static struct ovn_port *
 ovn_port_get_peer(struct hmap *ports, struct ovn_port *op)
 {
-    if (!op->nbsp || !lsp_is_router(op->nbsp) || op->derived) {
+    if (!op->nbsp || !lsp_is_router(op->nbsp) || op->l3dgw_port) {
         return NULL;
     }
 
@@ -2426,6 +2457,7 @@ join_logical_ports(struct northd_context *ctx,
                 tag_alloc_add_existing_tags(tag_alloc_table, nbsp);
             }
         } else {
+            size_t n_allocated_l3dgw_ports = 0;
             for (size_t i = 0; i < od->nbr->n_ports; i++) {
                 const struct nbrec_logical_router_port *nbrp
                     = od->nbr->ports[i];
@@ -2481,36 +2513,32 @@ join_logical_ports(struct northd_context *ctx,
                                      "on L3 gateway router", nbrp->name);
                         continue;
                     }
-                    if (od->l3dgw_port || od->l3redirect_port) {
-                        static struct vlog_rate_limit rl
-                            = VLOG_RATE_LIMIT_INIT(1, 1);
-                        VLOG_WARN_RL(&rl, "Bad configuration: multiple "
-                                     "distributed gateway ports on logical "
-                                     "router %s", od->nbr->name);
-                        continue;
-                    }
 
                     char *redirect_name =
                         ovn_chassis_redirect_name(nbrp->name);
                     struct ovn_port *crp = ovn_port_find(ports, redirect_name);
                     if (crp && crp->sb && crp->sb->datapath == od->sb) {
-                        crp->derived = true;
                         ovn_port_set_nb(crp, NULL, nbrp);
                         ovs_list_remove(&crp->list);
                         ovs_list_push_back(both, &crp->list);
                     } else {
                         crp = ovn_port_create(ports, redirect_name,
                                               NULL, nbrp, NULL);
-                        crp->derived = true;
                         ovs_list_push_back(nb_only, &crp->list);
                     }
+                    crp->l3dgw_port = op;
+                    op->cr_port = crp;
                     crp->od = od;
                     free(redirect_name);
 
-                    /* Set l3dgw_port and l3redirect_port in od, for later
-                     * use during flow creation. */
-                    od->l3dgw_port = op;
-                    od->l3redirect_port = crp;
+                    /* Add to l3dgw_ports in od, for later use during flow
+                     * creation. */
+                    if (od->n_l3dgw_ports == n_allocated_l3dgw_ports) {
+                        od->l3dgw_ports = x2nrealloc(od->l3dgw_ports,
+                                                     &n_allocated_l3dgw_ports,
+                                                     sizeof *od->l3dgw_ports);
+                    }
+                    od->l3dgw_ports[od->n_l3dgw_ports++] = op;
 
                     assign_routable_addresses(op);
                 }
@@ -2522,7 +2550,7 @@ join_logical_ports(struct northd_context *ctx,
      * to their peers. */
     struct ovn_port *op;
     HMAP_FOR_EACH (op, key_node, ports) {
-        if (op->nbsp && lsp_is_router(op->nbsp) && !op->derived) {
+        if (op->nbsp && lsp_is_router(op->nbsp) && !op->l3dgw_port) {
             struct ovn_port *peer = ovn_port_get_peer(ports, op);
             if (!peer || !peer->nbrp) {
                 continue;
@@ -2553,7 +2581,7 @@ join_logical_ports(struct northd_context *ctx,
             if (peer->od && peer->od->mcast_info.rtr.relay) {
                 op->od->mcast_info.sw.flood_relay = true;
             }
-        } else if (op->nbrp && op->nbrp->peer && !op->derived) {
+        } else if (op->nbrp && op->nbrp->peer && !op->l3dgw_port) {
             struct ovn_port *peer = ovn_port_find(ports, op->nbrp->peer);
             if (peer) {
                 if (peer->nbrp) {
@@ -2598,7 +2626,8 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only)
     struct eth_addr mac;
     if (!op || !op->nbrp || !op->od || !op->od->nbr
         || (!op->od->nbr->n_nat && !op->od->nbr->n_load_balancer)
-        || !eth_addr_from_string(op->nbrp->mac, &mac)) {
+        || !eth_addr_from_string(op->nbrp->mac, &mac)
+        || op->od->n_l3dgw_ports > 1) {
         *n = n_nats;
         return NULL;
     }
@@ -2629,7 +2658,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only)
 
         /* Determine whether this NAT rule satisfies the conditions for
          * distributed NAT processing. */
-        if (op->od->l3redirect_port && !strcmp(nat->type, "dnat_and_snat")
+        if (op->od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat")
             && nat->logical_port && nat->external_mac) {
             /* Distributed NAT rule. */
             if (eth_addr_from_string(nat->external_mac, &mac)) {
@@ -2695,9 +2724,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only)
     if (central_ip_address) {
         /* Gratuitous ARP for centralized NAT rules on distributed gateway
          * ports should be restricted to the gateway chassis. */
-        if (op->od->l3redirect_port) {
+        if (op->od->n_l3dgw_ports) {
             ds_put_format(&c_addresses, " is_chassis_resident(%s)",
-                          op->od->l3redirect_port->json_key);
+                          op->od->l3dgw_ports[0]->cr_port->json_key);
         }
 
         addresses[n_nats++] = ds_steal_cstr(&c_addresses);
@@ -3010,7 +3039,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
         /* If the router is for l3 gateway, it resides on a chassis
          * and its port type is "l3gateway". */
         const char *chassis_name = smap_get(&op->od->nbr->options, "chassis");
-        if (op->derived) {
+        if (is_cr_port(op)) {
             sbrec_port_binding_set_type(op->sb, "chassisredirect");
         } else if (chassis_name) {
             sbrec_port_binding_set_type(op->sb, "l3gateway");
@@ -3020,7 +3049,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
 
         struct smap new;
         smap_init(&new);
-        if (op->derived) {
+        if (is_cr_port(op)) {
             const char *redirect_type = smap_get(&op->nbrp->options,
                                                  "redirect-type");
 
@@ -3200,7 +3229,7 @@ ovn_port_update_sbrec(struct northd_context *ctx,
             char **nats = NULL;
             if (nat_addresses && !strcmp(nat_addresses, "router")) {
                 if (op->peer && op->peer->od
-                    && (chassis || op->peer->od->l3redirect_port)) {
+                    && (chassis || op->peer->od->n_l3dgw_ports)) {
                     nats = get_nat_addresses(op->peer, &n_nats, false);
                 }
             /* Only accept manual specification of ethernet address
@@ -3236,12 +3265,26 @@ ovn_port_update_sbrec(struct northd_context *ctx,
              * sending the GARPs for the router port IPs.
              * */
             bool add_router_port_garp = false;
-            if (op->peer && op->peer->nbrp && op->peer->od->l3dgw_port &&
-                op->peer->od->l3redirect_port &&
-                (smap_get_bool(&op->peer->nbrp->options,
-                               "reside-on-redirect-chassis", false) ||
-                op->peer == op->peer->od->l3dgw_port)) {
-                add_router_port_garp = true;
+            if (op->peer && op->peer->nbrp && op->peer->od->n_l3dgw_ports) {
+                if (is_l3dgw_port(op->peer)) {
+                    add_router_port_garp = true;
+                } else if (smap_get_bool(&op->peer->nbrp->options,
+                               "reside-on-redirect-chassis", false)) {
+                    if (op->peer->od->n_l3dgw_ports == 1) {
+                        add_router_port_garp = true;
+                    } else {
+                        static struct vlog_rate_limit rl =
+                            VLOG_RATE_LIMIT_INIT(1, 1);
+                        VLOG_WARN_RL(&rl, "\"reside-on-redirect-chassis\" is "
+                                     "set on logical router port %s, which "
+                                     "is on logical router %s, which has %"
+                                     PRIuSIZE" distributed gateway ports. This"
+                                     "option can only be used when there is "
+                                     "a single distributed gateway port.",
+                                     op->peer->key, op->peer->od->nbr->name,
+                                     op->peer->od->n_l3dgw_ports);
+                    }
+                }
             } else if (chassis && op->od->n_localnet_ports) {
                 add_router_port_garp = true;
             }
@@ -3256,9 +3299,10 @@ ovn_port_update_sbrec(struct northd_context *ctx,
                         op->peer->lrp_networks.ipv4_addrs[i].addr_s);
                 }
 
-                if (op->peer->od->l3redirect_port) {
+                if (op->peer->od->n_l3dgw_ports) {
                     ds_put_format(&garp_info, " is_chassis_resident(%s)",
-                                  op->peer->od->l3redirect_port->json_key);
+                                  op->peer->od->l3dgw_ports[0]
+                                  ->cr_port->json_key);
                 }
 
                 n_nats++;
@@ -3531,7 +3575,17 @@ build_ovn_lr_lbs(struct hmap *datapaths, struct hmap *lbs)
         if (!od->nbr) {
             continue;
         }
-        if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) {
+        if (!smap_get(&od->nbr->options, "chassis")
+            && od->n_l3dgw_ports != 1) {
+            if (od->n_l3dgw_ports > 1 && od->nbr->n_load_balancer) {
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+                VLOG_WARN_RL(&rl, "Load-balancers are configured on logical "
+                             "router %s, which has %"PRIuSIZE" distributed "
+                             "gateway ports. Load-balancer is not supported "
+                             "yet when there is more than one distributed "
+                             "gateway port on the router.",
+                             od->nbr->name, od->n_l3dgw_ports);
+            }
             continue;
         }
 
@@ -6433,13 +6487,14 @@ build_lrouter_groups__(struct hmap *ports, struct ovn_datapath *od)
 {
     ovs_assert((od && od->nbr && od->lr_group));
 
-    if (od->l3dgw_port && od->l3redirect_port) {
-        /* It's a logical router with gateway port. If it
-         * has HA_Chassis_Group associated to it in SB DB, then store the
-         * ha chassis group name. */
-        if (od->l3redirect_port->sb->ha_chassis_group) {
+    /* For logical router with distributed gateway ports. If it
+     * has HA_Chassis_Group associated to it in SB DB, then store the
+     * ha chassis group name. */
+    for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
+        struct ovn_port *crp = od->l3dgw_ports[i]->cr_port;
+        if (crp->sb->ha_chassis_group) {
             sset_add(&od->lr_group->ha_chassis_groups,
-                     od->l3redirect_port->sb->ha_chassis_group->name);
+                     crp->sb->ha_chassis_group->name);
         }
     }
 
@@ -7800,16 +7855,17 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op,
             ds_clear(match);
             ds_put_format(match, "eth.dst == "ETH_ADDR_FMT,
                           ETH_ADDR_ARGS(mac));
-            if (op->peer->od->l3dgw_port
-                && op->peer->od->l3redirect_port
+            if (op->peer->od->n_l3dgw_ports
                 && op->od->n_localnet_ports) {
                 bool add_chassis_resident_check = false;
-                if (op->peer == op->peer->od->l3dgw_port) {
+                const char *json_key;
+                if (is_l3dgw_port(op->peer)) {
                     /* The peer of this port represents a distributed
                      * gateway port. The destination lookup flow for the
                      * router's distributed gateway port MAC address should
                      * only be programmed on the gateway chassis. */
                     add_chassis_resident_check = true;
+                    json_key = op->peer->cr_port->json_key;
                 } else {
                     /* Check if the option 'reside-on-redirect-chassis'
                      * is set to true on the peer port. If set to true
@@ -7820,12 +7876,15 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op,
                      */
                     add_chassis_resident_check = smap_get_bool(
                         &op->peer->nbrp->options,
-                        "reside-on-redirect-chassis", false);
+                        "reside-on-redirect-chassis", false) &&
+                        op->peer->od->n_l3dgw_ports == 1;
+                    json_key =
+                        op->peer->od->l3dgw_ports[0]->cr_port->json_key;
                 }
 
                 if (add_chassis_resident_check) {
                     ds_put_format(match, " && is_chassis_resident(%s)",
-                                  op->peer->od->l3redirect_port->json_key);
+                                  json_key);
                 }
             }
 
@@ -7838,8 +7897,7 @@ build_lswitch_ip_unicast_lookup(struct ovn_port *op,
 
             /* Add ethernet addresses specified in NAT rules on
              * distributed logical routers. */
-            if (op->peer->od->l3dgw_port
-                && op->peer == op->peer->od->l3dgw_port) {
+            if (is_l3dgw_port(op->peer)) {
                 for (int j = 0; j < op->peer->od->nbr->n_nat; j++) {
                     const struct nbrec_nat *nat
                         = op->peer->od->nbr->nat[j];
@@ -9106,14 +9164,14 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip,
                                     &lb->nlb->header_);
         }
 
-        if (od->l3redirect_port &&
+        if (od->n_l3dgw_ports &&
             (lb_vip->n_backends || !lb_vip->empty_backend_rej)) {
             new_match_p = xasprintf("%s && is_chassis_resident(%s)",
                                     new_match,
-                                    od->l3redirect_port->json_key);
+                                    od->l3dgw_ports[0]->cr_port->json_key);
             est_match_p = xasprintf("%s && is_chassis_resident(%s)",
                                     est_match,
-                                    od->l3redirect_port->json_key);
+                                    od->l3dgw_ports[0]->cr_port->json_key);
         }
 
         if (snat_type == NO_FORCE_SNAT &&
@@ -9158,15 +9216,15 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip,
             free(est_match_p);
         }
 
-        if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) {
+        if (!od->n_l3dgw_ports || !lb_vip->n_backends) {
             goto next;
         }
 
-        char *undnat_match_p = xasprintf("%s) && outport == %s && "
-                                         "is_chassis_resident(%s)",
-                                         ds_cstr(&undnat_match),
-                                         od->l3dgw_port->json_key,
-                                         od->l3redirect_port->json_key);
+        char *undnat_match_p = xasprintf(
+            "%s) && outport == %s && is_chassis_resident(%s)",
+            ds_cstr(&undnat_match),
+            od->l3dgw_ports[0]->json_key,
+            od->l3dgw_ports[0]->cr_port->json_key);
         if (snat_type == SKIP_SNAT) {
             ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_UNDNAT, 120,
                                     undnat_match_p, skip_snat_est_action,
@@ -9662,9 +9720,9 @@ build_lrouter_port_nat_arp_nd_flow(struct ovn_port *op,
          * upstream MAC learning points to the gateway chassis.
          * Also need to avoid generation of multiple ARP responses
          * from different chassis. */
-        if (op->od->l3redirect_port) {
+        if (op->od->n_l3dgw_ports) {
             ds_put_format(&match, "is_chassis_resident(%s)",
-                          op->od->l3redirect_port->json_key);
+                          op->od->l3dgw_ports[0]->cr_port->json_key);
         }
     }
 
@@ -9938,7 +9996,7 @@ build_adm_ctrl_flows_for_lrouter_port(
             return;
         }
 
-        if (op->derived) {
+        if (is_cr_port(op)) {
             /* No ingress packets should be received on a chassisredirect
              * port. */
             return;
@@ -9963,12 +10021,11 @@ build_adm_ctrl_flows_for_lrouter_port(
         ds_clear(match);
         ds_put_format(match, "eth.dst == %s && inport == %s",
                       op->lrp_networks.ea_s, op->json_key);
-        if (op->od->l3dgw_port && op == op->od->l3dgw_port
-            && op->od->l3redirect_port) {
+        if (is_l3dgw_port(op)) {
             /* Traffic with eth.dst = l3dgw_port->lrp_networks.ea_s
              * should only be received on the gateway chassis. */
             ds_put_format(match, " && is_chassis_resident(%s)",
-                          op->od->l3redirect_port->json_key);
+                          op->cr_port->json_key);
         }
         ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ADMISSION, 50,
                                 ds_cstr(match), ds_cstr(actions),
@@ -10107,10 +10164,9 @@ build_neigh_learning_flows_for_lrouter_port(
                           op->lrp_networks.ipv4_addrs[i].network_s,
                           op->lrp_networks.ipv4_addrs[i].plen,
                           op->lrp_networks.ipv4_addrs[i].addr_s);
-            if (op->od->l3dgw_port && op == op->od->l3dgw_port
-                && op->od->l3redirect_port) {
+            if (is_l3dgw_port(op)) {
                 ds_put_format(match, " && is_chassis_resident(%s)",
-                              op->od->l3redirect_port->json_key);
+                              op->cr_port->json_key);
             }
             const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT
                               " = lookup_arp(inport, arp.spa, arp.sha); "
@@ -10127,10 +10183,9 @@ build_neigh_learning_flows_for_lrouter_port(
                           op->json_key,
                           op->lrp_networks.ipv4_addrs[i].network_s,
                           op->lrp_networks.ipv4_addrs[i].plen);
-            if (op->od->l3dgw_port && op == op->od->l3dgw_port
-                && op->od->l3redirect_port) {
+            if (is_l3dgw_port(op)) {
                 ds_put_format(match, " && is_chassis_resident(%s)",
-                              op->od->l3redirect_port->json_key);
+                              op->cr_port->json_key);
             }
             ds_clear(actions);
             ds_put_format(actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
@@ -10620,7 +10675,7 @@ build_arp_resolve_flows_for_lrouter_port(
             }
         }
 
-        if (!op->derived && op->od->l3redirect_port) {
+        if (is_l3dgw_port(op)) {
             const char *redirect_type = smap_get(&op->nbrp->options,
                                                  "redirect-type");
             if (redirect_type && !strcasecmp(redirect_type, "bridged")) {
@@ -10633,7 +10688,7 @@ build_arp_resolve_flows_for_lrouter_port(
                 ds_clear(match);
                 ds_put_format(match, "outport == %s && "
                               "!is_chassis_resident(%s)", op->json_key,
-                              op->od->l3redirect_port->json_key);
+                              op->cr_port->json_key);
                 ds_clear(actions);
                 ds_put_format(actions, "eth.dst = %s; next;",
                               op->lrp_networks.ea_s);
@@ -10881,8 +10936,8 @@ build_arp_resolve_flows_for_lrouter_port(
                                         &op->nbsp->header_);
             }
 
-            if (smap_get(&peer->od->nbr->options, "chassis") ||
-                (peer->od->l3dgw_port && peer == peer->od->l3dgw_port)) {
+            if (smap_get(&peer->od->nbr->options, "chassis")
+                || peer->cr_port) {
                 routable_addresses_to_lflows(lflows, router_port, peer,
                                              match, actions);
             }
@@ -11079,32 +11134,32 @@ build_gateway_redirect_flows_for_lrouter(
         struct ovn_datapath *od, struct hmap *lflows,
         struct ds *match, struct ds *actions)
 {
-    if (od->nbr) {
-        if (od->l3dgw_port && od->l3redirect_port) {
-            const struct ovsdb_idl_row *stage_hint = NULL;
-
-            if (od->l3dgw_port->nbrp) {
-                stage_hint = &od->l3dgw_port->nbrp->header_;
-            }
+    if (!od->nbr) {
+        return;
+    }
+    for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
+        const struct ovsdb_idl_row *stage_hint = NULL;
 
-            /* For traffic with outport == l3dgw_port, if the
-             * packet did not match any higher priority redirect
-             * rule, then the traffic is redirected to the central
-             * instance of the l3dgw_port. */
-            ds_clear(match);
-            ds_put_format(match, "outport == %s",
-                          od->l3dgw_port->json_key);
-            ds_clear(actions);
-            ds_put_format(actions, "outport = %s; next;",
-                          od->l3redirect_port->json_key);
-            ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
-                                    ds_cstr(match), ds_cstr(actions),
-                                    stage_hint);
+        if (od->l3dgw_ports[i]->nbrp) {
+            stage_hint = &od->l3dgw_ports[i]->nbrp->header_;
         }
 
-        /* Packets are allowed by default. */
-        ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;");
+        /* For traffic with outport == l3dgw_port, if the
+         * packet did not match any higher priority redirect
+         * rule, then the traffic is redirected to the central
+         * instance of the l3dgw_port. */
+        ds_clear(match);
+        ds_put_format(match, "outport == %s",
+                      od->l3dgw_ports[i]->json_key);
+        ds_clear(actions);
+        ds_put_format(actions, "outport = %s; next;",
+                      od->l3dgw_ports[i]->cr_port->json_key);
+        ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
+                                ds_cstr(match), ds_cstr(actions),
+                                stage_hint);
     }
+    /* Packets are allowed by default. */
+    ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;");
 }
 
 /* Local router ingress table ARP_REQUEST: ARP request.
@@ -11203,7 +11258,7 @@ build_egress_delivery_flows_for_lrouter_port(
             return;
         }
 
-        if (op->derived) {
+        if (is_cr_port(op)) {
             /* No egress packets should be processed in the context of
              * a chassisredirect port. The chassisredirect port should
              * be replaced by the l3dgw port in the local output
@@ -11293,7 +11348,7 @@ build_dhcpv6_reply_flows_for_lrouter_port(
         struct ovn_port *op, struct hmap *lflows,
         struct ds *match)
 {
-    if (op->nbrp && (!op->derived)) {
+    if (op->nbrp && (!op->l3dgw_port)) {
         for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
             ds_clear(match);
             ds_put_format(match, "ip6.dst == %s && udp.src == 547 &&"
@@ -11313,7 +11368,7 @@ build_ipv6_input_flows_for_lrouter_port(
         struct ds *match, struct ds *actions,
         struct shash *meter_groups)
 {
-    if (op->nbrp && (!op->derived)) {
+    if (op->nbrp && (!op->l3dgw_port)) {
         /* No ingress packets are accepted on a chassisredirect
          * port, so no need to program flows for that port. */
         if (op->lrp_networks.n_ipv6_addrs) {
@@ -11339,15 +11394,14 @@ build_ipv6_input_flows_for_lrouter_port(
          * router's own IP address. */
         for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
             ds_clear(match);
-            if (op->od->l3dgw_port && op == op->od->l3dgw_port
-                && op->od->l3redirect_port) {
+            if (is_l3dgw_port(op)) {
                 /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
                  * should only be sent from the gateway chassi, so that
                  * upstream MAC learning points to the gateway chassis.
                  * Also need to avoid generation of multiple ND replies
                  * from different chassis. */
                 ds_put_format(match, "is_chassis_resident(%s)",
-                              op->od->l3redirect_port->json_key);
+                              op->cr_port->json_key);
             }
 
             build_lrouter_nd_flow(op->od, op, "nd_na_router",
@@ -11358,7 +11412,7 @@ build_ipv6_input_flows_for_lrouter_port(
         }
 
         /* UDP/TCP/SCTP port unreachable */
-        if (!op->od->is_gw_router && !op->od->l3dgw_port) {
+        if (!op->od->is_gw_router && !op->od->n_l3dgw_ports) {
             for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
                 ds_clear(match);
                 ds_put_format(match,
@@ -11528,7 +11582,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op,
 {
     /* No ingress packets are accepted on a chassisredirect
      * port, so no need to program flows for that port. */
-    if (op->nbrp && (!op->derived)) {
+    if (op->nbrp && (!op->l3dgw_port)) {
         if (op->lrp_networks.n_ipv4_addrs) {
             /* L3 admission control: drop packets that originate from an
              * IPv4 address owned by the router or a broadcast address
@@ -11598,16 +11652,18 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op,
                           op->lrp_networks.ipv4_addrs[i].network_s,
                           op->lrp_networks.ipv4_addrs[i].plen);
 
-            if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
+            if (op->od->n_l3dgw_ports && op->peer
                 && op->peer->od->n_localnet_ports) {
                 bool add_chassis_resident_check = false;
-                if (op == op->od->l3dgw_port) {
+                const char *json_key;
+                if (is_l3dgw_port(op)) {
                     /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
                      * should only be sent from the gateway chassis, so that
                      * upstream MAC learning points to the gateway chassis.
                      * Also need to avoid generation of multiple ARP responses
                      * from different chassis. */
                     add_chassis_resident_check = true;
+                    json_key = op->cr_port->json_key;
                 } else {
                     /* Check if the option 'reside-on-redirect-chassis'
                      * is set to true on the router port.
If set to true @@ -11619,12 +11675,14 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, */ add_chassis_resident_check = smap_get_bool( &op->nbrp->options, - "reside-on-redirect-chassis", false); + "reside-on-redirect-chassis", false) && + op->od->n_l3dgw_ports == 1; + json_key = op->od->l3dgw_ports[0]->cr_port->json_key; } if (add_chassis_resident_check) { ds_put_format(match, " && is_chassis_resident(%s)", - op->od->l3redirect_port->json_key); + json_key); } } @@ -11637,9 +11695,9 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, const char *ip_address; if (sset_count(&op->od->lb_ips_v4)) { ds_clear(match); - if (op == op->od->l3dgw_port) { + if (is_l3dgw_port(op)) { ds_put_format(match, "is_chassis_resident(%s)", - op->od->l3redirect_port->json_key); + op->cr_port->json_key); } struct ds load_balancer_ips_v4 = DS_EMPTY_INITIALIZER; @@ -11657,9 +11715,9 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, SSET_FOR_EACH (ip_address, &op->od->lb_ips_v6) { ds_clear(match); - if (op == op->od->l3dgw_port) { + if (is_l3dgw_port(op)) { ds_put_format(match, "is_chassis_resident(%s)", - op->od->l3redirect_port->json_key); + op->cr_port->json_key); } build_lrouter_nd_flow(op->od, op, "nd_na", @@ -11668,7 +11726,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, lflows, meter_groups); } - if (!op->od->is_gw_router && !op->od->l3dgw_port) { + if (!op->od->is_gw_router && !op->od->n_l3dgw_ports) { /* UDP/TCP/SCTP port unreachable. */ for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { ds_clear(match); @@ -11765,7 +11823,7 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op, * exception is on the l3dgw_port where we might need to use a * different ETH address. */ - if (op != op->od->l3dgw_port) { + if (!is_l3dgw_port(op)) { return; } @@ -11847,12 +11905,12 @@ build_lrouter_in_unsnat_flow(struct hmap *lflows, struct ovn_datapath *od, ds_clear(actions); ds_put_format(match, "ip && ip%s.dst == %s && inport == %s", is_v6 ? 
"6" : "4", nat->external_ip, - od->l3dgw_port->json_key); - if (!distributed && od->l3redirect_port) { + od->l3dgw_ports[0]->json_key); + if (!distributed && od->n_l3dgw_ports) { /* Flows for NAT rules that are centralized are only * programmed on the gateway chassis. */ ds_put_format(match, " && is_chassis_resident(%s)", - od->l3redirect_port->json_key); + od->l3dgw_ports[0]->cr_port->json_key); } if (!strcmp(nat->type, "dnat_and_snat") && stateless) { @@ -11924,12 +11982,12 @@ build_lrouter_in_dnat_flow(struct hmap *lflows, struct ovn_datapath *od, ds_clear(match); ds_put_format(match, "ip && ip%s.dst == %s && inport == %s", is_v6 ? "6" : "4", nat->external_ip, - od->l3dgw_port->json_key); - if (!distributed && od->l3redirect_port) { + od->l3dgw_ports[0]->json_key); + if (!distributed && od->n_l3dgw_ports) { /* Flows for NAT rules that are centralized are only * programmed on the gateway chassis. */ ds_put_format(match, " && is_chassis_resident(%s)", - od->l3redirect_port->json_key); + od->l3dgw_ports[0]->cr_port->json_key); } ds_clear(actions); if (nat->allowed_ext_ips || nat->exempted_ext_ips) { @@ -11968,7 +12026,7 @@ build_lrouter_out_undnat_flow(struct hmap *lflows, struct ovn_datapath *od, * * Note that this only applies for NAT on a distributed router. */ - if (!od->l3dgw_port || + if (!od->n_l3dgw_ports || (strcmp(nat->type, "dnat") && strcmp(nat->type, "dnat_and_snat"))) { return; } @@ -11976,12 +12034,12 @@ build_lrouter_out_undnat_flow(struct hmap *lflows, struct ovn_datapath *od, ds_clear(match); ds_put_format(match, "ip && ip%s.src == %s && outport == %s", is_v6 ? "6" : "4", nat->logical_ip, - od->l3dgw_port->json_key); - if (!distributed && od->l3redirect_port) { + od->l3dgw_ports[0]->json_key); + if (!distributed && od->n_l3dgw_ports) { /* Flows for NAT rules that are centralized are only * programmed on the gateway chassis. 
*/ ds_put_format(match, " && is_chassis_resident(%s)", - od->l3redirect_port->json_key); + od->l3dgw_ports[0]->cr_port->json_key); } ds_clear(actions); if (distributed) { @@ -12054,13 +12112,13 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct ovn_datapath *od, ds_clear(match); ds_put_format(match, "ip && ip%s.src == %s && outport == %s", is_v6 ? "6" : "4", nat->logical_ip, - od->l3dgw_port->json_key); - if (!distributed && od->l3redirect_port) { + od->l3dgw_ports[0]->json_key); + if (!distributed && od->n_l3dgw_ports) { /* Flows for NAT rules that are centralized are only * programmed on the gateway chassis. */ priority += 128; ds_put_format(match, " && is_chassis_resident(%s)", - od->l3redirect_port->json_key); + od->l3dgw_ports[0]->cr_port->json_key); } ds_clear(actions); @@ -12101,11 +12159,11 @@ build_lrouter_ingress_flow(struct hmap *lflows, struct ovn_datapath *od, struct ds *actions, struct eth_addr mac, bool distributed, bool is_v6) { - if (od->l3dgw_port && !strcmp(nat->type, "snat")) { + if (od->n_l3dgw_ports && !strcmp(nat->type, "snat")) { ds_clear(match); ds_put_format( match, "inport == %s && %s == %s", - od->l3dgw_port->json_key, + od->l3dgw_ports[0]->json_key, is_v6 ? 
"ip6.src" : "ip4.src", nat->external_ip); ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_IP_INPUT, 120, ds_cstr(match), "next;", @@ -12123,16 +12181,16 @@ build_lrouter_ingress_flow(struct hmap *lflows, struct ovn_datapath *od, */ ds_clear(actions); - build_check_pkt_len_action_string(od->l3dgw_port, actions); + build_check_pkt_len_action_string(od->l3dgw_ports[0], actions); ds_put_format(actions, REG_INPORT_ETH_ADDR " = %s; next;", - od->l3dgw_port->lrp_networks.ea_s); + od->l3dgw_ports[0]->lrp_networks.ea_s); ds_clear(match); ds_put_format(match, "eth.dst == "ETH_ADDR_FMT" && inport == %s" " && is_chassis_resident(\"%s\")", ETH_ADDR_ARGS(mac), - od->l3dgw_port->json_key, + od->l3dgw_ports[0]->json_key, nat->logical_port); ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_ADMISSION, 50, ds_cstr(match), ds_cstr(actions), @@ -12214,7 +12272,7 @@ lrouter_check_nat_entry(struct ovn_datapath *od, const struct nbrec_nat *nat, /* For distributed router NAT, determine whether this NAT rule * satisfies the conditions for distributed NAT processing. */ *distributed = false; - if (od->l3dgw_port && !strcmp(nat->type, "dnat_and_snat") && + if (od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat") && nat->logical_port && nat->external_mac) { if (eth_addr_from_string(nat->external_mac, mac)) { *distributed = true; @@ -12259,7 +12317,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, * not committed, it would produce ongoing datapath flows with the ct.new * flag set. Some NICs are unable to offload these flows. 
*/ - if ((od->is_gw_router || od->l3dgw_port) && + if ((od->is_gw_router || od->n_l3dgw_ports) && (od->nbr->n_nat || od->nbr->n_load_balancer)) { ovn_lflow_add(lflows, od, S_ROUTER_OUT_UNDNAT, 50, "ip", "flags.loopback = 1; ct_dnat;"); @@ -12275,7 +12333,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, /* NAT rules are only valid on Gateway routers and routers with * l3dgw_port (router has a port with gateway chassis * specified). */ - if (!od->is_gw_router && !od->l3dgw_port) { + if (!od->is_gw_router && !od->n_l3dgw_ports) { return; } @@ -12316,14 +12374,14 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, ds_clear(match); ds_put_format( match, "outport == %s && %s == %s", - od->l3dgw_port->json_key, + od->l3dgw_ports[0]->json_key, is_v6 ? REG_NEXT_HOP_IPV6 : REG_NEXT_HOP_IPV4, nat->external_ip); ds_clear(actions); ds_put_format( actions, "eth.dst = %s; next;", distributed ? nat->external_mac : - od->l3dgw_port->lrp_networks.ea_s); + od->l3dgw_ports[0]->lrp_networks.ea_s); ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_ARP_RESOLVE, 100, ds_cstr(match), @@ -12359,7 +12417,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, ds_put_format(match, "ip%s.src == %s && outport == %s", is_v6 ? "6" : "4", nat->logical_ip, - od->l3dgw_port->json_key); + od->l3dgw_ports[0]->json_key); /* Add a rule to drop traffic from a distributed NAT if * the virtual port has not claimed yet becaused otherwise * the traffic will be centralized misconfiguring the TOR switch. @@ -12386,16 +12444,16 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows, * gateway port have ip.dst matching a NAT external IP, then * loop a clone of the packet back to the beginning of the * ingress pipeline with inport = outport. */ - if (od->l3dgw_port) { + if (od->n_l3dgw_ports) { /* Distributed router. */ ds_clear(match); ds_put_format(match, "ip%s.dst == %s && outport == %s", is_v6 ? 
"6" : "4", nat->external_ip, - od->l3dgw_port->json_key); + od->l3dgw_ports[0]->json_key); if (!distributed) { ds_put_format(match, " && is_chassis_resident(%s)", - od->l3redirect_port->json_key); + od->l3dgw_ports[0]->cr_port->json_key); } else { ds_put_format(match, " && is_chassis_resident(\"%s\")", nat->logical_port); diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl index d7141294e..9c5576d16 100644 --- a/northd/ovn_northd.dl +++ b/northd/ovn_northd.dl @@ -185,7 +185,7 @@ OutProxy_Port_Binding(._uuid = lsp._uuid, }, Some{var router_port} = lsp.options.get("router-port"), var opt_chassis = peer.and_then(|p| p.router.options.get("chassis")), - var l3dgw_port = peer.and_then(|p| p.router.l3dgw_port), + var l3dgw_port = peer.and_then(|p| p.router.l3dgw_ports.nth(0)), (var __type, var options) = { var options = ["peer" -> router_port]; match (opt_chassis) { @@ -241,7 +241,7 @@ OutProxy_Port_Binding(._uuid = lsp._uuid, Some{rport} -> match ( (rport.lrp.options.get_bool_def("reside-on-redirect-chassis", false) and l3dgw_port.is_some()) or - Some{rport.lrp} == l3dgw_port or + rport.is_redirect or (rport.router.options.contains_key("chassis") and not sw.localnet_ports.is_empty())) { false -> set_empty(), @@ -335,7 +335,7 @@ function get_router_load_balancer_ips(router: Intern<Router>, function get_nat_addresses(rport: Intern<RouterPort>, routable_only: bool): Set<string> = { var addresses = set_empty(); - var has_redirect = rport.router.l3dgw_port.is_some(); + var has_redirect = not rport.router.l3dgw_ports.is_empty(); match (eth_addr_from_string(rport.lrp.mac)) { None -> addresses, Some{mac} -> { @@ -402,7 +402,10 @@ function get_nat_addresses(rport: Intern<RouterPort>, routable_only: bool): Set< /* Gratuitous ARP for centralized NAT rules on distributed gateway * ports should be restricted to the gateway chassis. 
*/ if (has_redirect) { - c_addresses = c_addresses ++ " is_chassis_resident(${rport.router.redirect_port_name})" + c_addresses = c_addresses ++ match (rport.router.l3dgw_ports.nth(0)) { + None -> "", + Some {var gw_port} -> " is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})" + } } else (); addresses.insert(c_addresses) @@ -417,8 +420,10 @@ function get_garp_nat_addresses(rport: Intern<RouterPort>): string = { for (ipv4_addr in rport.networks.ipv4_addrs) { garp_info.push("${ipv4_addr.addr}") }; - if (rport.router.redirect_port_name != "") { - garp_info.push("is_chassis_resident(${rport.router.redirect_port_name})") + match (rport.router.l3dgw_ports.nth(0)) { + None -> (), + Some {var gw_port} -> garp_info.push( + "is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})") }; garp_info.join(" ") } @@ -455,7 +460,7 @@ OutProxy_Port_Binding(// lrp._uuid is already in use; generate a new UUID by .nat_addresses = set_empty(), .external_ids = lrp.external_ids) :- DistributedGatewayPort(lrp, lr_uuid), - LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid), + DistributedGatewayPortHAChassisGroup(lrp, hacg_uuid), var redirect_type = match (lrp.options.get("redirect-type")) { Some{var value} -> ["redirect-type" -> value], _ -> map_empty() @@ -511,7 +516,8 @@ sb::Out_Port_Binding(._uuid = pbinding._uuid, * chassis. RefChassisSet has a row for every logical router. 
*/ relation RefChassis(lr_uuid: uuid, chassis_uuid: uuid) RefChassis(lr_uuid, chassis_uuid) :- - LogicalRouterHAChassisGroup(lr_uuid, _), + DistributedGatewayPortHAChassisGroup(lrp, _), + DistributedGatewayPort(lrp, lr_uuid), ConnectedLogicalRouter[(lr_uuid, set_uuid)], ConnectedLogicalRouter[(lr2_uuid, set_uuid)], FirstHopLogicalRouter(lr2_uuid, ls_uuid), @@ -538,7 +544,8 @@ RefChassisSet(lr_uuid, set_empty()) :- relation HAChassisGroupRefChassisSet(hacg_uuid: uuid, chassis_uuids: Set<uuid>) HAChassisGroupRefChassisSet(hacg_uuid, chassis_uuids) :- - LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid), + DistributedGatewayPortHAChassisGroup(lrp, hacg_uuid), + DistributedGatewayPort(lrp, lr_uuid), RefChassisSet(lr_uuid, chassis_uuids), var chassis_uuids = chassis_uuids.group_by(hacg_uuid).union(). @@ -4451,7 +4458,7 @@ for (&SwitchPort(.lsp = lsp, .peer = Some{&RouterPort{.lrp = lrp, .is_redirect = is_redirect, .router = &Router{._uuid = lr_uuid, - .redirect_port_name = redirect_port_name}}}) + .l3dgw_ports = l3dgw_ports}}}) if (lsp.addresses.contains("router") and lsp.__type != "external")) { Some{var mac} = scan_eth_addr(lrp.mac) in { @@ -4471,6 +4478,14 @@ for (&SwitchPort(.lsp = lsp, */ lrp.options.get_bool_def("reside-on-redirect-chassis", false)) in var __match = if (add_chassis_resident_check) { + var redirect_port_name = if (is_redirect) { + json_string_escape(chassis_redirect_name(lrp.name)) + } else { + match (l3dgw_ports.nth(0)) { + Some {var gw_port} -> json_string_escape(chassis_redirect_name(gw_port.name)), + None -> "" + } + }; /* The destination lookup flow for the router's * distributed gateway port MAC address should only be * programmed on the "redirect-chassis". */ @@ -4876,13 +4891,8 @@ var rLNIR = rEGBIT_LOOKUP_NEIGHBOR_IP_RESULT() in /* Check if we need to learn mac-binding from ARP requests. 
*/ for (RouterPortNetworksIPv4Addr(rp@&RouterPort{.router = router}, addr)) { - var is_l3dgw_port = match (router.l3dgw_port) { - Some{l3dgw_lrp} -> l3dgw_lrp._uuid == rp.lrp._uuid, - None -> false - } in - var has_redirect_port = router.redirect_port_name != "" in - var chassis_residence = match (is_l3dgw_port and has_redirect_port) { - true -> " && is_chassis_resident(${router.redirect_port_name})", + var chassis_residence = match (rp.is_redirect) { + true -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(rp.lrp.name))})", false -> "" } in var rLNR = rEGBIT_LOOKUP_NEIGHBOR_RESULT() in @@ -5042,7 +5052,7 @@ relation AddChassisResidentCheck_(lrp: uuid, add_check: bool) AddChassisResidentCheck_(lrp._uuid, res) :- &SwitchPort(.peer = Some{&RouterPort{.lrp = lrp, .router = router, .is_redirect = is_redirect}}, .sw = sw), - router.l3dgw_port.is_some(), + not router.l3dgw_ports.is_empty(), not sw.localnet_ports.is_empty(), var res = if (is_redirect) { /* Traffic with eth.src = l3dgw_port->lrp_networks.ea @@ -5147,7 +5157,8 @@ LogicalRouterArpNdFlow(router, nat, None, rEG_INPORT_ETH_ADDR(), None, false, 90 * different ETH address. */ LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- - router in &Router(._uuid = lr_uuid, .l3dgw_port = Some{l3dgw_port}), + router in &Router(._uuid = lr_uuid, .l3dgw_ports = l3dgw_ports), + Some {var l3dgw_port} = l3dgw_ports.nth(0), LogicalRouterNAT(lr_uuid, nat), /* Skip SNAT entries for now, we handle unique SNAT IPs separately * below. @@ -5155,7 +5166,8 @@ LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- nat.nat.__type != "snat". /* Now handle SNAT entries too, one per unique SNAT IP. 
*/ LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :- - router in &Router(.l3dgw_port = Some{l3dgw_port}, .snat_ips = snat_ips), + router in &Router(.l3dgw_ports = l3dgw_ports, .snat_ips = snat_ips), + Some {var l3dgw_port} = l3dgw_ports.nth(0), var snat_ip = FlatMap(snat_ips), (var ip, var nats) = snat_ip, Some{var nat} = nats.nth(0). @@ -5185,9 +5197,9 @@ LogicalRouterArpNdFlow(router, nat, Some{lrp}, mac, None, true, 91) :- * upstream MAC learning points to the gateway chassis. * Also need to avoid generation of multiple ARP responses * from different chassis. */ - match (router.redirect_port_name) { - "" -> "", - s -> "is_chassis_resident(${s})" + match (router.l3dgw_ports.nth(0)) { + None -> "", + Some {var gw_port} -> "is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})" } ) }. @@ -5332,7 +5344,15 @@ for (RouterPortNetworksIPv4Addr(.port = &RouterPort{.lrp = lrp, var __match = "arp.spa == ${addr.match_network()}" ++ if (add_chassis_resident_check) { - " && is_chassis_resident(${router.redirect_port_name})" + var redirect_port_name = if (is_redirect) { + json_string_escape(chassis_redirect_name(lrp.name)) + } else { + match (router.l3dgw_ports.nth(0)) { + None -> "", + Some {var gw_port} -> json_string_escape(chassis_redirect_name(gw_port.name)) + } + }; + " && is_chassis_resident(${redirect_port_name})" } else "" in LogicalRouterArpFlow(.lr = router, .lrp = Some{lrp}, @@ -5351,7 +5371,7 @@ for (&RouterPort(.lrp = lrp, .networks = networks, .is_redirect = is_redirect)) var residence_check = match (is_redirect) { - true -> Some{"is_chassis_resident(${router.redirect_port_name})"}, + true -> Some{"is_chassis_resident(${json_string_escape(chassis_redirect_name(lrp.name))})"}, false -> None } in { (var all_ips_v4, _) = get_router_load_balancer_ips(router, false) in { @@ -5421,7 +5441,7 @@ Flow(.logical_datapath = lr_uuid, for (RouterPortNetworksIPv4Addr( .port = &RouterPort{ .router = &Router{._uuid = lr_uuid, - .l3dgw_port = 
None, + .l3dgw_ports = vec_empty(), .is_gateway = false, .copp = copp}, .lrp = lrp}, @@ -5557,7 +5577,7 @@ for (RouterPortNetworksIPv6Addr(.port = &RouterPort{.lrp = lrp, /* UDP/TCP/SCTP port unreachable */ for (RouterPortNetworksIPv6Addr( .port = &RouterPort{.router = &Router{._uuid = lr_uuid, - .l3dgw_port = None, + .l3dgw_ports = vec_empty(), .is_gateway = false, .copp = copp}, .lrp = lrp, @@ -5685,11 +5705,11 @@ for (r in &Router(._uuid = lr_uuid)) { } for (r in &Router(._uuid = lr_uuid, - .l3dgw_port = l3dgw_port, + .l3dgw_ports = l3dgw_ports, .is_gateway = is_gateway, .nat = nat, .load_balancer = load_balancer) - if (l3dgw_port.is_some() or is_gateway) and (not is_empty(nat) or not is_empty(load_balancer))) { + if (l3dgw_ports.len() > 0 or is_gateway) and (not is_empty(nat) or not is_empty(load_balancer))) { /* If the router has load balancer or DNAT rules, re-circulate every packet * through the DNAT zone so that packets that need to be unDNATed in the * reverse direction get unDNATed. @@ -5772,7 +5792,7 @@ function lrouter_nat_add_ext_ip_match( }, false -> { /* S_ROUTER_OUT_SNAT uses priority (mask + 1 + 128 + 1) */ - var is_gw_router = router.l3dgw_port == None; + var is_gw_router = router.l3dgw_ports.is_empty(); var mask_1bits = mask.cidr_bits().unwrap_or(8'd0) as integer; mask_1bits + 2 + { if (not is_gw_router) 128 else 0 } } @@ -5877,10 +5897,9 @@ VirtualLogicalPort(Some{logical_port}) :- * l3dgw_port (router has a port with "redirect-chassis" * specified). 
*/ for (r in &Router(._uuid = lr_uuid, - .l3dgw_port = l3dgw_port, - .redirect_port_name = redirect_port_name, + .l3dgw_ports = l3dgw_ports, .is_gateway = is_gateway) - if l3dgw_port.is_some() or is_gateway) + if not l3dgw_ports.is_empty() or is_gateway) { for (LogicalRouterNAT(.lr = lr_uuid, .nat = nat)) { var ipX = nat.external_ip.ipX() in @@ -5898,7 +5917,7 @@ for (r in &Router(._uuid = lr_uuid, } in /* For distributed router NAT, determine whether this NAT rule * satisfies the conditions for distributed NAT processing. */ - var mac = match ((l3dgw_port.is_some() and nat.nat.__type == "dnat_and_snat", + var mac = match ((not l3dgw_ports.is_empty() and nat.nat.__type == "dnat_and_snat", nat.nat.logical_port, nat.external_mac)) { (true, Some{_}, Some{mac}) -> Some{mac}, _ -> None @@ -5916,7 +5935,7 @@ for (r in &Router(._uuid = lr_uuid, * not know about the possibility of eventual additional SNAT in * egress pipeline. */ if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") { - if (l3dgw_port == None) { + if (l3dgw_ports.is_empty()) { /* Gateway router. */ var actions = if (stateless) { "${ipX}.dst=${nat.nat.logical_ip}; next;" @@ -5930,7 +5949,7 @@ for (r in &Router(._uuid = lr_uuid, .actions = actions, .external_ids = stage_hint(nat.nat._uuid)) }; - Some{var gwport} = l3dgw_port in { + Some {var gwport} = l3dgw_ports.nth(0) in { /* Distributed router. */ /* Traffic received on l3dgw_port is subject to NAT. */ @@ -5940,7 +5959,7 @@ for (r in &Router(._uuid = lr_uuid, if (mac == None) { /* Flows for NAT rules that are centralized are only * programmed on the "redirect-chassis". 
*/ - " && is_chassis_resident(${redirect_port_name})" + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" } else { "" } in var actions = if (stateless) { "${ipX}.dst=${nat.nat.logical_ip}; next;" @@ -5966,7 +5985,7 @@ for (r in &Router(._uuid = lr_uuid, "" } in if (nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat") { - None = l3dgw_port in + l3dgw_ports.is_empty() in var __match = "ip && ${ipX}.dst == ${nat.nat.external_ip}" in (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( r, nat, __match, ipX, true, mask) in @@ -5998,14 +6017,14 @@ for (r in &Router(._uuid = lr_uuid, .external_ids = stage_hint(nat.nat._uuid)) }; - Some{var gwport} = l3dgw_port in + Some {var gwport} = l3dgw_ports.nth(0) in var __match = "ip && ${ipX}.dst == ${nat.nat.external_ip}" " && inport == ${json_string_escape(gwport.name)}" ++ if (mac == None) { /* Flows for NAT rules that are centralized are only * programmed on the "redirect-chassis". */ - " && is_chassis_resident(${redirect_port_name})" + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" } else { "" } in (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( r, nat, __match, ipX, true, mask) in @@ -6029,7 +6048,7 @@ for (r in &Router(._uuid = lr_uuid, }; /* ARP resolve for NAT IPs. */ - Some{var gwport} = l3dgw_port in { + Some {var gwport} = l3dgw_ports.nth(0) in { var gwport_name = json_string_escape(gwport.name) in { if (nat.nat.__type == "snat") { var __match = "inport == ${gwport_name} && " @@ -6066,14 +6085,14 @@ for (r in &Router(._uuid = lr_uuid, * Note that this only applies for NAT on a distributed router. 
*/ if ((nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat")) { - Some{var gwport} = l3dgw_port in + Some {var gwport} = l3dgw_ports.nth(0) in var __match = "ip && ${ipX}.src == ${nat.nat.logical_ip}" " && outport == ${json_string_escape(gwport.name)}" ++ if (mac == None) { /* Flows for NAT rules that are centralized are only * programmed on the "redirect-chassis". */ - " && is_chassis_resident(${redirect_port_name})" + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" } else { "" } in var actions = match (mac) { @@ -6103,7 +6122,7 @@ for (r in &Router(._uuid = lr_uuid, "" } in if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") { - None = l3dgw_port in + l3dgw_ports.is_empty() in var __match = "ip && ${ipX}.src == ${nat.nat.logical_ip}" in (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( r, nat, __match, ipX, false, mask) in @@ -6128,14 +6147,14 @@ for (r in &Router(._uuid = lr_uuid, .external_ids = stage_hint(nat.nat._uuid)) }; - Some{var gwport} = l3dgw_port in + Some {var gwport} = l3dgw_ports.nth(0) in var __match = "ip && ${ipX}.src == ${nat.nat.logical_ip}" " && outport == ${json_string_escape(gwport.name)}" ++ if (mac == None) { /* Flows for NAT rules that are centralized are only * programmed on the "redirect-chassis". */ - " && is_chassis_resident(${redirect_port_name})" + " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" } else { "" } in (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match( r, nat, __match, ipX, false, mask) in @@ -6173,7 +6192,7 @@ for (r in &Router(._uuid = lr_uuid, * on the l3dgw_port instance where nat->logical_port is * resident. 
*/ Some{var mac_addr} = mac in - Some{var gwport} = l3dgw_port in + Some{var gwport} = l3dgw_ports.nth(0) in Some{var logical_port} = nat.nat.logical_port in var __match = "eth.dst == ${mac_addr} && inport == ${json_string_escape(gwport.name)}" @@ -6199,7 +6218,7 @@ for (r in &Router(._uuid = lr_uuid, * stage is sent out with proper IP/MAC src addresses */ Some{var mac_addr} = mac in - Some{var gwport} = l3dgw_port in + Some{var gwport} = l3dgw_ports.nth(0) in Some{var logical_port} = nat.nat.logical_port in Some{var external_mac} = nat.nat.external_mac in var __match = @@ -6218,7 +6237,7 @@ for (r in &Router(._uuid = lr_uuid, .external_ids = stage_hint(nat.nat._uuid)); for (VirtualLogicalPort(nat.nat.logical_port)) { - Some{var gwport} = l3dgw_port in + Some{var gwport} = l3dgw_ports.nth(0) in Flow(.logical_datapath = lr_uuid, .stage = s_ROUTER_IN_GW_REDIRECT(), .priority = 80, @@ -6233,14 +6252,14 @@ for (r in &Router(._uuid = lr_uuid, * gateway port have ip.dst matching a NAT external IP, then * loop a clone of the packet back to the beginning of the * ingress pipeline with inport = outport. */ - Some{var gwport} = l3dgw_port in + Some{var gwport} = l3dgw_ports.nth(0) in /* Distributed router. */ Some{var port} = match (mac) { Some{_} -> match (nat.nat.logical_port) { Some{name} -> Some{json_string_escape(name)}, None -> None: Option<string> }, - None -> Some{redirect_port_name} + None -> Some{json_string_escape(chassis_redirect_name(gwport.name))} } in var __match = "${ipX}.dst == ${nat.nat.external_ip} && outport == ${json_string_escape(gwport.name)} && is_chassis_resident(${port})" in var regs = { @@ -6268,7 +6287,7 @@ for (r in &Router(._uuid = lr_uuid, }; /* Handle force SNAT options set in the gateway router. 
*/ - if (l3dgw_port == None) { + if (l3dgw_ports.is_empty()) { var dnat_force_snat_ips = get_force_snat_ip(r.options, "dnat") in if (not dnat_force_snat_ips.is_empty()) LogicalRouterForceSnatFlows(.logical_router = lr_uuid, @@ -6296,14 +6315,13 @@ function nats_contain_vip(nats: Vec<NAT>, vip: v46_ip): bool { * Gateway routers or router with gateway port. */ for (RouterLBVIP( .router = r@&Router{._uuid = lr_uuid, - .l3dgw_port = l3dgw_port, - .redirect_port_name = redirect_port_name, + .l3dgw_ports = l3dgw_ports, .is_gateway = is_gateway, .nats = nats}, .lb = lb, .vip = vip, .backends = backends) - if l3dgw_port.is_some() or is_gateway) + if not l3dgw_ports.is_empty() or is_gateway) { if (backends == "" and not lb.options.get_bool_def("reject", false)) { for (LoadBalancerEmptyEvents(lb)) { @@ -6372,8 +6390,8 @@ for (RouterLBVIP( (110, "") } in var __match = match1 ++ match2 ++ - match ((l3dgw_port, backends != "" or lb.options.get_bool_def("reject", false))) { - (Some{gwport}, true) -> " && is_chassis_resident(${redirect_port_name})", + match ((l3dgw_ports.nth(0), backends != "" or lb.options.get_bool_def("reject", false))) { + (Some{gw_port}, true) -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", _ -> "" } in var snat_for_lb = snat_for_lb(r.options, lb) in @@ -6385,8 +6403,8 @@ for (RouterLBVIP( } else { "" } ++ - match ((l3dgw_port, backends != "" or lb.options.get_bool_def("reject", false))) { - (Some{gwport}, true) -> " && is_chassis_resident(${redirect_port_name})", + match ((l3dgw_ports.nth(0), backends != "" or lb.options.get_bool_def("reject", false))) { + (Some {var gw_port}, true) -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", _ -> "" } in var actions = @@ -6425,7 +6443,7 @@ for (RouterLBVIP( .external_ids = stage_hint(lb._uuid)) }; - Some{var gwport} = l3dgw_port in + Some{var gwport} = l3dgw_ports.nth(0) in /* Add logical flows to UNDNAT the load balanced reverse traffic 
in * the router egress pipleine stage - S_ROUTER_OUT_UNDNAT if the logical * router has a gateway router port associated. @@ -6450,7 +6468,7 @@ for (RouterLBVIP( var undnat_match = "${ip_address.ipX()} && (" ++ conds.join(" || ") ++ ") && outport == ${json_string_escape(gwport.name)} && " - "is_chassis_resident(${redirect_port_name})" in + "is_chassis_resident(${json_string_escape(chassis_redirect_name(gwport.name))})" in var action = match (snat_for_lb) { SkipSNAT -> "flags.skip_snat_for_lb = 1; ct_dnat;", @@ -6481,14 +6499,14 @@ MeteredFlow(.logical_datapath = r._uuid, .controller_meter = meter, .external_ids = stage_hint(lb._uuid)) :- r in &Router(), - r.l3dgw_port.is_some() or r.is_gateway, + r.l3dgw_ports.len() > 0 or r.is_gateway, LBVIPWithStatus[lbvip@&LBVIPWithStatus{.lb = lb}], r.load_balancer.contains(lb._uuid), var __match = "ct.new && " ++ get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, true, true) ++ - match (r.l3dgw_port) { - Some{gwport} -> " && is_chassis_resident(${r.redirect_port_name})", + match (r.l3dgw_ports.nth(0)) { + Some{gw_port} -> " && is_chassis_resident(${json_string_escape(chassis_redirect_name(gw_port.name))})", _ -> "" }, var priority = if (lbvip.vip_port != 0) 120 else 110, @@ -7294,11 +7312,11 @@ Flow(.logical_datapath = router._uuid, .stage = s_ROUTER_IN_ARP_RESOLVE(), .priority = 50, .__match = "outport == ${rp.json_name} && " - "!is_chassis_resident(${router.redirect_port_name})", + "!is_chassis_resident(${json_string_escape(chassis_redirect_name(l3dgw_port.name))})", .actions = "eth.dst = ${rp.networks.ea}; next;", .external_ids = stage_hint(lrp._uuid)) :- rp in &RouterPort(.lrp = lrp, .router = router), - router.redirect_port_name != "", + Some{var l3dgw_port} = router.l3dgw_ports.nth(0), Some{"bridged"} = lrp.options.get("redirect-type"). @@ -7672,21 +7690,20 @@ MeteredFlow(.logical_datapath = lr_uuid, * of the traffic to the l3redirect_port which represents * the central instance of the l3dgw_port. 
  */
-for (&Router(._uuid = lr_uuid,
-             .l3dgw_port = l3dgw_port,
-             .redirect_port_name = redirect_port_name))
+for (&Router(._uuid = lr_uuid))
 {
     /* For traffic with outport == l3dgw_port, if the
      * packet did not match any higher priority redirect
      * rule, then the traffic is redirected to the central
      * instance of the l3dgw_port. */
-    Some{var gwport} = l3dgw_port in
-    Flow(.logical_datapath = lr_uuid,
-         .stage = s_ROUTER_IN_GW_REDIRECT(),
-         .priority = 50,
-         .__match = "outport == ${json_string_escape(gwport.name)}",
-         .actions = "outport = ${redirect_port_name}; next;",
-         .external_ids = stage_hint(gwport._uuid));
+    for (DistributedGatewayPort(lrp, lr_uuid)) {
+        Flow(.logical_datapath = lr_uuid,
+             .stage = s_ROUTER_IN_GW_REDIRECT(),
+             .priority = 50,
+             .__match = "outport == ${json_string_escape(lrp.name)}",
+             .actions = "outport = ${json_string_escape(chassis_redirect_name(lrp.name))}; next;",
+             .external_ids = stage_hint(lrp._uuid))
+    };

     /* Packets are allowed by default. */
     Flow(.logical_datapath = lr_uuid,
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index 0eef9b739..3598b5073 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -731,6 +731,13 @@
     highest-priority gateway that is online.
   </p>

+  <p>
+    A logical router can have multiple distributed gateway ports, each
+    connecting different external networks. However, some features, such as NAT
+    and load balancers, are not supported yet for logical routers with more
+    than one distributed gateway port configured.
+  </p>
+
   <h4>Physical VLAN MTU Issues</h4>

   <p>
@@ -1968,8 +1975,9 @@

   <p>
     If the logical router doesn't have a distributed gateway port connecting
-    to the localnet logical switch which provides external connectivity,
-    then this option is ignored by <code>OVN</code>.
+    to the localnet logical switch which provides external connectivity, or
+    if it has more than one distributed gateway port, then this option is
+    ignored by <code>OVN</code>.
   </p>

   <p>
@@ -2086,6 +2094,13 @@
     a tunnel.
   </p>

+  <p>
+    If the logical router doesn't have a distributed gateway port connecting
+    to the localnet logical switch which provides external connectivity, or
+    if it has more than one distributed gateway port, then this option is
+    ignored by <code>OVN</code>.
+  </p>
+
   <p>
     Following happens for bridged redirection:
   </p>
diff --git a/ovn-nb.xml b/ovn-nb.xml
index c1176e81f..ec51b5608 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -2032,13 +2032,14 @@

     <column name="nat">
       One or more NAT rules for the router. NAT rules only work on
-      Gateway routers, and on distributed routers with logical gateway ports.
+      Gateway routers, and on distributed routers with one and only one
+      distributed gateway port.
     </column>

     <column name="load_balancer">
       Load balance a virtual ip address to a set of logical port ip
       addresses. Load balancer rules only work on the Gateway routers or
-      routers with distributed gateway ports.
+      routers with one and only one distributed gateway port.
     </column>

     <group title="Naming">
@@ -2453,8 +2454,7 @@
         If either of these are set, this logical router port represents
         a distributed gateway port that connects this router to a
         logical switch with a <code>localnet</code> port or a
-        connection to another OVN deployment. There may be at most
-        one such logical router port on each logical router.
+        connection to another OVN deployment.
       </p>

       <p>
@@ -2476,8 +2476,16 @@
       </p>

       <p>
-        When more than one gateway chassis is specified, OVN only uses
-        one at a time. OVN can rely on OVS BFD implementation to monitor
+        There can be more than one distributed gateway port configured
+        on each logical router, each connecting to different L2 segments.
+        However, features such as NAT and load balancers are not supported
+        on logical routers with more than one distributed gateway port.
+      </p>
+
+      <p>
+        Each distributed gateway port may have more than one gateway
+        chassis. When more than one gateway chassis is specified, OVN only
+        uses one at a time. OVN can rely on OVS BFD implementation to monitor
         gateway connectivity, preferring the highest-priority gateway that
         is online. Priorities are specified in the <code>priority</code>
         column of <ref table="Gateway_Chassis"/> or <ref table="HA_Chassis"/>.
@@ -2563,8 +2571,8 @@
         </p>

         <p>
-          OVN honors this option only if the logical router has a distributed
-          gateway port and if the LRP's peer switch has a
+          OVN honors this option only if the logical router has one and only
+          one distributed gateway port and if the LRP's peer switch has a
           <code>localnet</code> port.
         </p>
       </column>
@@ -2588,7 +2596,8 @@
         <p>
           Setting this option to <code>overlay</code> or leaving it unset has
           no effect. This option may usefully be set only on a distributed
-          gateway port. It is otherwise ignored.
+          gateway port when there is one and only one distributed gateway
+          port on the logical router. It is otherwise ignored.
         </p>
       </column>
     </group>
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 2098b1c19..27c93a8b9 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4968,3 +4968,85 @@ AT_CHECK([grep -e "chk_pkt_len" -e "lr_in_larger_pkts" lr0flows | sort], [0], [d

 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-northd -- lr multiple gw ports])
+AT_KEYWORDS([multiple-l3dgw-ports])
+ovn_start
+
+# Logical network:
+# 1 Logical Router, 3 bridged Logical Switches,
+# 1 gateway chassis attached to each corresponding LRP.
+#
+#                | S1 (gw1)
+#                |
+#      ls ----  DR -- S3 (gw3)
+# (20.0.0.0/24)  |
+#                | S2 (gw2)
+#
+# Validate basic LR logical flows.
+
+check ovn-sbctl chassis-add gw1 geneve 127.0.0.1
+check ovn-sbctl chassis-add gw2 geneve 128.0.0.1
+check ovn-sbctl chassis-add gw3 geneve 129.0.0.1
+
+check ovn-nbctl lr-add DR
+check ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24
+check ovn-nbctl lrp-add DR DR-S2 03:ac:10:01:00:01 172.16.2.1/24
+check ovn-nbctl lrp-add DR DR-S3 04:ac:10:01:00:01 172.16.3.1/24
+check ovn-nbctl lrp-add DR DR-ls 05:ac:10:01:00:01 20.0.0.1/24
+
+check ovn-nbctl ls-add S1
+check ovn-nbctl lsp-add S1 S1-DR
+check ovn-nbctl lsp-set-type S1-DR router
+check ovn-nbctl lsp-set-addresses S1-DR router
+check ovn-nbctl --wait=sb lsp-set-options S1-DR router-port=DR-S1
+
+check ovn-nbctl ls-add S2
+check ovn-nbctl lsp-add S2 S2-DR
+check ovn-nbctl lsp-set-type S2-DR router
+check ovn-nbctl lsp-set-addresses S2-DR router
+check ovn-nbctl --wait=sb lsp-set-options S2-DR router-port=DR-S2
+
+check ovn-nbctl ls-add S3
+check ovn-nbctl lsp-add S3 S3-DR
+check ovn-nbctl lsp-set-type S3-DR router
+check ovn-nbctl lsp-set-addresses S3-DR router
+check ovn-nbctl --wait=sb lsp-set-options S3-DR router-port=DR-S3
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl lsp-add ls ls-DR
+check ovn-nbctl lsp-set-type ls-DR router
+check ovn-nbctl lsp-set-addresses ls-DR router
+check ovn-nbctl --wait=sb lsp-set-options ls-DR router-port=DR-ls
+
+check ovn-nbctl lrp-set-gateway-chassis DR-S1 gw1
+check ovn-nbctl lrp-set-gateway-chassis DR-S2 gw2
+check ovn-nbctl lrp-set-gateway-chassis DR-S3 gw3
+
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows DR > lrflows
+AT_CAPTURE_FILE([lrflows])
+
+# Check the flows in lr_in_admission stage
+AT_CHECK([grep lr_in_admission lrflows | grep cr-DR | sort], [0], [dnl
+  table=0 (lr_in_admission    ), priority=50   , match=(eth.dst == 02:ac:10:01:00:01 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1")), action=(xreg0[[0..47]] = 02:ac:10:01:00:01; next;)
+  table=0 (lr_in_admission    ), priority=50   , match=(eth.dst == 03:ac:10:01:00:01 && inport == "DR-S2" && is_chassis_resident("cr-DR-S2")), action=(xreg0[[0..47]] = 03:ac:10:01:00:01; next;)
+  table=0 (lr_in_admission    ), priority=50   , match=(eth.dst == 04:ac:10:01:00:01 && inport == "DR-S3" && is_chassis_resident("cr-DR-S3")), action=(xreg0[[0..47]] = 04:ac:10:01:00:01; next;)
+])
+# Check the flows in lr_in_lookup_neighbor stage
+AT_CHECK([grep lr_in_lookup_neighbor lrflows | grep cr-DR | sort], [0], [dnl
+  table=1 (lr_in_lookup_neighbor), priority=100  , match=(inport == "DR-S1" && arp.spa == 172.16.1.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S1")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;)
+  table=1 (lr_in_lookup_neighbor), priority=100  , match=(inport == "DR-S2" && arp.spa == 172.16.2.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S2")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;)
+  table=1 (lr_in_lookup_neighbor), priority=100  , match=(inport == "DR-S3" && arp.spa == 172.16.3.0/24 && arp.op == 1 && is_chassis_resident("cr-DR-S3")), action=(reg9[[2]] = lookup_arp(inport, arp.spa, arp.sha); next;)
+])
+# Check the flows in lr_in_gw_redirect stage
+AT_CHECK([grep lr_in_gw_redirect lrflows | grep cr-DR | sort], [0], [dnl
+  table=17(lr_in_gw_redirect  ), priority=50   , match=(outport == "DR-S1"), action=(outport = "cr-DR-S1"; next;)
+  table=17(lr_in_gw_redirect  ), priority=50   , match=(outport == "DR-S2"), action=(outport = "cr-DR-S2"; next;)
+  table=17(lr_in_gw_redirect  ), priority=50   , match=(outport == "DR-S3"), action=(outport = "cr-DR-S3"; next;)
+])
+
+AT_CLEANUP
+])
diff --git a/tests/ovn.at b/tests/ovn.at
index 7ae136ad9..b571bbb49 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -27529,3 +27529,310 @@ AT_CHECK([ovs-ofctl dump-flows br-int table=44 | grep 10.0.0.144], [0], [ignore]
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- lr multiple gw ports])
+AT_KEYWORDS([multiple-l3dgw-ports])
+ovn_start
+
+# Logical network:
+# 1 LR, 3 Logical Switches,
+# 1 gateway chassis attached to each corresponding LRP.
+#
+#                | S1 (gw1)
+#                |
+#      ls ----  DR -- S3 (gw3)
+# (20.0.0.0/24)  |
+#                | S2 (gw2)
+#
+# S1 - VLAN 1000
+# S2 - VLAN 2000
+# S3 - VLAN 3000
+#
+# 5 chassis(s), HV1----HV5
+#
+# HV1 - VIF11
+# HV2 - Gateway chassis gw1
+# HV3 - Gateway chassis gw2
+# HV4 - Gateway chassis gw3
+# HV5 - North endpoint
+
+ovn-nbctl lr-add DR
+ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24
+ovn-nbctl lrp-add DR DR-S2 08:ac:10:01:00:01 10.0.0.1/24
+ovn-nbctl lrp-add DR DR-S3 04:ac:10:01:00:01 192.168.0.1/24
+ovn-nbctl lrp-add DR DR-ls 06:ac:10:01:00:01 20.0.0.1/24
+
+ovn-nbctl ls-add S1
+ovn-nbctl lsp-add S1 S1-DR
+ovn-nbctl lsp-set-type S1-DR router
+ovn-nbctl lsp-set-addresses S1-DR router
+ovn-nbctl --wait=sb lsp-set-options S1-DR router-port=DR-S1
+ovn-nbctl lsp-add S1 ln1 "" 1000
+ovn-nbctl lsp-set-addresses ln1 unknown
+ovn-nbctl lsp-set-type ln1 localnet
+ovn-nbctl lsp-set-options ln1 network_name=phys
+
+ovn-nbctl ls-add S2
+ovn-nbctl lsp-add S2 S2-DR
+ovn-nbctl lsp-set-type S2-DR router
+ovn-nbctl lsp-set-addresses S2-DR router
+ovn-nbctl --wait=sb lsp-set-options S2-DR router-port=DR-S2
+ovn-nbctl lsp-add S2 ln2 "" 2000
+ovn-nbctl lsp-set-addresses ln2 unknown
+ovn-nbctl lsp-set-type ln2 localnet
+ovn-nbctl lsp-set-options ln2 network_name=phys
+
+ovn-nbctl ls-add S3
+ovn-nbctl lsp-add S3 S3-DR
+ovn-nbctl lsp-set-type S3-DR router
+ovn-nbctl lsp-set-addresses S3-DR router
+ovn-nbctl --wait=sb lsp-set-options S3-DR router-port=DR-S3
+ovn-nbctl lsp-add S3 ln3 "" 3000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls
+ovn-nbctl lsp-add ls ls-DR
+ovn-nbctl lsp-set-type ls-DR router
+ovn-nbctl lsp-set-addresses ls-DR router
+ovn-nbctl --wait=sb lsp-set-options ls-DR router-port=DR-ls
+
+# Add the lsp lp11 to ls. This will map to VIF11.
+ovn-nbctl lsp-add ls lp11
+ovn-nbctl lsp-set-addresses lp11 "f0:00:00:00:00:10 20.0.0.10"
+ovn-nbctl lsp-set-port-security lp11 f0:00:00:00:00:10
+
+# Add the Northbound endpoint, lp-north1
+ovn-nbctl ls-add ls-north1
+ovn-nbctl lsp-add ls-north1 ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+ovn-nbctl lsp-add ls-north1 lp-north1
+ovn-nbctl lsp-set-addresses lp-north1 "f0:f0:00:00:00:11 172.16.1.10"
+ovn-nbctl lsp-set-port-security lp-north1 f0:f0:00:00:00:11
+
+# Add the Northbound endpoint, lp-north2
+ovn-nbctl ls-add ls-north2
+ovn-nbctl lsp-add ls-north2 ln5 "" 2000
+ovn-nbctl lsp-set-addresses ln5 unknown
+ovn-nbctl lsp-set-type ln5 localnet
+ovn-nbctl lsp-set-options ln5 network_name=phys
+
+ovn-nbctl lsp-add ls-north2 lp-north2
+ovn-nbctl lsp-set-addresses lp-north2 "f0:f0:00:00:00:22 10.0.0.10"
+ovn-nbctl lsp-set-port-security lp-north2 f0:f0:00:00:00:22
+
+# Add the Northbound endpoint, lp-north3
+ovn-nbctl ls-add ls-north3
+ovn-nbctl lsp-add ls-north3 ln6 "" 3000
+ovn-nbctl lsp-set-addresses ln6 unknown
+ovn-nbctl lsp-set-type ln6 localnet
+ovn-nbctl lsp-set-options ln6 network_name=phys
+
+ovn-nbctl lsp-add ls-north3 lp-north3
+ovn-nbctl lsp-set-addresses lp-north3 "f0:f0:00:00:00:33 192.168.0.10"
+ovn-nbctl lsp-set-port-security lp-north3 f0:f0:00:00:00:33
+
+# Add 5 chassis
+net_add n1
+for i in 1 2 3 4 5; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovn_attach n1 br-phys 192.168.0.$i 24 $encap
+done
+
+# Add a vif on HV1
+as hv1 ovs-vsctl add-port br-int vif11 -- \
+    set Interface vif11 external-ids:iface-id=lp11 \
+    options:tx_pcap=hv1/vif11-tx.pcap \
+    options:rxq_pcap=hv1/vif11-rx.pcap \
+    ofport-request=11
+OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up lp11` = xup])
+
+as hv5 ovs-vsctl add-port br-int vif-north1 -- \
+    set Interface vif-north1 external-ids:iface-id=lp-north1 \
+    options:tx_pcap=hv5/vif-north1-tx.pcap \
+    options:rxq_pcap=hv5/vif-north1-rx.pcap \
+    ofport-request=44
+
+as hv5 ovs-vsctl add-port br-int vif-north2 -- \
+    set Interface vif-north2 external-ids:iface-id=lp-north2 \
+    options:tx_pcap=hv5/vif-north2-tx.pcap \
+    options:rxq_pcap=hv5/vif-north2-rx.pcap \
+    ofport-request=45
+
+as hv5 ovs-vsctl add-port br-int vif-north3 -- \
+    set Interface vif-north3 external-ids:iface-id=lp-north3 \
+    options:tx_pcap=hv5/vif-north3-tx.pcap \
+    options:rxq_pcap=hv5/vif-north3-rx.pcap \
+    ofport-request=46
+
+ovn-nbctl lrp-set-gateway-chassis DR-S1 hv2
+ovn-nbctl lrp-set-gateway-chassis DR-S2 hv3
+ovn-nbctl lrp-set-gateway-chassis DR-S3 hv4
+
+ovn-nbctl --wait=sb sync
+OVN_POPULATE_ARP
+
+vif_to_ls () {
+    case ${1} in dnl (
+        vif?[[11]]) echo ls ;; dnl (
+        vif-north1) echo ls-north1 ;; dnl (
+        vif-north2) echo ls-north2 ;; dnl (
+        vif-north3) echo ls-north3 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case ${1} in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif-north1) echo hv5 ;; dnl (
+        vif-north2) echo hv5 ;; dnl (
+        vif-north3) echo hv5 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    case ${1} in dnl (
+        vif?[[11]]) echo DR-ls ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+
+}
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "${@}"
+}
+
+# test_arp INPORT SHA SPA TPA
+#
+# Causes a packet to be received on INPORT. The packet is an ARP
+# request with SHA, SPA, and TPA as specified.
+test_arp() {
+    local inport=$1 sha=$2 spa=$3 tpa=$4
+    local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
+    hv=`vif_to_hv $inport`
+    as $hv ovs-appctl netdev-dummy/receive $inport $request
+}
+
+
+test_ip() {
+    # This packet has bad checksums but logical L3 routing doesn't check.
+    local inport=${1} src_mac=${2} dst_mac=${3} src_ip=${4} dst_ip=${5} outport=${6}
+    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+    shift; shift; shift; shift; shift
+    hv=`vif_to_hv $inport`
+    as $hv ovs-appctl netdev-dummy/receive $inport $packet
+    in_ls=`vif_to_ls $inport`
+    for outport; do
+        out_ls=`vif_to_ls $outport`
+        if test $in_ls = $out_ls; then
+            # Ports on the same logical switch receive exactly the same packet.
+            echo $packet
+        else
+            # Routing decrements TTL and updates source and dest MAC
+            # (and checksum).
+            # For North-South, packet will come via gateway chassis, i.e hv3
+            if test $inport = vif-north1; then
+                echo f0000000001006ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+            fi
+            if test $outport = vif-north1; then
+                echo f0f00000001102ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+            fi
+            if test $outport = vif-north2; then
+                echo f0f00000002208ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+            fi
+            if test $outport = vif-north3; then
+                echo f0f00000003304ac1001000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+            fi
+        fi >> $outport.expected
+    done
+}
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+ovn-sbctl list datapath_binding
+
+ovn-sbctl dump-flows DR
+ovn-sbctl dump-flows S1
+ovn-sbctl dump-flows ls
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+as hv1 ovs-ofctl dump-flows br-int
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+as hv2 ovs-ofctl dump-flows br-int
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+as hv3 ovs-ofctl dump-flows br-int
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+as hv5 ovs-ofctl dump-flows br-int
+
+# N-S with lp-north1
+echo "Send Dummy ARP"
+sip=`ip_to_hex 172 16 1 10`
+tip=`ip_to_hex 172 16 1 50`
+test_arp vif-north1 f0f000000011 $sip $tip
+
+echo "Send traffic North to South"
+sip=`ip_to_hex 172 16 1 10`
+dip=`ip_to_hex 20 0 0 10`
+test_ip vif-north1 f0f000000011 02ac10010001 $sip $dip vif11
+# Confirm that North to south traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif11-tx.pcap], [vif11.expected])
+
+echo "Send traffic South to North1"
+sip=`ip_to_hex 20 0 0 10`
+dip=`ip_to_hex 172 16 1 10`
+test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north1
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north1-tx.pcap], [vif-north1.expected])
+
+# N-S with lp-north2
+echo "Send Dummy ARP"
+sip=`ip_to_hex 10 0 0 10`
+tip=`ip_to_hex 10 0 0 50`
+test_arp vif-north2 f0f000000022 $sip $tip
+
+echo "Send traffic South to North2"
+sip=`ip_to_hex 20 0 0 10`
+dip=`ip_to_hex 10 0 0 10`
+test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north2
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north2-tx.pcap], [vif-north2.expected])
+
+# N-S with lp-north3
+echo "Send Dummy ARP"
+sip=`ip_to_hex 192 168 0 10`
+tip=`ip_to_hex 192 168 0 50`
+test_arp vif-north3 f0f000000033 $sip $tip
+
+echo "Send traffic South to North3"
+sip=`ip_to_hex 20 0 0 10`
+dip=`ip_to_hex 192 168 0 10`
+test_ip vif11 f00000000010 06ac10010001 $sip $dip vif-north3
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv5/vif-north3-tx.pcap], [vif-north3.expected])
+
+AT_CLEANUP
+])
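
The `lr_in_gw_redirect` expectations in the ovn-northd.at test follow one pattern per distributed gateway port: match on the LRP name and redirect to the port named `cr-<LRP>`. The sketch below only prints that match/action pair for each port, to make the per-port mapping explicit. `expected_gw_redirect` is a hypothetical helper, not part of the OVN test suite, and the `cr-` prefix is taken from the expected flows in the test rather than from the implementation of `chassis_redirect_name`.

```shell
# Hypothetical helper (not part of the OVN test suite): print the
# lr_in_gw_redirect match/action pair expected for one distributed
# gateway port. The "cr-" prefix mirrors the expected flows in the test.
expected_gw_redirect() {
    lrp=$1
    printf 'match=(outport == "%s"), action=(outport = "cr-%s"; next;)\n' \
        "$lrp" "$lrp"
}

# One redirect flow per distributed gateway port on the router.
for lrp in DR-S1 DR-S2 DR-S3; do
    expected_gw_redirect "$lrp"
done
```

With a single distributed gateway port the loop runs once, which corresponds to the one redirect flow the pre-patch code produced.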