Message ID | 1560295002-28128-1-git-send-email-ankur.sharma@nutanix.com |
---|---|
State | Superseded |
Series | [ovs-dev,v10] OVN: Enable E-W Traffic, Vlan backed DVR |

On Wed, Jun 12, 2019 at 4:47 AM Ankur Sharma <ankur.sharma@nutanix.com> wrote:

> Background:
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
> [2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing
>
> Key difference between an overlay logical switch and
> vlan backed logical switch is that for vlan logical switches
> packets are not encapsulated.
>
> Hence, if a distributed router port is connected to vlan backed
> logical switch, then router port mac as source mac could be
> seen from multiple hypervisors. Same <mac,vlan> pairs coming
> from multiple ports from a top of the rack switch (TOR) perspective
> could be seen as a security threat and it could send alarms, drop
> the packets or block the ports etc.
>
> This patch addresses the same by introducing the concept of chassis mac.
> A chassis mac is CMS provisioned unique mac per chassis. For any routed
> packet (i.e source mac is router port mac) going on the wire on a vlan type
> logical switch, we will replace its source mac with chassis mac.
>
> This replacing of source mac with chassis mac will happen in table=65
> of the logical switch datapath. A flow is added at priority 150, which
> matches the source mac and replaces it with chassis mac if the value
> is a router port mac.
>
> Example flow:
> cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
> idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
> dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
> mod_vlan_vid:1000,output:16
>
> Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
> is chassis mac.
>
> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
>

Thanks Ankur for the patch.

Acked-by: Numan Siddique <nusiddiq@redhat.com>

There is just one small minor comment.
It would be nice if you can address it,

Thanks
Numan

> ---
>  ovn/controller/binding.c            |  12 +--
>  ovn/controller/chassis.c            |  64 +++++++++++-
>  ovn/controller/chassis.h            |   4 +
>  ovn/controller/ovn-controller.8.xml |  10 ++
>  ovn/controller/ovn-controller.c     |   4 +-
>  ovn/controller/ovn-controller.h     |   5 +-
>  ovn/controller/physical.c           |  95 +++++++++++++++++
>  ovn/ovn-architecture.7.xml          |  24 +++++
>  ovn/ovn-sb.xml                      |   8 ++
>  tests/ovn.at                        | 197 ++++++++++++++++++++++++++++++++++++
>  10 files changed, 411 insertions(+), 12 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index b62b3da..c73d1aa 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -159,13 +159,11 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>                                               sbrec_port_binding_by_name,
>                                               peer->datapath, false,
>                                               depth + 1, local_datapaths);
> -                        ld->n_peer_dps++;
> -                        ld->peer_dps = xrealloc(
> -                                ld->peer_dps,
> -                                ld->n_peer_dps * sizeof *ld->peer_dps);
> -                        ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key(
> -                            sbrec_datapath_binding_by_key,
> -                            peer->datapath->tunnel_key);
> +                        ld->n_peer_ports++;
> +                        ld->peer_ports = xrealloc(ld->peer_ports,
> +                                                  ld->n_peer_ports *
> +                                                  sizeof *ld->peer_ports);
> +                        ld->peer_ports[ld->n_peer_ports - 1] = peer;
>                      }
>                  }
>              }
> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
> index 0f537f1..8403212 100644
> --- a/ovn/controller/chassis.c
> +++ b/ovn/controller/chassis.c
> @@ -23,6 +23,7 @@
>  #include "lib/vswitch-idl.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn/lib/chassis-index.h"
>  #include "ovn/lib/ovn-sb-idl.h"
>  #include "ovn-controller.h"
> @@ -69,6 +70,12 @@ get_bridge_mappings(const struct smap *ext_ids)
>  }
>
>  static const char *
> +get_chassis_mac_mappings(const struct smap *ext_ids)
> +{
> +    return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
> +}
> +
> +static const char *
>  get_cms_options(const struct smap *ext_ids)
>  {
>      return smap_get_def(ext_ids, "ovn-cms-options", "");
> @@ -162,6 +169,7 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      const char *datapath_type =
>          br_int && br_int->datapath_type ? br_int->datapath_type : "";
>      const char *cms_options = get_cms_options(&cfg->external_ids);
> +    const char *chassis_macs = get_chassis_mac_mappings(&cfg->external_ids);
>
>      struct ds iface_types = DS_EMPTY_INITIALIZER;
>      ds_put_cstr(&iface_types, "");
> @@ -190,18 +198,22 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              = smap_get_def(&chassis_rec->external_ids, "iface-types", "");
>          const char *chassis_cms_options
>              = get_cms_options(&chassis_rec->external_ids);
> +        const char *chassis_mac_mappings
> +            = get_chassis_mac_mappings(&chassis_rec->external_ids);
>
>          /* If any of the external-ids should change, update them. */
>          if (strcmp(bridge_mappings, chassis_bridge_mappings) ||
>              strcmp(datapath_type, chassis_datapath_type) ||
>              strcmp(iface_types_str, chassis_iface_types) ||
> -            strcmp(cms_options, chassis_cms_options)) {
> +            strcmp(cms_options, chassis_cms_options) ||
> +            strcmp(chassis_macs, chassis_mac_mappings)) {
>              struct smap new_ids;
>              smap_clone(&new_ids, &chassis_rec->external_ids);
>              smap_replace(&new_ids, "ovn-bridge-mappings", bridge_mappings);
>              smap_replace(&new_ids, "datapath-type", datapath_type);
>              smap_replace(&new_ids, "iface-types", iface_types_str);
>              smap_replace(&new_ids, "ovn-cms-options", cms_options);
> +            smap_replace(&new_ids, "ovn-chassis-mac-mappings", chassis_macs);
>              sbrec_chassis_verify_external_ids(chassis_rec);
>              sbrec_chassis_set_external_ids(chassis_rec, &new_ids);
>              smap_destroy(&new_ids);
> @@ -319,6 +331,56 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      return chassis_rec;
>  }
>
> +bool
> +chassis_get_mac(const struct sbrec_chassis *chassis_rec,
> +                const char *bridge_mapping,
> +                struct eth_addr *chassis_mac)
> +{
> +    const char *tokens
> +        = get_chassis_mac_mappings(&chassis_rec->external_ids);
> +
> +    if (!strlen(tokens)) {
> +       return false;
> +    }
> +
> +    char *save_ptr = NULL;
> +    char *token;
> +    bool ret = false;
> +    char *tokstr = xstrdup(tokens);
> +
> +    /* Format for a chassis mac configuration is:
> +     * ovn-chassis-mac-mappings="bridge-name1:MAC1,bridge-name2:MAC2"
> +     */
> +    for (token = strtok_r(tokstr, ",", &save_ptr);
> +         token != NULL;
> +         token = strtok_r(NULL, ",", &save_ptr)) {
> +        char *save_ptr2 = NULL;
> +        char *chassis_mac_bridge = strtok_r(token, ":", &save_ptr2);
> +        char *chassis_mac_str = strtok_r(NULL, "", &save_ptr2);
> +
> +        if (!strcmp(chassis_mac_bridge, bridge_mapping)) {
> +            struct eth_addr temp_mac;
> +            char *err_str = NULL;
> +
> +            ret = true;
> +
> +            /* Return the first chassis mac. */
> +            if ((err_str = str_to_mac(chassis_mac_str, &temp_mac))) {
> +                free(err_str);
> +                ret = false;
> +                continue;
> +            }
> +
> +            *chassis_mac = temp_mac;
> +            break;
> +        }
> +    }
> +
> +    free(tokstr);
> +
> +    return ret;
> +}
> +
>  /* Returns true if the database is all cleaned up, false if more work is
>   * required. */
>  bool
> diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h
> index 9847e19..e3fbc31 100644
> --- a/ovn/controller/chassis.h
> +++ b/ovn/controller/chassis.h
> @@ -26,6 +26,7 @@ struct ovsrec_open_vswitch_table;
>  struct sbrec_chassis;
>  struct sbrec_chassis_table;
>  struct sset;
> +struct eth_addr;
>
>  void chassis_register_ovs_idl(struct ovsdb_idl *);
>  const struct sbrec_chassis *chassis_run(
> @@ -36,5 +37,8 @@ const struct sbrec_chassis *chassis_run(
>      const struct sset *transport_zones);
>  bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                       const struct sbrec_chassis *);
> +bool chassis_get_mac(const struct sbrec_chassis *chassis,
> +                     const char *bridge_mapping,
> +                     struct eth_addr *chassis_mac);
>
>  #endif /* ovn/chassis.h */
> diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml
> index 9721d9a..18f66fe 100644
> --- a/ovn/controller/ovn-controller.8.xml
> +++ b/ovn/controller/ovn-controller.8.xml
> @@ -182,6 +182,16 @@
>            transport zone.
>          </p>
>        </dd>
> +      <dt><code>external_ids:ovn-chassis-mac-mappings</code></dt>
> +      <dd>
> +        A list of key-value pairs that map a chassis specific mac to
> +        a physical network name. An example
> +        value mapping two chassis macs to two physical network names would be:
> +        <code>physnet1:aa:bb:cc:dd:ee:ff,physnet2:a1:b2:c3:d4:e5:f6</code>.
> +        These are the macs that ovn-controller will replace a router port
> +        mac with, if packet is going from a distributed router port on
> +        vlan type logical switch.
> +      </dd>
>      </dl>
>
>      <p>
> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> index 6019016..315a88b 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -899,7 +899,7 @@ en_runtime_data_cleanup(struct engine_node *node)
>      struct local_datapath *cur_node, *next_node;
>      HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node,
>                          &data->local_datapaths) {
> -        free(cur_node->peer_dps);
> +        free(cur_node->peer_ports);
>          hmap_remove(&data->local_datapaths, &cur_node->hmap_node);
>          free(cur_node);
>      }
> @@ -929,7 +929,7 @@ en_runtime_data_run(struct engine_node *node)
>      } else {
>          struct local_datapath *cur_node, *next_node;
>          HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, local_datapaths) {
> -            free(cur_node->peer_dps);
> +            free(cur_node->peer_ports);
>              hmap_remove(local_datapaths, &cur_node->hmap_node);
>              free(cur_node);
>          }
> diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
> index 6afd727..a4c1309 100644
> --- a/ovn/controller/ovn-controller.h
> +++ b/ovn/controller/ovn-controller.h
> @@ -59,8 +59,9 @@ struct local_datapath {
>      /* True if this datapath contains an l3gateway port located on this
>       * hypervisor. */
>      bool has_local_l3gateway;
> -    const struct sbrec_datapath_binding **peer_dps;
> -    size_t n_peer_dps;
> +
> +    const struct sbrec_port_binding **peer_ports;
> +    size_t n_peer_ports;
>  };
>
>  struct local_datapath *get_local_datapath(const struct hmap *,
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index c8dc282..af587a5 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -20,6 +20,7 @@
>  #include "ha-chassis.h"
>  #include "lflow.h"
>  #include "lport.h"
> +#include "chassis.h"
>  #include "lib/bundle.h"
>  #include "openvswitch/poll-loop.h"
>  #include "lib/uuid.h"
> @@ -30,6 +31,7 @@
>  #include "openvswitch/ofp-actions.h"
>  #include "openvswitch/ofpbuf.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn-controller.h"
>  #include "ovn/lib/chassis-index.h"
>  #include "ovn/lib/ovn-sb-idl.h"
> @@ -236,6 +238,92 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>  }
>
>  static void
> +put_replace_router_port_mac_flows(const struct
> +                                  sbrec_port_binding *localnet_port,
> +                                  const struct sbrec_chassis *chassis,
> +                                  const struct hmap *local_datapaths,
> +                                  struct ofpbuf *ofpacts_p,
> +                                  ofp_port_t ofport,
> +                                  struct ovn_desired_flow_table *flow_table)
> +{
> +    struct local_datapath *ld = get_local_datapath(local_datapaths,
> +                                                   localnet_port->datapath->
> +                                                   tunnel_key);
> +    ovs_assert(ld);
> +
> +    uint32_t dp_key = localnet_port->datapath->tunnel_key;
> +    uint32_t port_key = localnet_port->tunnel_key;
> +    int tag = localnet_port->tag ? *localnet_port->tag : 0;
> +    const char *network = smap_get(&localnet_port->options, "network_name");
> +    struct eth_addr chassis_mac;
> +
> +    if (!network) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +        VLOG_WARN_RL(&rl, "Physical network not configured for datapath: %ld "
> +                     "with localnet port",
> +                     localnet_port->datapath->tunnel_key);
> +        return;
> +    }
> +
> +    /* Get chassis mac */
> +    if (!chassis_get_mac(chassis, network, &chassis_mac)) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +        /* Keeping the log level low for backward compatibility.
> +         * Chassis mac is a new configuration.
> +         */
> +        VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network);
> +        return;
> +    }
> +
> +    for (int i = 0; i < ld->n_peer_ports; i++) {
> +        const struct sbrec_port_binding *rport_binding = ld->peer_ports[i];
> +        struct eth_addr router_port_mac;
> +        char *err_str = NULL;
> +        struct match match;
> +        struct ofpact_mac *replace_mac;
> +
> +        /* Table 65, priority 150.
> +         * =======================
> +         *
> +         * Implements output to localnet port.
> +         * a. Flow replaces ingress router port mac with a chassis mac.
> +         * b. Flow appends the vlan id localnet port is configured with.
> +         */
> +        match_init_catchall(&match);
> +        ofpbuf_clear(ofpacts_p);
> +
> +        ovs_assert(rport_binding->n_mac == 1);
> +        if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) {
> +            /* Parsing of mac failed. */
> +            VLOG_WARN("Parsing of router port mac failed for router port: %s, "
> +                      "with error: %s", rport_binding->logical_port, err_str);
> +            free(err_str);
> +            return;
> +        }
> +
> +        /* Replace Router mac flow */
> +        match_set_metadata(&match, htonll(dp_key));
> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
> +        match_set_dl_src(&match, router_port_mac);
> +
> +        replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
> +        replace_mac->mac = chassis_mac;
> +
> +        if (tag) {
> +            struct ofpact_vlan_vid *vlan_vid;
> +            vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
> +            vlan_vid->vlan_vid = tag;
> +            vlan_vid->push_vlan_if_needed = true;
> +        }
> +
> +        ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
> +
> +        ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0,
> +                        &match, ofpacts_p, &localnet_port->header_.uuid);
> +    }
> +}
> +
> +static void
>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>                         uint32_t parent_port_key,
>                         const struct zone_ids *zone_ids,
> @@ -707,6 +795,13 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>          }
>          ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0,
>                          &match, ofpacts_p, &binding->header_.uuid);
> +
> +        if (!strcmp(binding->type, "localnet")) {
> +            put_replace_router_port_mac_flows(binding, chassis,
> +                                              local_datapaths, ofpacts_p,
> +                                              ofport, flow_table);
> +        }
> +
>      } else if (!tun && !is_ha_remote) {
>          /* Remote port connected by localnet port */
>          /* Table 33, priority 100.
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> index 8c9e106..6275db1 100644
> --- a/ovn/ovn-architecture.7.xml
> +++ b/ovn/ovn-architecture.7.xml
> @@ -1407,6 +1407,30 @@
>          egress pipeline of the destination localnet logical switch datapath
>          and goes out of the integration bridge to the provider bridge (
>          belonging to the destination logical switch) via the localnet port.
> +        While sending the packet to provider bridge, we also replace router
> +        port mac as source mac with a chassis unique mac.
> +
> +        This chassis unique mac is configured as global ovs config on each
> +        chassis (eg. via "<code>ovs-vsctl set open . external-ids:
> +        ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>"). More
> +        details on this config are present in <code>ovn-controller</code>(8).
> +
> +        If the above is not configured, then source mac would be the router
> +        port mac. This could create problem if we have more than one chassis.
> +        This is because, since the router port is distributed, hence same
> +        mac,vlan tuple will be seen by physical network from other chassis
> +        as well. This could cause some/all of these issues:
> +        <ul>
> +          <li>
> +            Continuous mac moves in top of the rack switch (TOR).
> +          </li>
> +          <li>
> +            TOR dropping the traffic, which is causing continuous mac moves.
> +          </li>
> +          <li>
> +            TOR blocking the ports from which mac moves are happening.
> +          </li>
> +        </ul>
>        </li>
>
>        <li>
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index 1a2bc1d..89e88c4 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -301,6 +301,14 @@
>          See <code>ovn-controller</code>(8) for more information.
>        </column>
>
> +      <column name="external_ids" key="ovn-chassis-mac-mappings">
> +        <code>ovn-controller</code> populates this key with the set of options
> +        configured in the <ref table="Open_vSwitch"
> +        column="external_ids:ovn-chassis-mac-mappings"/> column of the
> +        Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/>
> +        table.  See <code>ovn-controller</code>(8) for more information.
> +      </column>
> +
>        <group title="Common Columns">
>          The overall purpose of these columns is described under <code>Common
>          Columns</code> at the beginning of this document.
> diff --git a/tests/ovn.at b/tests/ovn.at
> index daf85a5..d6cbb7b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -14017,3 +14017,200 @@ ovn-hv4-0
>
>  OVN_CLEANUP([hv1], [hv2], [hv3])
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
> +ovn_start
> +
> +
> +# In this test case we create 2 switches, all connected to same
> +# physical network (through br-phys on each HV). Each switch has
> +# 1 VIF. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +hv_to_chassis_mac () {
> +    case $1 in dnl (
> +        hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl (
> +        hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +    printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
> +
> +ovn-nbctl --wait=sb sync
> +#ovn-sbctl dump-flows
> +
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +OVN_POPULATE_ARP
> +
> +test_ip() {
> +    # This packet has bad checksums but logical L3 routing doesn't check.
> +    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> +    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +    shift; shift; shift; shift; shift
> +    hv=`vif_to_hv $inport`
> +    hv_num=`hv_to_num $hv`
> +    chassis_mac=`hv_to_chassis_mac $hv`
> +    as $hv ovs-appctl netdev-dummy/receive $inport $packet
> +    #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet
> +    in_ls=`vif_to_ls $inport`
> +    in_lrp=`vif_to_lrp $inport`
> +    for outport; do
> +        out_ls=`vif_to_ls $outport`
> +        if test $in_ls = $out_ls; then
> +            # Ports on the same logical switch receive exactly the same packet.
> +            echo $packet
> +        else
> +            # Routing decrements TTL and updates source and dest MAC
> +            # (and checksum).
> +            outport_num=`vif_to_num $outport`
> +            out_lrp=`vif_to_lrp $outport`
> +            echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000
> +        fi >> $outport.expected
> +    done
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "Send traffic"
> +sip=`ip_to_hex 192 168 1 1`
> +dip=`ip_to_hex 192 168 2 2`
> +test_ip vif11 f00000000011 000001010203 $sip $dip vif22
> +
> +sleep 1

I think you can delete this sleep. It adds no value.

> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
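
For readers following the `ovn-chassis-mac-mappings` format used in the patch ("network1:MAC1,network2:MAC2"), here is a minimal standalone sketch of the lookup in plain C. It is not the patch's code and does not use the OVS libraries: `struct mac_addr`, `parse_mac()` and `lookup_chassis_mac()` are hypothetical names chosen for illustration. Like `chassis_get_mac()` it returns the first well-formed MAC for the requested network and keeps scanning past a malformed one. One assumed difference is that an entry with no ":MAC" part is simply skipped; in the patch such an entry appears to leave `chassis_mac_str` as NULL before it is handed to `str_to_mac()`, which may be worth a guard.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for OVS's struct eth_addr. */
struct mac_addr {
    uint8_t ea[6];
};

/* Parses "xx:xx:xx:xx:xx:xx"; rejects short input and trailing garbage. */
static bool
parse_mac(const char *s, struct mac_addr *mac)
{
    unsigned int b[6];
    int n = 0;

    if (!s || sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%n",
                     &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &n) != 6
        || s[n] != '\0') {
        return false;
    }
    for (int i = 0; i < 6; i++) {
        mac->ea[i] = (uint8_t) b[i];
    }
    return true;
}

/* Looks up 'network' in a "net1:MAC1,net2:MAC2" string.  Stores the first
 * well-formed MAC for that network in '*mac' and returns true, or returns
 * false if there is no usable entry. */
static bool
lookup_chassis_mac(const char *mappings, const char *network,
                   struct mac_addr *mac)
{
    size_t net_len = strlen(network);
    const char *p = mappings;

    while (*p) {
        size_t entry_len = strcspn(p, ",");   /* Length of this entry. */

        /* The entry must look like "<network>:<something>". */
        if (entry_len > net_len + 1
            && !strncmp(p, network, net_len) && p[net_len] == ':') {
            char buf[18];                     /* 17 chars of MAC + NUL. */
            size_t mac_len = entry_len - net_len - 1;

            if (mac_len < sizeof buf) {
                memcpy(buf, p + net_len + 1, mac_len);
                buf[mac_len] = '\0';
                if (parse_mac(buf, mac)) {
                    return true;
                }
            }
        }

        p += entry_len;
        if (*p == ',') {
            p++;                              /* Skip the separator. */
        }
    }
    return false;
}
```

With the test setup's value `phys:aa:bb:cc:dd:ee:11`, calling `lookup_chassis_mac(value, "phys", &mac)` fills `mac` with that address, while a lookup for a network that is not listed returns false, which corresponds to the case where the patch logs at debug level and installs no replacement flow.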

Hi Numan,

Thanks for the Ack. Sent out v11, addressing your comment.
Thanks again.

Regards,
Ankur

From: Numan Siddique <nusiddiq@redhat.com>
Sent: Monday, June 17, 2019 3:53 AM
To: Ankur Sharma <ankur.sharma@nutanix.com>
Cc: ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v10] OVN: Enable E-W Traffic, Vlan backed DVR
*localnet_port->tag : 0; + const char *network = smap_get(&localnet_port->options, "network_name"); + struct eth_addr chassis_mac; + + if (!network) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + VLOG_WARN_RL(&rl, "Physical network not configured for datapath: %ld " + "with localnet port", + localnet_port->datapath->tunnel_key); + return; + } + + /* Get chassis mac */ + if (!chassis_get_mac(chassis, network, &chassis_mac)) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + /* Keeping the log level low for backward compatibility. + * Chassis mac is a new configuration. + */ + VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network); + return; + } + + for (int i = 0; i < ld->n_peer_ports; i++) { + const struct sbrec_port_binding *rport_binding = ld->peer_ports[i]; + struct eth_addr router_port_mac; + char *err_str = NULL; + struct match match; + struct ofpact_mac *replace_mac; + + /* Table 65, priority 150. + * ======================= + * + * Implements output to localnet port. + * a. Flow replaces ingress router port mac with a chassis mac. + * b. Flow appends the vlan id localnet port is configured with. + */ + match_init_catchall(&match); + ofpbuf_clear(ofpacts_p); + + ovs_assert(rport_binding->n_mac == 1); + if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) { + /* Parsing of mac failed. 
*/ + VLOG_WARN("Parsing or router port mac failed for router port: %s, " + "with error: %s", rport_binding->logical_port, err_str); + free(err_str); + return; + } + + /* Replace Router mac flow */ + match_set_metadata(&match, htonll(dp_key)); + match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key); + match_set_dl_src(&match, router_port_mac); + + replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p); + replace_mac->mac = chassis_mac; + + if (tag) { + struct ofpact_vlan_vid *vlan_vid; + vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p); + vlan_vid->vlan_vid = tag; + vlan_vid->push_vlan_if_needed = true; + } + + ofpact_put_OUTPUT(ofpacts_p)->port = ofport; + + ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0, + &match, ofpacts_p, &localnet_port->header_.uuid); + } +} + +static void put_local_common_flows(uint32_t dp_key, uint32_t port_key, uint32_t parent_port_key, const struct zone_ids *zone_ids, @@ -707,6 +795,13 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name, } ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0, &match, ofpacts_p, &binding->header_.uuid); + + if (!strcmp(binding->type, "localnet")) { + put_replace_router_port_mac_flows(binding, chassis, + local_datapaths, ofpacts_p, + ofport, flow_table); + } + } else if (!tun && !is_ha_remote) { /* Remote port connected by localnet port */ /* Table 33, priority 100. diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml index 8c9e106..6275db1 100644 --- a/ovn/ovn-architecture.7.xml +++ b/ovn/ovn-architecture.7.xml @@ -1407,6 +1407,30 @@ egress pipeline of the destination localnet logical switch datapath and goes out of the integration bridge to the provider bridge ( belonging to the destination logical switch) via the localnet port. + While sending the packet to provider bridge, we also replace router + port mac as source mac with a chassis unique mac. + + This chassis unique mac is configured as global ovs config on each + chassis (eg. 
via "<code>ovs-vsctl set open . external-ids: + ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>").More + details on this config are present in <code>ovn-controller</code>(8). + + If the above is not configured, then source mac would be the router + port mac. This could create problem if we have more than one chassis. + This is because, since the router port is distributed, hence same + mac,vlan tuple will seen by physical network from other chassis + as well. This could cause some/all of these issues: + <ul> + <li> + Continous mac moves in top of the rack switch (TOR). + </li> + <li> + TOR dropping the traffic, which is causing continous mac moves. + </li> + <li> + TOR blocking the ports from which mac moves are happening. + </li> + </ul> </li> <li> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml index 1a2bc1d..89e88c4 100644 --- a/ovn/ovn-sb.xml +++ b/ovn/ovn-sb.xml @@ -301,6 +301,14 @@ See <code>ovn-controller</code>(8) for more information. </column> + <column name="external_ids" key="ovn-chassis-mac-mappings"> + <code>ovn-controller</code> populates this key with the set of options + configured in the <ref table="Open_vSwitch" + column="external_ids:ovn-chassis-mac-mappings"/> column of the + Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/> + table. See <code>ovn-controller</code>(8) for more information. + </column> + <group title="Common Columns"> The overall purpose of these columns is described under <code>Common Columns</code> at the beginning of this document. 
diff --git a/tests/ovn.at b/tests/ovn.at index daf85a5..d6cbb7b 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -14017,3 +14017,200 @@ ovn-hv4-0 OVN_CLEANUP([hv1], [hv2], [hv3]) AT_CLEANUP + +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac]) +ovn_start + + +# In this test cases we create 2 switches, all connected to same +# physical network (through br-phys on each HV). Each switch has +# 1 VIF. Each HV has 1 VIF port. The first digit +# of VIF port name indicates the hypervisor it is bound to, e.g. +# lp23 means VIF 3 on hv2. +# +# Each switch's VLAN tag and their logical switch ports are: +# - ls1: +# - tagged with VLAN 101 +# - ports: lp11 +# - ls2: +# - tagged with VLAN 201 +# - ports: lp22 +# +# Note: a localnet port is created for each switch to connect to +# physical network.
+ +for i in 1 2; do + ls_name=ls$i + ovn-nbctl ls-add $ls_name + ln_port_name=ln$i + if test $i -eq 1; then + ovn-nbctl lsp-add $ls_name $ln_port_name "" 101 + elif test $i -eq 2; then + ovn-nbctl lsp-add $ls_name $ln_port_name "" 201 + fi + ovn-nbctl lsp-set-addresses $ln_port_name unknown + ovn-nbctl lsp-set-type $ln_port_name localnet + ovn-nbctl lsp-set-options $ln_port_name network_name=phys +done + +# lsp_to_ls LSP +# +# Prints the name of the logical switch that contains LSP. +lsp_to_ls () { + case $1 in dnl ( + lp?[[11]]) echo ls1 ;; dnl ( + lp?[[12]]) echo ls2 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +vif_to_ls () { + case $1 in dnl ( + vif?[[11]]) echo ls1 ;; dnl ( + vif?[[12]]) echo ls2 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +hv_to_num () { + case $1 in dnl ( + hv1) echo 1 ;; dnl ( + hv2) echo 2 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +vif_to_num () { + case $1 in dnl ( + vif22) echo 22 ;; dnl ( + vif21) echo 21 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +vif_to_hv () { + case $1 in dnl ( + vif[[1]]?) echo hv1 ;; dnl ( + vif[[2]]?) echo hv2 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +vif_to_lrp () { + echo router-to-`vif_to_ls $1` +} + +hv_to_chassis_mac () { + case $1 in dnl ( + hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl ( + hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl ( + *) AT_FAIL_IF([:]) ;; + esac +} + +ip_to_hex() { + printf "%02x%02x%02x%02x" "$@" +} + +net_add n1 +for i in 1 2; do + sim_add hv$i + as hv$i + ovs-vsctl add-br br-phys + ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys + ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i" + ovn_attach n1 br-phys 192.168.0.$i + + ovs-vsctl add-port br-int vif$i$i -- \ + set Interface vif$i$i external-ids:iface-id=lp$i$i \ + options:tx_pcap=hv$i/vif$i$i-tx.pcap \ + options:rxq_pcap=hv$i/vif$i$i-rx.pcap \ + ofport-request=$i$i + + lsp_name=lp$i$i + ls_name=$(lsp_to_ls $lsp_name) + + ovn-nbctl lsp-add $ls_name $lsp_name + ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i" + ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i + + OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup]) + +done + +ovn-nbctl lr-add router +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24 +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24 + +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router + +ovn-nbctl --wait=sb sync +#ovn-sbctl dump-flows + +ovn-nbctl show +ovn-sbctl show + +OVN_POPULATE_ARP + +test_ip() { + # This packet has bad checksums but logical L3 routing doesn't check.
+ local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 + local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000 + shift; shift; shift; shift; shift + hv=`vif_to_hv $inport` + hv_num=`hv_to_num $hv` + chassis_mac=`hv_to_chassis_mac $hv` + as $hv ovs-appctl netdev-dummy/receive $inport $packet + #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet + in_ls=`vif_to_ls $inport` + in_lrp=`vif_to_lrp $inport` + for outport; do + out_ls=`vif_to_ls $outport` + if test $in_ls = $out_ls; then + # Ports on the same logical switch receive exactly the same packet. + echo $packet + else + # Routing decrements TTL and updates source and dest MAC + # (and checksum). + outport_num=`vif_to_num $outport` + out_lrp=`vif_to_lrp $outport` + echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000 + fi >> $outport.expected + done +} + +# Dump a bunch of info helpful for debugging if there's a failure. + +echo "------ OVN dump ------" +ovn-nbctl show +ovn-sbctl show + +echo "------ hv1 dump ------" +as hv1 ovs-vsctl show +as hv1 ovs-vsctl list Open_Vswitch + +echo "------ hv2 dump ------" +as hv2 ovs-vsctl show +as hv2 ovs-vsctl list Open_Vswitch + +echo "Send traffic" +sip=`ip_to_hex 192 168 1 1` +dip=`ip_to_hex 192 168 2 2` +test_ip vif11 f00000000011 000001010203 $sip $dip vif22 + +sleep 1

I think you can delete this sleep. It adds no value.

+ +echo "----------- Post Traffic hv1 dump -----------" +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int +as hv1 ovs-appctl fdb/show br-phys + +echo "----------- Post Traffic hv2 dump -----------" +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int +as hv2 ovs-appctl fdb/show br-phys + +OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected]) + +OVN_CLEANUP([hv1],[hv2]) + +AT_CLEANUP -- 1.8.3.1
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c index b62b3da..c73d1aa 100644 --- a/ovn/controller/binding.c +++ b/ovn/controller/binding.c @@ -159,13 +159,11 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, sbrec_port_binding_by_name, peer->datapath, false, depth + 1, local_datapaths); - ld->n_peer_dps++; - ld->peer_dps = xrealloc( - ld->peer_dps, - ld->n_peer_dps * sizeof *ld->peer_dps); - ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key( - sbrec_datapath_binding_by_key, - peer->datapath->tunnel_key); + ld->n_peer_ports++; + ld->peer_ports = xrealloc(ld->peer_ports, + ld->n_peer_ports * + sizeof *ld->peer_ports); + ld->peer_ports[ld->n_peer_ports - 1] = peer; } } } diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c index 0f537f1..8403212 100644 --- a/ovn/controller/chassis.c +++ b/ovn/controller/chassis.c @@ -23,6 +23,7 @@ #include "lib/vswitch-idl.h" #include "openvswitch/dynamic-string.h" #include "openvswitch/vlog.h" +#include "openvswitch/ofp-parse.h" #include "ovn/lib/chassis-index.h" #include "ovn/lib/ovn-sb-idl.h" #include "ovn-controller.h" @@ -69,6 +70,12 @@ get_bridge_mappings(const struct smap *ext_ids) } static const char * +get_chassis_mac_mappings(const struct smap *ext_ids) +{ + return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", ""); +} + +static const char * get_cms_options(const struct smap *ext_ids) { return smap_get_def(ext_ids, "ovn-cms-options", ""); @@ -162,6 +169,7 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn, const char *datapath_type = br_int && br_int->datapath_type ? 
br_int->datapath_type : ""; const char *cms_options = get_cms_options(&cfg->external_ids); + const char *chassis_macs = get_chassis_mac_mappings(&cfg->external_ids); struct ds iface_types = DS_EMPTY_INITIALIZER; ds_put_cstr(&iface_types, ""); @@ -190,18 +198,22 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn, = smap_get_def(&chassis_rec->external_ids, "iface-types", ""); const char *chassis_cms_options = get_cms_options(&chassis_rec->external_ids); + const char *chassis_mac_mappings + = get_chassis_mac_mappings(&chassis_rec->external_ids); /* If any of the external-ids should change, update them. */ if (strcmp(bridge_mappings, chassis_bridge_mappings) || strcmp(datapath_type, chassis_datapath_type) || strcmp(iface_types_str, chassis_iface_types) || - strcmp(cms_options, chassis_cms_options)) { + strcmp(cms_options, chassis_cms_options) || + strcmp(chassis_macs, chassis_mac_mappings)) { struct smap new_ids; smap_clone(&new_ids, &chassis_rec->external_ids); smap_replace(&new_ids, "ovn-bridge-mappings", bridge_mappings); smap_replace(&new_ids, "datapath-type", datapath_type); smap_replace(&new_ids, "iface-types", iface_types_str); smap_replace(&new_ids, "ovn-cms-options", cms_options); + smap_replace(&new_ids, "ovn-chassis-mac-mappings", chassis_macs); sbrec_chassis_verify_external_ids(chassis_rec); sbrec_chassis_set_external_ids(chassis_rec, &new_ids); smap_destroy(&new_ids); @@ -319,6 +331,56 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn, return chassis_rec; } +bool +chassis_get_mac(const struct sbrec_chassis *chassis_rec, + const char *bridge_mapping, + struct eth_addr *chassis_mac) +{ + const char *tokens + = get_chassis_mac_mappings(&chassis_rec->external_ids); + + if (!strlen(tokens)) { + return false; + } + + char *save_ptr = NULL; + char *token; + bool ret = false; + char *tokstr = xstrdup(tokens); + + /* Format for a chassis mac configuration is: + * ovn-chassis-mac-mappings="bridge-name1:MAC1,bridge-name2:MAC2" + */ + for (token = strtok_r(tokstr, 
",", &save_ptr); + token != NULL; + token = strtok_r(NULL, ",", &save_ptr)) { + char *save_ptr2 = NULL; + char *chassis_mac_bridge = strtok_r(token, ":", &save_ptr2); + char *chassis_mac_str = strtok_r(NULL, "", &save_ptr2); + + if (!strcmp(chassis_mac_bridge, bridge_mapping)) { + struct eth_addr temp_mac; + char *err_str = NULL; + + ret = true; + + /* Return the first chassis mac. */ + if ((err_str = str_to_mac(chassis_mac_str, &temp_mac))) { + free(err_str); + ret = false; + continue; + } + + *chassis_mac = temp_mac; + break; + } + } + + free(tokstr); + + return ret; +} + /* Returns true if the database is all cleaned up, false if more work is * required. */ bool diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h index 9847e19..e3fbc31 100644 --- a/ovn/controller/chassis.h +++ b/ovn/controller/chassis.h @@ -26,6 +26,7 @@ struct ovsrec_open_vswitch_table; struct sbrec_chassis; struct sbrec_chassis_table; struct sset; +struct eth_addr; void chassis_register_ovs_idl(struct ovsdb_idl *); const struct sbrec_chassis *chassis_run( @@ -36,5 +37,8 @@ const struct sbrec_chassis *chassis_run( const struct sset *transport_zones); bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn, const struct sbrec_chassis *); +bool chassis_get_mac(const struct sbrec_chassis *chassis, + const char *bridge_mapping, + struct eth_addr *chassis_mac); #endif /* ovn/chassis.h */ diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml index 9721d9a..18f66fe 100644 --- a/ovn/controller/ovn-controller.8.xml +++ b/ovn/controller/ovn-controller.8.xml @@ -182,6 +182,16 @@ transport zone. </p> </dd> + <dt><code>external_ids:ovn-chassis-mac-mappings</code></dt> + <dd> + A list of key-value pairs that map a chassis specific mac to + a physical network name. An example + value mapping two chassis macs to two physical network names would be: + <code>physnet1:aa:bb:cc:dd:ee:ff,physnet2:a1:b2:c3:d4:e5:f6</code>. 
+ These are the macs that ovn-controller will replace a router port + mac with, if packet is going from a distributed router port on + vlan type logical switch. + </dd> </dl> <p> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index 6019016..315a88b 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -899,7 +899,7 @@ en_runtime_data_cleanup(struct engine_node *node) struct local_datapath *cur_node, *next_node; HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, &data->local_datapaths) { - free(cur_node->peer_dps); + free(cur_node->peer_ports); hmap_remove(&data->local_datapaths, &cur_node->hmap_node); free(cur_node); } @@ -929,7 +929,7 @@ en_runtime_data_run(struct engine_node *node) } else { struct local_datapath *cur_node, *next_node; HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, local_datapaths) { - free(cur_node->peer_dps); + free(cur_node->peer_ports); hmap_remove(local_datapaths, &cur_node->hmap_node); free(cur_node); } diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h index 6afd727..a4c1309 100644 --- a/ovn/controller/ovn-controller.h +++ b/ovn/controller/ovn-controller.h @@ -59,8 +59,9 @@ struct local_datapath { /* True if this datapath contains an l3gateway port located on this * hypervisor. 
*/ bool has_local_l3gateway; - const struct sbrec_datapath_binding **peer_dps; - size_t n_peer_dps; + + const struct sbrec_port_binding **peer_ports; + size_t n_peer_ports; }; struct local_datapath *get_local_datapath(const struct hmap *, diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c index c8dc282..af587a5 100644 --- a/ovn/controller/physical.c +++ b/ovn/controller/physical.c @@ -20,6 +20,7 @@ #include "ha-chassis.h" #include "lflow.h" #include "lport.h" +#include "chassis.h" #include "lib/bundle.h" #include "openvswitch/poll-loop.h" #include "lib/uuid.h" @@ -30,6 +31,7 @@ #include "openvswitch/ofp-actions.h" #include "openvswitch/ofpbuf.h" #include "openvswitch/vlog.h" +#include "openvswitch/ofp-parse.h" #include "ovn-controller.h" #include "ovn/lib/chassis-index.h" #include "ovn/lib/ovn-sb-idl.h" @@ -236,6 +238,92 @@ get_zone_ids(const struct sbrec_port_binding *binding, } static void +put_replace_router_port_mac_flows(const struct + sbrec_port_binding *localnet_port, + const struct sbrec_chassis *chassis, + const struct hmap *local_datapaths, + struct ofpbuf *ofpacts_p, + ofp_port_t ofport, + struct ovn_desired_flow_table *flow_table) +{ + struct local_datapath *ld = get_local_datapath(local_datapaths, + localnet_port->datapath-> + tunnel_key); + ovs_assert(ld); + + uint32_t dp_key = localnet_port->datapath->tunnel_key; + uint32_t port_key = localnet_port->tunnel_key; + int tag = localnet_port->tag ? 
*localnet_port->tag : 0; + const char *network = smap_get(&localnet_port->options, "network_name"); + struct eth_addr chassis_mac; + + if (!network) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + VLOG_WARN_RL(&rl, "Physical network not configured for datapath: %ld " + "with localnet port", + localnet_port->datapath->tunnel_key); + return; + } + + /* Get chassis mac */ + if (!chassis_get_mac(chassis, network, &chassis_mac)) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + /* Keeping the log level low for backward compatibility. + * Chassis mac is a new configuration. + */ + VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network); + return; + } + + for (int i = 0; i < ld->n_peer_ports; i++) { + const struct sbrec_port_binding *rport_binding = ld->peer_ports[i]; + struct eth_addr router_port_mac; + char *err_str = NULL; + struct match match; + struct ofpact_mac *replace_mac; + + /* Table 65, priority 150. + * ======================= + * + * Implements output to localnet port. + * a. Flow replaces ingress router port mac with a chassis mac. + * b. Flow appends the vlan id localnet port is configured with. + */ + match_init_catchall(&match); + ofpbuf_clear(ofpacts_p); + + ovs_assert(rport_binding->n_mac == 1); + if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) { + /* Parsing of mac failed. 
*/ + VLOG_WARN("Parsing or router port mac failed for router port: %s, " + "with error: %s", rport_binding->logical_port, err_str); + free(err_str); + return; + } + + /* Replace Router mac flow */ + match_set_metadata(&match, htonll(dp_key)); + match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key); + match_set_dl_src(&match, router_port_mac); + + replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p); + replace_mac->mac = chassis_mac; + + if (tag) { + struct ofpact_vlan_vid *vlan_vid; + vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p); + vlan_vid->vlan_vid = tag; + vlan_vid->push_vlan_if_needed = true; + } + + ofpact_put_OUTPUT(ofpacts_p)->port = ofport; + + ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0, + &match, ofpacts_p, &localnet_port->header_.uuid); + } +} + +static void put_local_common_flows(uint32_t dp_key, uint32_t port_key, uint32_t parent_port_key, const struct zone_ids *zone_ids, @@ -707,6 +795,13 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name, } ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0, &match, ofpacts_p, &binding->header_.uuid); + + if (!strcmp(binding->type, "localnet")) { + put_replace_router_port_mac_flows(binding, chassis, + local_datapaths, ofpacts_p, + ofport, flow_table); + } + } else if (!tun && !is_ha_remote) { /* Remote port connected by localnet port */ /* Table 33, priority 100. diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml index 8c9e106..6275db1 100644 --- a/ovn/ovn-architecture.7.xml +++ b/ovn/ovn-architecture.7.xml @@ -1407,6 +1407,30 @@ egress pipeline of the destination localnet logical switch datapath and goes out of the integration bridge to the provider bridge ( belonging to the destination logical switch) via the localnet port. + While sending the packet to provider bridge, we also replace router + port mac as source mac with a chassis unique mac. + + This chassis unique mac is configured as global ovs config on each + chassis (eg. 
via "<code>ovs-vsctl set open . external-ids: + ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>").More + details on this config are present in <code>ovn-controller</code>(8). + + If the above is not configured, then source mac would be the router + port mac. This could create problem if we have more than one chassis. + This is because, since the router port is distributed, hence same + mac,vlan tuple will seen by physical network from other chassis + as well. This could cause some/all of these issues: + <ul> + <li> + Continous mac moves in top of the rack switch (TOR). + </li> + <li> + TOR dropping the traffic, which is causing continous mac moves. + </li> + <li> + TOR blocking the ports from which mac moves are happening. + </li> + </ul> </li> <li> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml index 1a2bc1d..89e88c4 100644 --- a/ovn/ovn-sb.xml +++ b/ovn/ovn-sb.xml @@ -301,6 +301,14 @@ See <code>ovn-controller</code>(8) for more information. </column> + <column name="external_ids" key="ovn-chassis-mac-mappings"> + <code>ovn-controller</code> populates this key with the set of options + configured in the <ref table="Open_vSwitch" + column="external_ids:ovn-chassis-mac-mappings"/> column of the + Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/> + table. See <code>ovn-controller</code>(8) for more information. + </column> + <group title="Common Columns"> The overall purpose of these columns is described under <code>Common Columns</code> at the beginning of this document. diff --git a/tests/ovn.at b/tests/ovn.at index daf85a5..d6cbb7b 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -14017,3 +14017,200 @@ ovn-hv4-0 OVN_CLEANUP([hv1], [hv2], [hv3]) AT_CLEANUP + +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac]) +ovn_start + + +# In this test cases we create 2 switches, all connected to same +# physical network (through br-phys on each HV). Each switch has +# 1 VIF. Each HV has 1 VIF port. 
The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+hv_to_chassis_mac () {
+    case $1 in dnl (
+        hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl (
+        hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+
+ovn-nbctl --wait=sb sync
+#ovn-sbctl dump-flows
+
+ovn-nbctl show
+ovn-sbctl show
+
+OVN_POPULATE_ARP
+
+test_ip() {
+    # This packet has bad checksums but logical L3 routing doesn't check.
+    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
+    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+    shift; shift; shift; shift; shift
+    hv=`vif_to_hv $inport`
+    hv_num=`hv_to_num $hv`
+    chassis_mac=`hv_to_chassis_mac $hv`
+    as $hv ovs-appctl netdev-dummy/receive $inport $packet
+    #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet
+    in_ls=`vif_to_ls $inport`
+    in_lrp=`vif_to_lrp $inport`
+    for outport; do
+        out_ls=`vif_to_ls $outport`
+        if test $in_ls = $out_ls; then
+            # Ports on the same logical switch receive exactly the same packet.
+            echo $packet
+        else
+            # Routing decrements TTL and updates source and dest MAC
+            # (and checksum).
+            outport_num=`vif_to_num $outport`
+            out_lrp=`vif_to_lrp $outport`
+            echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000
+        fi >> $outport.expected
+    done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 192 168 2 2`
+test_ip vif11 f00000000011 000001010203 $sip $dip vif22
+
+sleep 1
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
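
For readers following the test above, here is a minimal standalone sketch of how the expected routed frame is assembled. It is not part of the patch: the function name expected_routed_frame and its argument order are hypothetical, and it assumes the same MAC scheme as the test (VIF MAC f0:00:00:00:00:NN, chassis MAC aa:bb:cc:dd:ee:HH with the ingress hypervisor number repeated).

```shell
#!/bin/sh
# Sketch only; helper names below are illustrative, not from the patch.

# Same helper as in the test: four decimal octets to eight hex digits.
ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

# expected_routed_frame OUTPORT_NUM HV_NUM SRC_IP_HEX DST_IP_HEX
# Prints the hex frame expected on the destination VIF after routing.
expected_routed_frame() {
    out_num=$1
    hv_num=$2
    src_ip=$3
    dst_ip=$4
    dst_mac=f000000000${out_num}          # MAC of the destination VIF
    src_mac=aabbccddee${hv_num}${hv_num}  # chassis MAC of the ingress hypervisor
    # EtherType 0800, then the IPv4 header up to the TTL field.
    l3_head=08004500001c00000000
    # TTL 0x3f (0x40 decremented once), protocol 0x11 (UDP), updated checksum.
    ttl_proto_csum=3f110100
    udp=0035111100080000
    echo "${dst_mac}${src_mac}${l3_head}${ttl_proto_csum}${src_ip}${dst_ip}${udp}"
}

sip=$(ip_to_hex 192 168 1 1)
dip=$(ip_to_hex 192 168 2 2)
# vif11 on hv1 sends to vif22, so the source MAC is hv1's chassis MAC.
expected_routed_frame 22 1 "$sip" "$dip"
```

The source MAC in the output is the chassis MAC rather than the router port MAC 00:00:01:01:02:05, which is the behaviour OVN_CHECK_PACKETS verifies on hv2/vif22-tx.pcap.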
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

Key difference between an overlay logical switch and
vlan backed logical switch is that for vlan logical switches
packets are not encapsulated.

Hence, if a distributed router port is connected to vlan backed
logical switch, then router port mac as source mac could be
seen from multiple hypervisors. Same <mac,vlan> pairs coming
from multiple ports from a top of the rack switch (TOR) perspective
could be seen as a security threat and it could send alarms, drop
the packets or block the ports etc.

This patch addresses the same by introducing the concept of chassis mac.
A chassis mac is CMS provisioned unique mac per chassis. For any routed packet
(i.e source mac is router port mac) going on the wire on a vlan type
logical switch, we will replace its source mac with chassis mac.

This replacing of source mac with chassis mac will happen in table=65
of the logical switch datapath. A flow is added at priority 150, which
matches the source mac and replaces it with chassis mac if the value
is a router port mac.

Example flow:
cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
mod_vlan_vid:1000,output:16

Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
is chassis mac.

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 ovn/controller/binding.c            |  12 +--
 ovn/controller/chassis.c            |  64 +++++++++++-
 ovn/controller/chassis.h            |   4 +
 ovn/controller/ovn-controller.8.xml |  10 ++
 ovn/controller/ovn-controller.c     |   4 +-
 ovn/controller/ovn-controller.h     |   5 +-
 ovn/controller/physical.c           |  95 +++++++++++++++++
 ovn/ovn-architecture.7.xml          |  24 +++++
 ovn/ovn-sb.xml                      |   8 ++
 tests/ovn.at                        | 197 ++++++++++++++++++++++++++++++++++++
 10 files changed, 411 insertions(+), 12 deletions(-)
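
The chassis mac described in the commit message is supplied through external-ids:ovn-chassis-mac-mappings, which the test sets to a value such as "phys:aa:bb:cc:dd:ee:11". The following is a rough sketch of how such a value could be split into a physical network name and a MAC. It assumes the value is a comma-separated list of network:mac entries (the test only shows a single entry), and the function name chassis_mac_for is hypothetical; the actual parsing in ovn-controller is done in C and may differ.

```shell
#!/bin/sh
# Sketch only; chassis_mac_for is an illustrative helper, not OVN code.

# chassis_mac_for NETWORK MAPPINGS
# Prints the MAC mapped to NETWORK in MAPPINGS, assumed to be a
# comma-separated list of "network:mac" entries.
# Returns non-zero when NETWORK has no entry.
chassis_mac_for() {
    want=$1
    old_ifs=$IFS
    IFS=,
    for entry in $2; do          # split once, on commas
        IFS=$old_ifs
        net=${entry%%:*}         # text before the first colon
        mac=${entry#*:}          # text after the first colon
        if [ "$net" = "$want" ]; then
            echo "$mac"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

chassis_mac_for phys "phys:aa:bb:cc:dd:ee:11"
```

Splitting on the first colon only matters because the MAC itself contains colons; the network name is everything before it.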